Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John
2015-01-01
Objective: We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design: A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results: 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion: We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700
Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.
Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik
2017-11-01
Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed up for over 1 year, using 12 variables selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 comprised patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease
Is It Feasible to Identify Natural Clusters of TSC-Associated Neuropsychiatric Disorders (TAND)?
Leclezio, Loren; Gardner-Lubbe, Sugnet; de Vries, Petrus J
2018-04-01
Tuberous sclerosis complex (TSC) is a genetic disorder with multisystem involvement. The lifetime prevalence of TSC-Associated Neuropsychiatric Disorders (TAND) is in the region of 90% in an apparently unique, individual pattern. This "uniqueness" poses significant challenges for diagnosis, psycho-education, and intervention planning. To date, no studies have explored whether there may be natural clusters of TAND. The purpose of this feasibility study was (1) to investigate the practicability of identifying natural TAND clusters, and (2) to identify appropriate multivariate data analysis techniques for larger-scale studies. TAND Checklist data were collected from 56 individuals with a clinical diagnosis of TSC (n = 20 from South Africa; n = 36 from Australia). Using R, the open-source statistical platform, mean squared contingency coefficients were calculated to produce a correlation matrix, and various cluster analyses and exploratory factor analysis were examined. Ward's method rendered six TAND clusters with good face validity and significant convergence with a six-factor exploratory factor analysis solution. The "bottom-up" data-driven strategies identified a "scholastic" cluster of TAND manifestations, an "autism spectrum disorder-like" cluster, a "dysregulated behavior" cluster, a "neuropsychological" cluster, a "hyperactive/impulsive" cluster, and a "mixed/mood" cluster. These feasibility results suggest that a combination of cluster analysis and exploratory factor analysis methods may be able to identify clinically meaningful natural TAND clusters. Findings require replication and expansion in larger datasets, and could include quantification of cluster or factor scores at an individual level. Copyright © 2018 Elsevier Inc. All rights reserved.
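The correlation-matrix step described in the abstract above can be illustrated with a small sketch. It assumes that the "mean squared contingency coefficient" for two yes/no checklist items is the phi coefficient of their 2x2 table, which is the usual reading but is not confirmed by the abstract. The study used R; the Python below, the item names, and the responses are all hypothetical and only show the computation.

```python
import math

def phi_coefficient(x, y):
    """Phi coefficient between two equal-length binary (0/1) response vectors."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant item has no variance; return 0.0 rather than dividing by zero.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

def phi_matrix(items):
    """Pairwise phi coefficients for a dict mapping item name -> responses."""
    return {a: {b: phi_coefficient(items[a], items[b]) for b in items} for a in items}

# Hypothetical checklist responses for six individuals (not study data).
items = {
    "item_a": [1, 1, 1, 0, 0, 0],
    "item_b": [1, 1, 1, 0, 0, 0],
    "item_c": [1, 0, 1, 0, 1, 0],
}
matrix = phi_matrix(items)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

A matrix of this kind can then be turned into dissimilarities (for example 1 minus the coefficient) before hierarchical clustering; that conversion is likewise an assumption, not something the abstract specifies.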
Using cluster analysis to identify phenotypes and validation of mortality in men with COPD.
Chen, Chiung-Zuei; Wang, Liang-Yi; Ou, Chih-Ying; Lee, Cheng-Hung; Lin, Chien-Chung; Hsiue, Tzuen-Ren
2014-12-01
Cluster analysis has been proposed to examine phenotypic heterogeneity in chronic obstructive pulmonary disease (COPD). The aim of this study was to use cluster analysis to define COPD phenotypes and validate them by assessing their relationship with mortality. Male subjects with COPD were recruited to identify and validate COPD phenotypes. Seven variables were assessed for their relevance to COPD: age, FEV1 % predicted, BMI, history of severe exacerbations, mMRC, SpO2, and Charlson index. COPD groups were identified by cluster analysis and validated prospectively against mortality during a 4-year follow-up. Analysis of 332 COPD subjects identified five clusters, labeled cluster A to cluster E. Assessment of the predictive validity of these clusters showed that cluster E patients had higher all-cause mortality (HR 18.3, p < 0.0001) and respiratory-cause mortality (HR 21.5, p < 0.0001) than those in the other four groups. Cluster E patients also had higher all-cause mortality (HR 14.3, p = 0.0002) and respiratory-cause mortality (HR 10.1, p = 0.0013) than patients in cluster D alone. COPD patients with severe airflow limitation, many symptoms, and a history of frequent severe exacerbations formed a novel and distinct clinical phenotype predicting mortality in men with COPD.
Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John
2015-09-01
We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ² analysis (PD and DH, P < 2.2e-6; PD and INF, P = 6.2e-10; INF and DH, P = .0036). Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.
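The pairwise phenotype comparisons reported in the abstract above are chi-square tests of independence. As a minimal sketch of how such a test works for two yes/no phenotypes, the code below computes the Pearson statistic for a 2x2 table and its p-value with one degree of freedom. The counts are hypothetical (the abstract does not report the underlying tables), and the absence of a continuity correction is an assumption.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied, and every row and column total
    is assumed to be nonzero. Returns (statistic, p_value) for 1 degree
    of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = phenotype A present/absent,
# columns = phenotype B present/absent (not study data).
table = [[20, 5], [5, 20]]
chi2, p = chi_square_2x2(table)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

A small p-value here only indicates that the two phenotypes co-occur more (or less) often than independence would predict; it says nothing about which mechanism drives the association.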
Cluster analysis in phenotyping a Portuguese population.
Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J
2015-09-03
Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. The aim was to identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with those of other larger-scale cluster analyses. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.
Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics
ERIC Educational Resources Information Center
Chan, Julia Y. K.; Bauer, Christopher F.
2014-01-01
The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…
Using Cluster Analysis to Examine Husband-Wife Decision Making
ERIC Educational Resources Information Center
Bonds-Raacke, Jennifer M.
2006-01-01
Cluster analysis has a rich history in many disciplines and although cluster analysis has been used in clinical psychology to identify types of disorders, its use in other areas of psychology has been less popular. The purpose of the current experiments was to use cluster analysis to investigate husband-wife decision making. Cluster analysis was…
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks
Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin
2017-01-01
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson
2017-07-01
The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.
Identifying Subgroups of Tinnitus Using Novel Resting State fMRI Biomarkers and Cluster Analysis
2016-10-01
Award number: W81XWH-15-2-0032. Distribution: ...Public Release; Distribution Unlimited. Abstract: The subject of the project is FY14 PRMRP Topic Area – Tinnitus. The broad…
Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.
Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D
2017-06-01
Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. The aim was to identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used for identification of the clusters. We identified four clusters: Cluster 1 (n=14), late-onset, non-atopic asthma with impaired lung function; Cluster 2 (n=13), late-onset, atopic asthma; Cluster 3 (n=6), late-onset, aspirin-sensitive, eosinophilic asthma; and Cluster 4 (n=7), early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. The variables with the greatest force for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping of disease and a personalized approach to the treatment of patients.
Identification and validation of asthma phenotypes in Chinese population using cluster analysis.
Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang
2017-10-01
Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P < .0001 for all comparisons). We identified 3 clinical clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
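Several abstracts in this listing describe a two-stage procedure: hierarchical clustering by Ward's method, followed by k-means. One common reading, assumed here and not confirmed by any of the abstracts, is that the Ward solution fixes the number of clusters and supplies the starting centroids that k-means then refines. The pure-Python sketch below follows that reading on hypothetical two-dimensional points; real analyses would standardize the variables first and use a statistics package.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(p[d] for p in points) / len(points) for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clusters(points, k):
    """Agglomerate singletons until k clusters remain.

    Each step merges the pair whose union least increases the total
    within-cluster sum of squares (Ward's criterion). Naive O(n^3) search,
    suitable only for small illustrative data sets.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        def merge_cost(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            weight = len(a) * len(b) / (len(a) + len(b))
            return weight * sq_dist(centroid(a), centroid(b))
        i, j = min(combinations(range(len(clusters)), 2), key=merge_cost)
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters

def kmeans(points, centers, max_iter=100):
    """Lloyd's k-means started from the given centers; returns (labels, centers)."""
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical standardized measurements for six patients (not study data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
initial_centers = [centroid(c) for c in ward_clusters(points, k=2)]
labels, centers = kmeans(points, initial_centers)
print(labels)
```

Seeding k-means from the hierarchical solution makes the final partition reproducible, whereas random starts can give different partitions from run to run.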
Rennard, Stephen I; Locantore, Nicholas; Delafont, Bruno; Tal-Singer, Ruth; Silverman, Edwin K; Vestbo, Jørgen; Miller, Bruce E; Bakke, Per; Celli, Bartolomé; Calverley, Peter M A; Coxson, Harvey; Crim, Courtney; Edwards, Lisa D; Lomas, David A; MacNee, William; Wouters, Emiel F M; Yates, Julie C; Coca, Ignacio; Agustí, Alvar
2015-03-01
Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease that likely includes clinically relevant subgroups. To identify subgroups of COPD in ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) subjects using cluster analysis and to assess clinically meaningful outcomes of the clusters during 3 years of longitudinal follow-up. Factor analysis was used to reduce 41 variables determined at recruitment in 2,164 patients with COPD to 13 main factors, and the variables with the highest loading were used for cluster analysis. Clusters were evaluated for their relationship with clinically meaningful outcomes during 3 years of follow-up. The relationships among clinical parameters were evaluated within clusters. Five subgroups were distinguished using cross-sectional clinical features. These groups differed regarding outcomes. Cluster A included patients with milder disease and had fewer deaths and hospitalizations. Cluster B had less systemic inflammation at baseline but had notable changes in health status and emphysema extent. Cluster C had many comorbidities, evidence of systemic inflammation, and the highest mortality. Cluster D had low FEV1, severe emphysema, and the highest exacerbation and COPD hospitalization rate. Cluster E was intermediate for most variables and may represent a mixed group that includes further clusters. The relationships among clinical variables within clusters differed from that in the entire COPD population. Cluster analysis using baseline data in ECLIPSE identified five COPD subgroups that differ in outcomes and inflammatory biomarkers and show different relationships between clinical parameters, suggesting the clusters represent clinically and biologically different subtypes of COPD.
Suicide in the oldest old: an observational study and cluster analysis.
Sinyor, Mark; Tan, Lynnette Pei Lin; Schaffer, Ayal; Gallagher, Damien; Shulman, Kenneth
2016-01-01
The older population is at high risk for suicide. This study sought to learn more about the characteristics of suicide in the oldest-old and to use a cluster analysis to determine if oldest-old suicide victims assort into clinically meaningful subgroups. Data were collected from a coroner's chart review of suicide victims in Toronto from 1998 to 2011. We compared two age groups (65-79 year olds, n = 335, and 80+ year olds, n = 191) and then conducted a hierarchical agglomerative cluster analysis using Ward's method to identify distinct clusters in the 80+ group. The younger and older age groups differed according to marital status, living circumstances and pattern of stressors. The cluster analysis identified three distinct clusters in the 80+ group. Cluster 1 was the largest (n = 124) and included people who were either married or widowed who had significantly more depression and somewhat more medical health stressors. In contrast, cluster 2 (n = 50) comprised people who were almost all single and living alone with significantly less identified depression and slightly fewer medical health stressors. All members of cluster 3 (n = 17) lived in a retirement residence or nursing home, and this group had the highest rates of depression, dementia, other mental illness and past suicide attempts. This is the first study to use the cluster analysis technique to identify meaningful subgroups among suicide victims in the oldest-old. The results reveal different patterns of suicide in the older population that may be relevant for clinical care. Copyright © 2015 John Wiley & Sons, Ltd.
Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.
Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc
2018-01-01
In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.
ERIC Educational Resources Information Center
Zettergren, Peter
2007-01-01
A modern clustering technique was applied to age-10 and age-13 sociometric data with the purpose of identifying longitudinally stable peer status clusters. The study included 445 girls from a Swedish longitudinal study. The identified temporally stable clusters of rejected, popular, and average girls were essentially larger than corresponding…
Identification and characterization of near-fatal asthma phenotypes by cluster analysis.
Serrano-Pariente, J; Rodrigo, G; Fiz, J A; Crespo, A; Plaza, V
2015-09-01
Near-fatal asthma (NFA) is a heterogeneous clinical entity and several profiles of patients have been described according to different clinical, pathophysiological and histological features. However, there are no previous studies that identify different phenotypes of NFA in an unbiased way, using statistical methods such as cluster analysis. Therefore, the aim of the present study was to identify and to characterize phenotypes of near-fatal asthma using a cluster analysis. Over a period of 2 years, 33 Spanish hospitals enrolled 179 asthmatics admitted for an episode of NFA. A cluster analysis using a two-step algorithm was performed on data from 84 of these cases. The analysis defined three clusters of patients with NFA: cluster 1, the largest, including older patients with clinical and therapeutic criteria of severe asthma; cluster 2, with a high proportion of respiratory arrest (68%), impaired consciousness level (82%) and mechanical ventilation (93%); and cluster 3, which included younger patients, characterized by an insufficient anti-inflammatory treatment and frequent sensitization to Alternaria alternata and soybean. These results identify specific asthma phenotypes involved in NFA, confirming in part previous findings observed in studies with a clinical approach. The identification of patients with a specific NFA phenotype could suggest interventions to prevent future severe asthma exacerbations. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.
Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo
2018-07-01
Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiologies and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". These patients had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12 months occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, the hazard ratio for death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
DiStefano, Christine; Kamphaus, R. W.
2006-01-01
Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…
Psychosocial Costs of Racism to Whites: Exploring Patterns through Cluster Analysis
ERIC Educational Resources Information Center
Spanierman, Lisa B.; Poteat, V. Paul; Beer, Amanda M.; Armstrong, Patrick Ian
2006-01-01
Participants (230 White college students) completed the Psychosocial Costs of Racism to Whites (PCRW) Scale. Using cluster analysis, we identified 5 distinct cluster groups on the basis of PCRW subscale scores: the unempathic and unaware cluster contained the lowest empathy scores; the insensitive and afraid cluster consisted of low empathy and…
Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein
2014-11-01
Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
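The two-stage procedure described above (a hierarchical cluster analysis followed by a K-means cluster analysis) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name a linkage rule, so centroid linkage is assumed here, and the data are synthetic two-dimensional points rather than food frequency scores. All function names are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def hierarchical_groups(points, k):
    """Agglomerative clustering (centroid linkage, an assumption) stopped at k groups."""
    groups = [[p] for p in points]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = dist2(centroid(groups[i]), centroid(groups[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # merge the closest pair
        del groups[j]
    return groups


def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm started from the given seed centers."""
    centers = [list(c) for c in seeds]  # copy so the seeds are not mutated
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers


# Synthetic data: three well-separated blobs of 10 points each.
random.seed(0)
blobs = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [[cx + random.uniform(-0.5, 0.5), cy + random.uniform(-0.5, 0.5)]
          for cx, cy in blobs for _ in range(10)]

# Stage 1 suggests the groups; stage 2 refines them with K-means.
seeds = [centroid(g) for g in hierarchical_groups(points, 3)]
labels, centers = kmeans(points, seeds)
```

Seeding K-means with the hierarchical centroids removes the dependence on random starting centers, which is one common reason for combining the two methods.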
Lee, Alexandra J; Chang, Ivan; Burel, Julie G; Lindestam Arlehamn, Cecilia S; Mandava, Aishwarya; Weiskopf, Daniela; Peters, Bjoern; Sette, Alessandro; Scheuermann, Richard H; Qian, Yu
2018-04-17
Computational methods for identification of cell populations from polychromatic flow cytometry data are changing the paradigm of cytometry bioinformatics. Data clustering is the most common computational approach to unsupervised identification of cell populations from multidimensional cytometry data. However, interpretation of the identified data clusters is labor-intensive. Certain types of user-defined cell populations are also difficult to identify by fully automated data clustering analysis. Both are roadblocks before a cytometry lab can adopt the data clustering approach for cell population identification in routine use. We found that combining recursive data filtering and clustering with constraints converted from the user manual gating strategy can effectively address these two issues. We named this new approach DAFi: Directed Automated Filtering and Identification of cell populations. Design of DAFi preserves the data-driven characteristics of unsupervised clustering for identifying novel cell subsets, but also makes the results interpretable to experimental scientists through mapping and merging the multidimensional data clusters into the user-defined two-dimensional gating hierarchy. The recursive data filtering process in DAFi helped identify small data clusters which are otherwise difficult to resolve by a single run of the data clustering method due to the statistical interference of the irrelevant major clusters. Our experiment results showed that the proportions of the cell populations identified by DAFi, while being consistent with those by expert centralized manual gating, have smaller technical variances across samples than those from individual manual gating analysis and the nonrecursive data clustering analysis. Compared with manual gating segregation, DAFi-identified cell populations avoided the abrupt cut-offs on the boundaries. 
DAFi has been implemented to be used with multiple data clustering methods including K-means, FLOCK, FlowSOM, and the ClusterR package. For cell population identification, DAFi supports multiple options including clustering, bisecting, slope-based gating, and reversed filtering to meet various autogating needs from different scientific use cases. © 2018 International Society for Advancement of Cytometry.
Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J
2012-01-01
Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.
Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres
2018-02-19
Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality, was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.
Ecological tolerances of Miocene larger benthic foraminifera from Indonesia
NASA Astrophysics Data System (ADS)
Novak, Vibor; Renema, Willem
2018-01-01
To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis which includes environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples; clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 with dominance of BW samples, and cluster 5 showing a mixed composition with both MCS and BW samples. The results of cluster analysis were afterwards subjected to indicator species analysis resulting in the interpretation that generated three groups among LBF taxa: typical assemblage indicators, regularly occurring taxa and rare taxa. By interpreting the results of detrended correspondence analysis, cluster analysis and indicator species analysis, along with environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.
Won, Jong Chul; Im, Yong-Jin; Lee, Ji-Hyun; Kim, Chong Hwa; Kwon, Hyuk Sang; Cha, Bong-Yun; Park, Tae Sun
2017-01-01
Diabetic peripheral neuropathy (DPN) is the most common complication of diabetes. However, patients usually suffer not only from diverse sensory deficits but also from neuropathy-related discomforts. The aim of this study is to identify distinct groups of patients with DPN with respect to its clinical impacts on symptom patterns and comorbidities. A hierarchical cluster analysis and factor analysis were performed to identify relevant subgroups of patients with DPN (n = 1338) and symptom patterns. Patients with DPN were divided into three clusters: asymptomatic (cluster 1, n = 448, 33.5%), moderate symptoms with disturbed sleep (cluster 2, n = 562, 42.0%), and severe symptoms with decreased quality of life (cluster 3, n = 328, 24.5%). Patients in cluster 3, compared with clusters 1 and 2, were characterized by higher levels of HbA1c and more severe pain and physical impairments. Patients in cluster 2 had moderate pain levels but disturbed sleep patterns comparable to those in cluster 3. The frequency of symptoms on each item of MNSI by "painful" symptom pattern showed a similar distribution pattern with increasing intensities along the three clusters. Cluster and factor analysis endorsed the use of comprehensive and symptomatic subgrouping to individualize the evaluation of patients with DPN.
Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M
2016-09-01
To study the distinct clinical phenotypes of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow (PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). The subjects' distinct clinical phenotypes were identified by hierarchical cluster analysis and two-step cluster analysis. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma.
Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow (MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar volume (DLCO/VA)% pred, residual volume (RV)% pred, total serum IgE level, smoking history (pack-years), St. George's respiratory questionnaire (SGRQ) score, acute exacerbation in the past one year, PEF variability and allergic dermatitis (P<0.05). (2) Four clusters were also identified by two-step cluster analysis, as follows: cluster 1, COPD patients with moderate to severe airflow limitation; cluster 2, asthma and COPD patients with heavy smoking, airflow limitation and increased airways reversibility; cluster 3, patients having less smoking and normal pulmonary function with wheezing but no chronic cough; cluster 4, chronic bronchitis patients with normal pulmonary function and chronic cough. Significant differences were revealed regarding gender distribution, respiratory symptoms, pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, MMEF% pred, DLCO/VA% pred, RV% pred, PEF variability, total serum IgE level, cumulative tobacco cigarette consumption (pack-years), and SGRQ score (P<0.05). Different cluster analyses identify distinct clinical phenotypes of chronic airway diseases. These phenotypes may guide doctors in providing individualized treatments.
Sputum neutrophil counts are associated with more severe asthma phenotypes using cluster analysis.
Moore, Wendy C; Hastie, Annette T; Li, Xingnan; Li, Huashi; Busse, William W; Jarjour, Nizar N; Wenzel, Sally E; Peters, Stephen P; Meyers, Deborah A; Bleecker, Eugene R
2014-06-01
Clinical cluster analysis from the Severe Asthma Research Program (SARP) identified 5 asthma subphenotypes that represent the severity spectrum of early-onset allergic asthma, late-onset severe asthma, and severe asthma with chronic obstructive pulmonary disease characteristics. Analysis of induced sputum from a subset of SARP subjects showed 4 sputum inflammatory cellular patterns. Subjects with concurrent increases in eosinophil (≥2%) and neutrophil (≥40%) percentages had characteristics of very severe asthma. To better understand interactions between inflammation and clinical subphenotypes, we integrated inflammatory cellular measures and clinical variables in a new cluster analysis. Participants in SARP who underwent sputum induction at 3 clinical sites were included in this analysis (n = 423). Fifteen variables, including clinical characteristics and blood and sputum inflammatory cell assessments, were selected using factor analysis for unsupervised cluster analysis. Four phenotypic clusters were identified. Cluster A (n = 132) and B (n = 127) subjects had mild-to-moderate early-onset allergic asthma with paucigranulocytic or eosinophilic sputum inflammatory cell patterns. In contrast, these inflammatory patterns were present in only 7% of cluster C (n = 117) and D (n = 47) subjects who had moderate-to-severe asthma with frequent health care use despite treatment with high doses of inhaled or oral corticosteroids and, in cluster D, reduced lung function. The majority of these subjects (>83%) had sputum neutrophilia either alone or with concurrent sputum eosinophilia. Baseline lung function and sputum neutrophil percentages were the most important variables determining cluster assignment. 
This multivariate approach identified 4 asthma subphenotypes representing the severity spectrum from mild-to-moderate allergic asthma with minimal or eosinophil-predominant sputum inflammation to moderate-to-severe asthma with neutrophil-predominant or mixed granulocytic inflammation. Published by Mosby, Inc.
Sputum neutrophils are associated with more severe asthma phenotypes using cluster analysis
Moore, Wendy C.; Hastie, Annette T.; Li, Xingnan; Li, Huashi; Busse, William W.; Jarjour, Nizar N.; Wenzel, Sally E.; Peters, Stephen P.; Meyers, Deborah A.; Bleecker, Eugene R.
2013-01-01
Background Clinical cluster analysis from the Severe Asthma Research Program (SARP) identified five asthma subphenotypes that represent the severity spectrum of early onset allergic asthma, late onset severe asthma and severe asthma with COPD characteristics. Analysis of induced sputum from a subset of SARP subjects showed four sputum inflammatory cellular patterns. Subjects with concurrent increases in eosinophils (≥2%) and neutrophils (≥40%) had characteristics of very severe asthma. Objective To better understand interactions between inflammation and clinical subphenotypes we integrated inflammatory cellular measures and clinical variables in a new cluster analysis. Methods Participants in SARP at three clinical sites who underwent sputum induction were included in this analysis (n=423). Fifteen variables including clinical characteristics and blood and sputum inflammatory cell assessments were selected by factor analysis for unsupervised cluster analysis. Results Four phenotypic clusters were identified. Cluster A (n=132) and B (n=127) subjects had mild-moderate early onset allergic asthma with paucigranulocytic or eosinophilic sputum inflammatory cell patterns. In contrast, these inflammatory patterns were present in only 7% of Cluster C (n=117) and D (n=47) subjects who had moderate-severe asthma with frequent health care utilization despite treatment with high doses of inhaled or oral corticosteroids, and in Cluster D, reduced lung function. The majority of these subjects (>83%) had sputum neutrophilia either alone or with concurrent sputum eosinophilia. Baseline lung function and sputum neutrophils were the most important variables determining cluster assignment. Conclusion This multivariate approach identified four asthma subphenotypes representing the severity spectrum from mild-moderate allergic asthma with minimal or eosinophilic predominant sputum inflammation to moderate-severe asthma with neutrophilic predominant or mixed granulocytic inflammation.
PMID:24332216
EXPLORING FUNCTIONAL CONNECTIVITY IN FMRI VIA CLUSTERING.
Venkataraman, Archana; Van Dijk, Koene R A; Buckner, Randy L; Golland, Polina
2009-04-01
In this paper we investigate the use of data driven clustering methods for functional connectivity analysis in fMRI. In particular, we consider the K-Means and Spectral Clustering algorithms as alternatives to the commonly used Seed-Based Analysis. To enable clustering of the entire brain volume, we use the Nyström Method to approximate the necessary spectral decompositions. We apply K-Means, Spectral Clustering and Seed-Based Analysis to resting-state fMRI data collected from 45 healthy young adults. Without placing any a priori constraints, both clustering methods yield partitions that are associated with brain systems previously identified via Seed-Based Analysis. Our empirical results suggest that clustering provides a valuable tool for functional connectivity analysis.
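For contrast with the clustering methods, the seed-based baseline mentioned above can be sketched as a correlation between one seed time course and every other voxel. This is a minimal illustration under assumptions not taken from the abstract: the time courses are made-up toy values, the 0.5 threshold is arbitrary, and the names `pearson` and `seed_map` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def seed_map(seed, voxels, threshold=0.5):
    """Indices of voxels whose time course correlates with the seed above the threshold."""
    return [i for i, ts in enumerate(voxels) if pearson(seed, ts) > threshold]


# Toy time courses (hypothetical values, 8 time points each).
seed = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0]
voxels = [
    [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0],  # scaled copy of the seed
    [3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0, 2.0],  # mirror image of the seed
    [1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 2.0, 2.0],  # partially related signal
]
connected = seed_map(seed, voxels)
```

The result depends entirely on which seed is chosen, which is the a priori constraint that the clustering approaches avoid.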
Zhang, Jiang; Liu, Qi; Chen, Huafu; Yuan, Zhen; Huang, Jin; Deng, Lihua; Lu, Fengmei; Zhang, Junpeng; Wang, Yuqing; Wang, Mingwen; Chen, Liangyin
2015-01-01
Clustering analysis methods have been widely applied to identifying the functional brain networks of a multitask paradigm. However, the previously used clustering analysis techniques are computationally expensive and thus impractical for clinical applications. In this study a novel method, called SOM-SAPC, which combines self-organizing mapping (SOM) and supervised affinity propagation clustering (SAPC), is proposed and implemented to identify the motor execution (ME) and motor imagery (MI) networks. In SOM-SAPC, SOM is first performed to process the fMRI data and SAPC is then used to cluster the patterns of functional networks. As a result, SOM-SAPC is able to significantly reduce the computational cost for brain network analysis. Simulation and clinical tests involving ME and MI were conducted based on SOM-SAPC, and the analysis results indicated that functional brain networks were clearly identified with different response patterns and reduced computational cost. In particular, three activation clusters were clearly revealed, which include parts of the visual, ME and MI functional networks. These findings validated that SOM-SAPC is an effective and robust method to analyze multitask fMRI data.
Liu, Shelley H; Li, Yan; Liu, Bian
2018-05-17
Chronic kidney disease is a leading cause of death in the United States. We used cluster analysis to explore patterns of chronic kidney disease in 500 of the largest US cities. After adjusting for socio-demographic characteristics, we found that unhealthy behaviors, prevention measures, and health outcomes related to chronic kidney disease differ between cities in Utah and those in the rest of the United States. Cluster analysis can be useful for identifying geographic regions that may have important policy implications for preventing chronic kidney disease.
A Cluster of Legionella-Associated Pneumonia Cases in a Population of Military Recruits
2007-06-01
this cluster may suggest a previously unrecognized suscep- FIG. 1. Phylogenic analysis of the training center strain (represented by the MCRD consensus...military recruits during population- based surveillance for pneumonia pathogens. Results were confirmed by sequence analysis . Cases cluster tightly...17 April 2007 A Legionella cluster was identified through retrospective PCR analysis of 240 throat swab samples from X-ray-confirmed pneumonia cases
Symptom clusters and quality of life among patients with advanced heart failure
Yu, Doris SF; Chan, Helen YL; Leung, Doris YP; Hui, Elsie; Sit, Janet WH
2016-01-01
Objectives To identify symptom clusters among patients with advanced heart failure (HF) and the independent relationships with their quality of life (QoL). Methods This is a secondary data analysis of a cross-sectional study which interviewed 119 patients with advanced HF in the geriatric unit of a regional hospital in Hong Kong. The symptom profile and QoL were assessed by using the Edmonton Symptom Assessment Scale (ESAS) and the McGill QoL Questionnaire. Exploratory factor analysis was used to identify the symptom clusters. Hierarchical regression analysis was used to examine the independent relationships with their QoL, after adjusting for the effects of age, gender, and comorbidities. Results The patients were at an advanced age (82.9 ± 6.5 years). Three distinct symptom clusters were identified: they were the distress cluster (including shortness of breath, anxiety, and depression), the decondition cluster (fatigue, drowsiness, nausea, and reduced appetite), and the discomfort cluster (pain, and sense of generalized discomfort). These three symptom clusters accounted for 63.25% of variance of the patients' symptom experience. The small to moderate correlations between these symptom clusters indicated that they were rather independent of one another. After adjusting for age, gender and comorbidities, the distress (β = −0.635, P < 0.001), the decondition (β = −0.148, P = 0.01), and the discomfort (β = −0.258, P < 0.001) symptom clusters independently predicted their QoL. Conclusions This study identified the distinctive symptom clusters among patients with advanced HF. The results shed light on the need to develop palliative care interventions for optimizing the symptom control for this life-limiting disease. PMID:27403150
Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.
2017-01-01
Abstract Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification—amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two‐Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure‐Function Linkage Database, SFLD) self‐identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self‐identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well‐curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP‐identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F‐measure and performance analysis on the enolase search results and comparison to GEMMA and SCI‐PHY demonstrate that TuLIP avoids the over‐division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S
2017-04-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification-amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
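The rates quoted above can be read against the standard definitions below. This is a reference sketch only: the abstract does not state the exact formulas the authors used, and in particular does not say how the "maximum false positive rate" was computed, so the symbols TP, FN and FP (true positives, false negatives, false positives) are the usual textbook quantities rather than the paper's own notation.

```latex
\[
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FNR} = \frac{FN}{TP + FN} = 1 - \mathrm{TPR}, \qquad
\mathrm{precision} = \frac{TP}{TP + FP}
\]
\[
F = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{TPR}}{\mathrm{precision} + \mathrm{TPR}}
\]
```

Under these definitions the reported true positive rate of 96% and false negative rate of 4% are complementary, since they sum to 100%.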
Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F
2017-02-01
Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.
Using cluster analysis to organize and explore regional GPS velocities
Simpson, Robert W.; Thatcher, Wayne; Savage, James C.
2012-01-01
Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.
van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim
2017-01-01
In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed a poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
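The silhouette measure quoted in this abstract can be sketched as follows. This is a minimal version of the standard mean silhouette coefficient; the `silhouette_score` helper and both toy datasets are assumptions for illustration and are unrelated to the tinnitus data.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (singletons contribute 0)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical data: two compact, well-separated groups ...
separated = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# ... versus evenly spaced points cut in the middle of a continuum.
continuum = [(0,), (1,), (2,), (3,), (4,), (5,)]
halves = [0, 0, 0, 1, 1, 1]

s_separated = silhouette_score(separated, halves)
s_continuum = silhouette_score(continuum, halves)
print(round(s_separated, 2), round(s_continuum, 2))
```

Values near 1 indicate compact, well-separated clusters, while values near 0 indicate overlapping groups, which is how the reported 0.2 supports a continuum interpretation.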
Using Data Mining Results to Improve Educational Video Game Design
ERIC Educational Resources Information Center
Kerr, Deirdre
2015-01-01
This study uses information about in-game strategy use, identified through cluster analysis of actions in an educational video game, to make data-driven modifications to the game in order to reduce construct-irrelevant behavior. The examination of student strategies identified through cluster analysis indicated that (a) it was common for students…
Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty
2015-11-01
In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.
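The predictive values quoted in this abstract follow from a 2x2 table. The sketch below rebuilds that table from the reported counts (56 of 75 SM cases and 17 of 124 non-SM cases with a FLOCK-identified population); the `diagnostic_metrics` helper is a hypothetical name, and the results should agree with the reported percentages to within about one percentage point of rounding.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged SM cases / all SM cases
        "specificity": tn / (tn + fp),  # unflagged non-SM / all non-SM
        "ppv": tp / (tp + fp),          # SM cases among flagged samples
        "npv": tn / (tn + fn),          # non-SM cases among unflagged samples
    }

# Counts taken from the abstract.
metrics = diagnostic_metrics(tp=56, fn=75 - 56, fp=17, tn=124 - 17)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```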
NASA Astrophysics Data System (ADS)
Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann
2017-07-01
Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.
Clustering of Health Behaviors and Cardiorespiratory Fitness Among U.S. Adolescents.
Hartz, Jacob; Yingling, Leah; Ayers, Colby; Adu-Brimpong, Joel; Rivers, Joshua; Ahuja, Chaarushi; Powell-Wiley, Tiffany M
2018-05-01
Decreased cardiorespiratory fitness (CRF) is associated with an increased risk of cardiovascular disease. However, little is known about how the interaction of diet, physical activity (PA), and sedentary time (ST) affects CRF among adolescents. By using a nationally representative sample of U.S. adolescents, we used cluster analysis to investigate the interactions of these behaviors with CRF. We hypothesized that distinct clustering patterns exist and that less healthy clusters are associated with lower CRF. We used 2003-2004 National Health and Nutrition Examination Survey data for persons aged 12-19 years (N = 1,225). PA and ST were measured objectively by an accelerometer, and the American Heart Association Healthy Diet Score quantified diet quality. Maximal oxygen consumption (V̇O2 max) was measured by submaximal treadmill exercise test. We performed cluster analysis to identify sex-specific clustering of diet, PA, and ST. Adjusting for accelerometer wear time, age, body mass index, race/ethnicity, and the poverty-to-income ratio, we performed sex-stratified linear regression analysis to evaluate the association of cluster with V̇O2 max. Three clusters were identified for girls and boys. For girls, there was no difference across clusters for age (p = .1), weight (p = .3), and BMI (p = .5), and no relationship between clusters and V̇O2 max. For boys, the youngest cluster (p < .01) had three healthy behaviors, weighed less, and was associated with a higher V̇O2 max compared with the two older clusters. We observed clustering of diet, PA, and ST in U.S. adolescents. Specific patterns were associated with lower V̇O2 max for boys, suggesting that our clusters may help identify adolescent boys most in need of interventions. Published by Elsevier Inc.
Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard
2015-01-01
Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean Distance as the similarity measure to determine the clusters. Contingency tables, χ² tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, p<0.001); being on ART (highest proportions for clusters two and three, p=0.012); global distress (F=26.8, p<0.001), physical distress (F=36.3, p<0.001) and psychological distress subscale (F=21.8, p<0.001) (all subscales worst for cluster one, best for cluster four). The greatest burden is associated with cluster one, and should be prioritised in clinical management. Further symptom cluster research in people living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.
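Hierarchical clustering with Ward's method on squared Euclidean distances, as named in this abstract, can be sketched with the Lance-Williams update formula. The `ward_clusters` helper and the two-dimensional toy scores are assumptions for illustration; the study's own software and symptom data are not reproduced here.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering using Ward's criterion on squared Euclidean distances."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Every observation starts as its own cluster.
    members = {i: [i] for i in range(len(points))}
    dist = {
        (i, j): sq_dist(points[i], points[j])
        for i in members for j in members if i < j
    }
    next_id = len(points)
    while len(members) > n_clusters:
        # Merge the pair of clusters with the smallest Ward distance.
        (i, j), d_ij = min(dist.items(), key=lambda kv: kv[1])
        ni, nj = len(members[i]), len(members[j])
        merged = members.pop(i) + members.pop(j)
        for k in members:
            nk = len(members[k])
            d_ki = dist.pop((min(k, i), max(k, i)))
            d_kj = dist.pop((min(k, j), max(k, j)))
            # Lance-Williams update for Ward's method.
            dist[(k, next_id)] = (
                (ni + nk) * d_ki + (nj + nk) * d_kj - nk * d_ij
            ) / (ni + nj + nk)
        del dist[(i, j)]
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(m) for m in members.values())

# Hypothetical two-dimensional symptom scores for seven patients.
patients = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9),
    (9.0, 0.2), (9.1, 0.0),
]
groups = ward_clusters(patients, 3)
print(groups)
```

Ward's criterion merges, at each step, the two clusters whose union gives the smallest increase in within-cluster sum of squares, which is why it is paired with squared rather than plain Euclidean distance.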
Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug
2011-11-01
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published gene lists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts, and it identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
Clinical phenotypes and survival of pre-capillary pulmonary hypertension in systemic sclerosis.
Launay, David; Montani, David; Hassoun, Paul M; Cottin, Vincent; Le Pavec, Jérôme; Clerson, Pierre; Sitbon, Olivier; Jaïs, Xavier; Savale, Laurent; Weatherald, Jason; Sobanski, Vincent; Mathai, Stephen C; Shafiq, Majid; Cordier, Jean-François; Hachulla, Eric; Simonneau, Gérald; Humbert, Marc
2018-01-01
Pre-capillary pulmonary hypertension (PH) in systemic sclerosis (SSc) is a heterogeneous condition with an overall poor prognosis. The objective of this study was to identify and characterize homogeneous phenotypes by a cluster analysis in SSc patients with PH. Patients were identified from two prospective cohorts from the US and France. Clinical, pulmonary function, high-resolution chest tomography, hemodynamic and survival data were extracted. We performed cluster analysis using the k-means method and compared survival between clusters using Cox regression analysis. Cluster analysis of 200 patients identified four homogeneous phenotypes. Cluster C1 included patients with mild to moderate risk pulmonary arterial hypertension (PAH) with limited or no interstitial lung disease (ILD) and low DLCO with a 3-year survival of 81.5% (95% CI: 71.4-88.2). C2 had pre-capillary PH due to extensive ILD and worse 3-year survival compared to C1 (adjusted hazard ratio [HR] 3.14; 95% CI 1.66-5.94; p = 0.0004). C3 had severe PAH and a trend towards worse survival (HR 2.53; 95% CI 0.99-6.49; p = 0.052). Cluster C4 and C1 were similar with no difference in survival (HR 0.65; 95% CI 0.19-2.27, p = 0.507) but with a higher DLCO in C4. PH in SSc can be characterized into distinct clusters that differ in prognosis.
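The hazard ratios, confidence intervals and p-values reported in this abstract are linked through the standard error of the log hazard ratio. The sketch below recovers an approximate two-sided Wald p-value from each reported HR and 95% CI. It assumes the intervals are symmetric on the log scale, and the `p_from_hr_ci` helper is a hypothetical name; small differences from the published p-values are expected because the inputs are rounded.

```python
import math

def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p-value implied by a hazard ratio and its 95% CI."""
    # The CI spans log(HR) +/- z_crit * SE, so its log-width gives the SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported in the abstract, each compared with cluster C1.
p_c2 = p_from_hr_ci(3.14, 1.66, 5.94)
p_c3 = p_from_hr_ci(2.53, 0.99, 6.49)
p_c4 = p_from_hr_ci(0.65, 0.19, 2.27)
print(f"C2: {p_c2:.4f}  C3: {p_c3:.3f}  C4: {p_c4:.3f}")
```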
Snell, Deborah L; Surgenor, Lois J; Hay-Smith, E Jean C; Williman, Jonathan; Siegert, Richard J
2015-01-01
Outcomes after mild traumatic brain injury (MTBI) vary, with slow or incomplete recovery for a significant minority. This study examines whether groups of cases with shared psychological factors but with different injury outcomes could be identified using cluster analysis. This is a prospective observational study following 147 adults presenting to a hospital-based emergency department or concussion services in Christchurch, New Zealand. This study examined associations between baseline demographic, clinical, psychological variables (distress, injury beliefs and symptom burden) and outcome 6 months later. A two-step approach to cluster analysis was applied (Ward's method to identify clusters, K-means to refine results). Three meaningful clusters emerged (high-adapters, medium-adapters, low-adapters). Baseline cluster-group membership was significantly associated with outcomes over time. High-adapters appeared recovered by 6-weeks and medium-adapters revealed improvements by 6-months. The low-adapters continued to endorse many symptoms, negative recovery expectations and distress, being significantly at risk for poor outcome more than 6-months after injury (OR (good outcome) = 0.12; CI = 0.03-0.53; p < 0.01). Cluster analysis supported the notion that groups could be identified early post-injury based on psychological factors, with group membership associated with differing outcomes over time. Implications for clinical care providers regarding therapy targets and cases that may benefit from different intensities of intervention are discussed.
Obstructive Sleep Apnea: A Cluster Analysis at Time of Diagnosis
Grillet, Yves; Richard, Philippe; Stach, Bruno; Vivodtzev, Isabelle; Timsit, Jean-Francois; Lévy, Patrick; Tamisier, Renaud; Pépin, Jean-Louis
2016-01-01
Background The classification of obstructive sleep apnea is on the basis of sleep study criteria that may not adequately capture disease heterogeneity. Improved phenotyping may improve prognosis prediction and help select therapeutic strategies. Objectives: This study used cluster analysis to investigate the clinical clusters of obstructive sleep apnea. Methods An ascending hierarchical cluster analysis was performed on baseline symptoms, physical examination, risk factor exposure and co-morbidities from 18,263 participants in the OSFP (French national registry of sleep apnea). The probability for criteria to be associated with a given cluster was assessed using odds ratios, determined by univariate logistic regression. Results: Six clusters were identified, in which patients varied considerably in age, sex, symptoms, obesity, co-morbidities and environmental risk factors. The main significant differences between clusters were minimally symptomatic versus sleepy obstructive sleep apnea patients, lean versus obese, and among obese patients different combinations of co-morbidities and environmental risk factors. Conclusions Our cluster analysis identified six distinct clusters of obstructive sleep apnea. Our findings underscore the high degree of heterogeneity that exists within obstructive sleep apnea patients regarding clinical presentation, risk factors and consequences. This may help in both research and clinical practice for validating new prevention programs, in diagnosis and in decisions regarding therapeutic strategies. PMID:27314230
Vellone, Ercole; Fida, Roberta; Ghezzi, Valerio; D'Agostino, Fabio; Biagioli, Valentina; Paturzo, Marco; Strömberg, Anna; Alvaro, Rosaria; Jaarsma, Tiny
Self-care is important in heart failure (HF) treatment, but patients may have difficulties and be inconsistent in its performance. Inconsistencies in self-care behaviors may mirror patterns of self-care in HF patients that are worth identifying to provide interventions tailored to patients. The aims of this study are to identify clusters of HF patients in relation to self-care behaviors and to examine and compare the profile of each HF patient cluster considering the patient's sociodemographics, clinical variables, quality of life, and hospitalizations. This was a secondary analysis of data from a cross-sectional study in which we enrolled 1192 HF patients across Italy. A cluster analysis was used to identify clusters of patients based on the European Heart Failure Self-care Behaviour Scale factor scores. Analysis of variance and χ² test were used to examine the characteristics of each cluster. Patients were 72.4 years old on average, and 58% were men. Four clusters of patients were identified: (1) high consistent adherence with high consulting behaviors, characterized by younger patients, with higher formal education and higher income, less clinically compromised, with the best physical and mental quality of life (QOL) and lowest hospitalization rates; (2) low consistent adherence with low consulting behaviors, characterized mainly by male patients, with lower formal education and lowest income, more clinically compromised, and worse mental QOL; (3) inconsistent adherence with low consulting behaviors, characterized by patients who were less likely to have a caregiver, with the longest illness duration, the highest number of prescribed medications, and the best mental QOL; (4) and inconsistent adherence with high consulting behaviors, characterized by patients who were mostly female, with lower formal education, worst cognitive impairment, worst physical and mental QOL, and higher hospitalization rates. The 4 clusters identified in this study and their associated characteristics could be used to tailor interventions aimed at improving self-care behaviors in HF patients.
Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques
NASA Astrophysics Data System (ADS)
Gulgundi, Mohammad Shahid; Shetty, Amba
2018-03-01
Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of the Bengaluru city using multivariate statistical techniques. Water quality index rating was calculated for pre and post monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poorer quality for drinking purposes compared to pre-monsoon. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4% and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock water interactions in the aquifer, effect of anthropogenic activities and ion exchange processes in water.
ERIC Educational Resources Information Center
Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.
2008-01-01
Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
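To make the idea of comparing finite mixture models by the Bayesian information criterion concrete, here is a deliberately simplified sketch. It fits univariate (not multivariate) Gaussian mixtures by EM and compares one component against two. The data, the starting means and the helper names `fit_mixture` and `bic` are assumptions for illustration only.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_mixture(data, init_means, n_iter=200):
    """EM for a 1-D Gaussian mixture; returns the final log-likelihood."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / nc
            variances[c] = max(var, 1e-6)  # floor to avoid a degenerate component
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; in this sign convention, lower is better."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical measurements drawn from two visibly different subpopulations.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 5.0, 5.2, 4.8, 5.1, 4.9, 5.3, 4.7]

# A k-component 1-D mixture has 3k - 1 free parameters (means, variances, weights).
bic_one = bic(fit_mixture(data, [3.0]), 2, len(data))
bic_two = bic(fit_mixture(data, [1.0, 5.0]), 5, len(data))
print(round(bic_one, 1), round(bic_two, 1))
```

Because the criterion penalises each extra parameter by log(n), the two-component model is preferred only when its gain in likelihood outweighs that penalty, which is what allows nonnested models to be compared on one scale.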
Jabson, Jennifer M.; Bowen, Deborah; Weinberg, Janice; Kroenke, Candyce; Luo, Juhua; Messina, Catherine; Shumaker, Sally; Tindle, Hilary A.
2016-01-01
BACKGROUND Strategies for identifying the most relevant psychosocial predictors in studies of racial/ethnic minority women’s health are limited because they largely exclude cultural influences and they assume that psychosocial predictors are independent. This paper proposes and tests an empirical solution. METHODS Hierarchical cluster analysis, conducted with data from 140,652 Women’s Health Initiative participants, identified clusters among individual psychosocial predictors. Multivariable analyses tested associations between clusters and health outcomes. RESULTS A Social Cluster and a Stress Cluster were identified. The Social Cluster was positively associated with well-being and inversely associated with chronic disease index, and the Stress Cluster was inversely associated with well-being and positively associated with chronic disease index. As hypothesized, the magnitude of association between clusters and outcomes differed by race/ethnicity. CONCLUSIONS By identifying psychosocial clusters and their associations with health, we have taken an important step toward understanding how individual psychosocial predictors interrelate and how empirically formed Stress and Social clusters relate to health outcomes. This study has also demonstrated important insight about differences in associations between these psychosocial clusters and health among racial/ethnic minorities. These differences could signal the best pathways for intervention modification and tailoring. PMID:27279761
Impact of Sampling Density on the Extent of HIV Clustering
Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor
2014-01-01
Abstract Identifying and monitoring HIV clusters could be useful in tracking the leading edge of HIV transmission in epidemics. Currently, greater specificity in the definition of HIV clusters is needed to reduce confusion in the interpretation of HIV clustering results. We address sampling density as one of the key aspects of HIV cluster analysis. The proportion of viral sequences in clusters was estimated at sampling densities from 1.0% to 70%. A set of 1,248 HIV-1C env gp120 V1C5 sequences from a single community in Botswana was utilized in simulation studies. Matching numbers of HIV-1C V1C5 sequences from the LANL HIV Database were used as comparators. HIV clusters were identified by phylogenetic inference under bootstrapped maximum likelihood and pairwise distance cut-offs. Sampling density below 10% was associated with stochastic HIV clustering with broad confidence intervals. HIV clustering increased linearly at sampling density >10%, and was accompanied by narrowing confidence intervals. Patterns of HIV clustering were similar at bootstrap thresholds 0.7 to 1.0, but the extent of HIV clustering decreased with higher bootstrap thresholds. The origin of sampling (local concentrated vs. scattered global) had a substantial impact on HIV clustering at sampling densities ≥10%. Pairwise distances at 10% were estimated as a threshold for cluster analysis of HIV-1 V1C5 sequences. The node bootstrap support distribution provided additional evidence for 10% sampling density as the threshold for HIV cluster analysis. The detectability of HIV clusters is substantially affected by sampling density. A minimal genotyping density of 10% and sampling density of 50–70% are suggested for HIV-1 V1C5 cluster analysis. PMID:25275430
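The dependence of detectable clustering on sampling density can be illustrated with a toy simulation. It does not perform phylogenetic inference: it assumes true cluster membership is known and counts a sampled sequence as clustered when at least one other member of its cluster is also sampled. The cluster size of 3, the random seed and the `clustered_fraction` helper are all assumptions.

```python
import random

def clustered_fraction(cluster_ids, density, rng):
    """Fraction of sampled sequences with at least one sampled cluster partner."""
    n_sample = max(1, round(density * len(cluster_ids)))
    sample = rng.sample(range(len(cluster_ids)), n_sample)
    # Count how many sampled sequences fall in each true cluster.
    counts = {}
    for idx in sample:
        cid = cluster_ids[idx]
        counts[cid] = counts.get(cid, 0) + 1
    in_cluster = sum(1 for idx in sample if counts[cluster_ids[idx]] >= 2)
    return in_cluster / n_sample

# Hypothetical population of 1,248 sequences in true clusters of size 3.
cluster_ids = [i // 3 for i in range(1248)]
rng = random.Random(0)
fractions = {
    d: clustered_fraction(cluster_ids, d, rng)
    for d in (0.01, 0.1, 0.3, 0.5, 0.7)
}
for density, frac in fractions.items():
    print(f"sampling density {density:.0%}: {frac:.0%} of sampled sequences clustered")
```

Even in this idealised setting most true clusters are invisible at low sampling density, because a sequence can only appear clustered if its partners happen to be sampled too.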
Stynes, Siobhán; Konstantinou, Kika; Ogollah, Reuben; Hay, Elaine M; Dunn, Kate M
2018-04-01
Traditionally, low back-related leg pain (LBLP) is diagnosed clinically as referred leg pain or sciatica (nerve root involvement). However, within the spectrum of LBLP, we hypothesised that there may be other unrecognised patient subgroups. This study aimed to identify clusters of patients with LBLP using latent class analysis and describe their clinical course. The study population was 609 LBLP primary care consulters. Variables from clinical assessment were included in the latent class analysis. Characteristics of the statistically identified clusters were compared, and their clinical course over 1 year was described. A 5 cluster solution was optimal. Cluster 1 (n = 104) had mild leg pain severity and was considered to represent a referred leg pain group with no clinical signs, suggesting nerve root involvement (sciatica). Cluster 2 (n = 122), cluster 3 (n = 188), and cluster 4 (n = 69) had mild, moderate, and severe pain and disability, respectively, and response to clinical assessment items suggested categories of mild, moderate, and severe sciatica. Cluster 5 (n = 126) had high pain and disability, longer pain duration, and more comorbidities and was difficult to map to a clinical diagnosis. Most improvement for pain and disability was seen in the first 4 months for all clusters. At 12 months, the proportion of patients reporting recovery ranged from 27% for cluster 5 to 45% for cluster 2 (mild sciatica). This is the first study that empirically shows the variability in profile and clinical course of patients with LBLP including sciatica. More homogenous groups were identified, which could be considered in future clinical and research settings.
VanderKnyff, Jeremy; Friedman, Daniela B; Tanner, Andrea
2015-01-01
Using a sample of YouTube videos posted on the YouTube channels of organ procurement organizations, a content analysis was conducted to identify the frames used to strategically communicate prodonation messages. A total of 377 videos were coded for general characteristics, format, speaker characteristics, organs discussed, structure, problem definition, and treatment. Principal components analysis identified message frames, and k-means cluster analysis established distinct groupings of videos on the basis of the strength of their relationship to message frames. Analysis of these frames and clusters found that organ procurement organizations present multiple, and sometimes competing, video types and message frames on YouTube. This study serves as important formative research that will inform future studies to measure the effectiveness of the distinct message frames and clusters identified.
McKenna, J.E.
2003-01-01
The biosphere is filled with complex living patterns and important questions about biodiversity and community and ecosystem ecology are concerned with structure and function of multispecies systems that are responsible for those patterns. Cluster analysis identifies discrete groups within multivariate data and is an effective method of coping with these complexities, but often suffers from subjective identification of groups. The bootstrap testing method greatly improves objective significance determination for cluster analysis. The BOOTCLUS program makes cluster analysis that reliably identifies real patterns within a data set more accessible and easier to use than previously available programs. A variety of analysis options and rapid re-analysis provide a means to quickly evaluate several aspects of a data set. Interpretation is influenced by sampling design and a priori designation of samples into replicate groups, and ultimately relies on the researcher's knowledge of the organisms and their environment. However, the BOOTCLUS program provides reliable, objectively determined groupings of multivariate data.
Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D
2014-01-01
Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. Therefore, it is usually difficult to identify the key features of the properties of the system, and subsequently all the critical components within the system for a given purpose of design or control. One way is, however, to more explicitly visualize the network structure and interactions between components by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of the most critical components, and provides similar criticality prioritizations of them in a more computationally efficient manner.
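One common way to quantify "stronger internal connections than external connections" is Newman's modularity. The abstract does not state which measure the authors used, so the sketch below is an assumed stand-in rather than their method. The toy pipe network and the `modularity` helper are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # number of edges with both ends inside each community
    degree = {}    # sum of node degrees within each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )

# Hypothetical network: two looped districts joined by a single pipe (c-d).
pipes = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_districts = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_district = {node: 0 for node in two_districts}

q_two = modularity(pipes, two_districts)
q_one = modularity(pipes, one_district)
print(round(q_two, 3), round(q_one, 3))
```

A partition scores highly when most pipes lie within clusters and few connect them, so the single connecting pipe in this toy network is the kind of link a vulnerability analysis would examine first.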
Penfold, Robert B; Burgess, James F; Lee, Austin F; Li, Mingfei; Miller, Christopher J; Nealon Seibert, Marjorie; Semla, Todd P; Mohr, David C; Kazis, Lewis E; Bauer, Mark S
2018-02-01
To identify space-time clusters of changes in prescribing aripiprazole for bipolar disorder among providers in the VA. VA administrative data from 2002 to 2010 were used to identify prescriptions of aripiprazole for bipolar disorder. Prescriber characteristics were obtained using the Personnel and Accounting Integrated Database. We conducted a retrospective space-time cluster analysis using the space-time permutation statistic. All VA service users with a diagnosis of bipolar disorder were included in the patient population. Individuals with any schizophrenia spectrum diagnoses were excluded. We also identified all clinicians who wrote a prescription for any bipolar disorder medication. The study population included 32,630 prescribers. Of these, 8,643 wrote qualifying prescriptions. We identified three clusters of aripiprazole prescribing centered in Massachusetts, Ohio, and the Pacific Northwest. Clusters were associated with prescribing by VA-employed (vs. contracted) prescribers. Nurses with prescribing privileges were more likely to make a prescription for aripiprazole in cluster locations compared with psychiatrists. Primary care physicians were less likely. Early prescribing of aripiprazole for bipolar disorder clustered geographically and was associated with prescriber subgroups. These methods support prospective surveillance of practice changes and identification of associated health system characteristics. © Health Research and Educational Trust.
A latent profile analysis of Asian American men's and women's adherence to cultural values.
Wong, Y Joel; Nguyen, Chi P; Wang, Shu-Yi; Chen, Weilin; Steinfeldt, Jesse A; Kim, Bryan S K
2012-07-01
The goal of this study was to identify diverse profiles of Asian American women's and men's adherence to values that are salient in Asian cultures (i.e., conformity to norms, family recognition through achievement, emotional self-control, collectivism, and humility). To this end, the authors conducted a latent profile analysis using the 5 subscales of the Asian American Values Scale-Multidimensional in a sample of 214 Asian Americans. The analysis uncovered a four-cluster solution. In general, Clusters 1 and 2 were characterized by relatively low and moderate levels of adherence to the 5 dimensions of cultural values, respectively. Cluster 3 was characterized by the highest level of adherence to the cultural value of family recognition through achievement, whereas Cluster 4 was typified by the highest levels of adherence to collectivism, emotional self-control, and humility. Clusters 3 and 4 were associated with higher levels of depressive symptoms than Cluster 1. Furthermore, Asian American women and Asian American men had lower odds of being in Cluster 4 and Cluster 3, respectively. These findings attest to the importance of identifying specific patterns of adherence to cultural values when examining the relationship between Asian Americans' cultural orientation and mental health status.
Olson, Ryan; Thompson, Sharon V.; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L.; Anger, W. Kent; Bodner, Todd; Hammer, Leslie B.; Hohn, Elliot; Perrin, Nancy A.
2015-01-01
Objective Our objectives were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health and psychosocial factors. Methods Participants’ (n=452, BMI M=37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior co-variation. Cluster differences were tested with generalized estimating equations. Results Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in BMI. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Conclusions Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures. PMID:26949883
Olson, Ryan; Thompson, Sharon V; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L; Anger, W Kent; Bodner, Todd; Hammer, Leslie B; Hohn, Elliot; Perrin, Nancy A
2016-03-01
The objectives of the study were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health, safety, and psychosocial factors. Participants' (n = 452, body mass index M = 37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior covariation. Cluster differences were tested with generalized estimating equations. Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in body mass index. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures.
A framework to spatially cluster air pollution monitoring sites in US based on the PM2.5 composition
Austin, Elena; Coull, Brent A.; Zanobetti, Antonella; Koutrakis, Petros
2013-01-01
Background Heterogeneity in the response to PM2.5 is hypothesized to be related to differences in particle composition across monitoring sites which reflect differences in source types as well as climatic and topographic conditions impacting different geographic locations. Identifying spatial patterns in particle composition is a multivariate problem that requires novel methodologies. Objectives Use cluster analysis methods to identify spatial patterns in PM2.5 composition. Verify that the resulting clusters are distinct and informative. Methods 109 monitoring sites with 75% reported speciation data during the period 2003–2008 were selected. These sites were categorized based on their average PM2.5 composition over the study period using k-means cluster analysis. The obtained clusters were validated and characterized based on their physico-chemical characteristics, geographic locations, emissions profiles, population density and proximity to major emission sources. Results Overall 31 clusters were identified. These include 21 clusters with 2 or more sites which were further grouped into 4 main types using hierarchical clustering. The resulting groupings are chemically meaningful and represent broad differences in emissions. The remaining clusters, encompassing single sites, were characterized based on their particle composition and geographic location. Conclusions The framework presented here provides a novel tool which can be used to identify and further classify sites based on their PM2.5 composition. The solution presented is fairly robust and yielded groupings that were meaningful in the context of air-pollution research. PMID:23850585
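The k-means step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the composition vectors are invented, the centroids are seeded deterministically from the first k points for reproducibility, and the later hierarchical grouping of clusters is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption for illustration: seed centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k),
                          key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids

# Hypothetical average composition fractions for four monitoring sites.
sites = [(0.60, 0.10, 0.30),
         (0.10, 0.60, 0.30),
         (0.58, 0.12, 0.30),
         (0.12, 0.62, 0.26)]
labels, centroids = kmeans(sites, k=2)
```

With real data the result depends on the seeding, so k-means is usually repeated from several starting points and the best solution kept.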
Knowledge, attitudes towards and acceptability of genetic modification in Germany.
Christoph, Inken B; Bruhn, Maike; Roosen, Jutta
2008-07-01
Genetic modification remains a controversial issue. The aim of this study is to analyse the attitudes towards genetic modification, the knowledge about it and its acceptability in different application areas among German consumers. Results are based on a survey from spring 2005. An exploratory factor analysis is conducted to identify the attitudes towards genetic modification. The identified factors are used in a cluster analysis that identified a cluster of supporters, of opponents and a group of indifferent consumers. Respondents' knowledge of genetics and biotechnology differs among the found clusters without revealing a clear relationship between knowledge and support of genetic modification. The acceptability of genetic modification varies by application area and cluster, and genetically modified non-food products are more widely accepted than food products. The perception of personal health risks has high explanatory power for attitudes and acceptability.
ERIC Educational Resources Information Center
Kerr, Deirdre; Chung, Gregory K. W. K.; Iseli, Markus R.
2011-01-01
Analyzing log data from educational video games has proven to be a challenging endeavor. In this paper, we examine the feasibility of using cluster analysis to extract information from the log files that is interpretable in both the context of the game and the context of the subject area. If cluster analysis can be used to identify patterns of…
Konno, Satoshi; Taniguchi, Natsuko; Makita, Hironi; Nakamaru, Yuji; Shimizu, Kaoruko; Shijubo, Noriharu; Fuke, Satoshi; Takeyabu, Kimihiro; Oguri, Mitsuru; Kimura, Hirokazu; Maeda, Yukiko; Suzuki, Masaru; Nagai, Katsura; Ito, Yoichi M; Wenzel, Sally E; Nishimura, Masaharu
2015-12-01
Smoking may have multifactorial effects on asthma phenotypes, particularly in severe asthma. Cluster analysis has been applied to explore novel phenotypes, which are not based on any a priori hypotheses. To explore novel severe asthma phenotypes by cluster analysis when including cigarette smokers. We recruited a total of 127 subjects with severe asthma, including 59 current or ex-smokers, from our university hospital and its 29 affiliated hospitals/pulmonary clinics. Twelve clinical variables obtained during a 2-day hospital stay were used for cluster analysis. After clustering using clinical variables, the sputum levels of 14 molecules were measured to biologically characterize the clinical clusters. Five clinical clusters were identified, including two characterized by high pack-year exposure to cigarette smoking and low FEV1/FVC. There were marked differences between the two clusters of cigarette smokers. One had high levels of circulating eosinophils, high IgE levels, and a high sinus disease score. The other was characterized by low levels of the same parameters. Sputum analysis revealed increased levels of IL-5 in the former cluster and increased levels of IL-6 and osteopontin in the latter. The other three clusters were similar to those previously reported: young onset/atopic, nonsmoker/less eosinophilic, and female/obese. Key clinical variables were confirmed to be stable and consistent 1 year later. This study reveals two distinct phenotypes of severe asthma in current and former cigarette smokers with potentially different biological pathways contributing to fixed airflow limitation. Clinical trial registered with www.umin.ac.jp (000003254).
The dynamics of cyclone clustering in re-analysis and a high-resolution climate model
NASA Astrophysics Data System (ADS)
Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len
2017-04-01
Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.
Using cluster ensemble and validation to identify subtypes of pervasive developmental disorders.
Shen, Jess J; Lee, Phil-Hyoun; Holden, Jeanette J A; Shatkay, Hagit
2007-10-11
Pervasive Developmental Disorders (PDD) are neurodevelopmental disorders characterized by impairments in social interaction, communication and behavior. Given the diversity and varying severity of PDD, diagnostic tools attempt to identify homogeneous subtypes within PDD. Identifying subtypes can lead to targeted etiology studies and to effective type-specific intervention. Cluster analysis can suggest coherent subsets in data; however, different methods and assumptions lead to different results. Several previous studies applied clustering to PDD data, varying in number and characteristics of the produced subtypes. Most studies used a relatively small dataset (fewer than 150 subjects), and all applied only a single clustering method. Here we study a relatively large dataset (358 PDD patients), using an ensemble of three clustering methods. The results are evaluated using several validation methods, and consolidated through an integration step. Four clusters are identified, analyzed and compared to subtypes previously defined by the widely used diagnostic tool DSM-IV.
Using Cluster Ensemble and Validation to Identify Subtypes of Pervasive Developmental Disorders
Shen, Jess J.; Lee, Phil Hyoun; Holden, Jeanette J.A.; Shatkay, Hagit
2007-01-01
Pervasive Developmental Disorders (PDD) are neurodevelopmental disorders characterized by impairments in social interaction, communication and behavior. Given the diversity and varying severity of PDD, diagnostic tools attempt to identify homogeneous subtypes within PDD. Identifying subtypes can lead to targeted etiology studies and to effective type-specific intervention. Cluster analysis can suggest coherent subsets in data; however, different methods and assumptions lead to different results. Several previous studies applied clustering to PDD data, varying in number and characteristics of the produced subtypes. Most studies used a relatively small dataset (fewer than 150 subjects), and all applied only a single clustering method. Here we study a relatively large dataset (358 PDD patients), using an ensemble of three clustering methods. The results are evaluated using several validation methods, and consolidated through an integration step. Four clusters are identified, analyzed and compared to subtypes previously defined by the widely used diagnostic tool DSM-IV. PMID:18693920
Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups
Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José
2013-01-01
Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674
ClusterViz: A Cytoscape APP for Cluster Analysis of Biological Network.
Wang, Jianxin; Zhong, Jiancheng; Chen, Gang; Li, Min; Wu, Fang-xiang; Pan, Yi
2015-01-01
Cluster analysis of biological networks is one of the most important approaches for identifying functional modules and predicting protein functions. Furthermore, visualization of clustering results is crucial to uncover the structure of biological networks. In this paper, ClusterViz, an APP of Cytoscape 3 for cluster analysis and visualization, has been developed. In order to reduce complexity and enable extensibility for ClusterViz, we designed the architecture of ClusterViz based on the framework of Open Services Gateway Initiative. According to the architecture, the implementation of ClusterViz is partitioned into three modules: the ClusterViz interface, the clustering algorithms, and visualization and export. ClusterViz facilitates the comparison of the results of different algorithms for further related analysis. Three commonly used clustering algorithms, FAG-EC, EAGLE and MCODE, are included in the current version. Because the clustering-algorithm module adopts an abstract interface for algorithms, more clustering algorithms can be included in the future. To illustrate the usability of ClusterViz, we provided three examples with detailed steps from important scientific articles, which show that our tool has helped several research teams do their research work on the mechanism of biological networks.
2011-01-01
Background Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. 
Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755
Dong, Skye T; Costa, Daniel S J; Butow, Phyllis N; Lovell, Melanie R; Agar, Meera; Velikova, Galina; Teckle, Paulos; Tong, Allison; Tebbutt, Niall C; Clarke, Stephen J; van der Hoek, Kim; King, Madeleine T; Fayers, Peter M
2016-01-01
Symptom clusters in advanced cancer can influence patient outcomes. There is large heterogeneity in the methods used to identify symptom clusters. To investigate the consistency of symptom cluster composition in advanced cancer patients using different statistical methodologies for all patients across five primary cancer sites, and to examine which clusters predict functional status, a global assessment of health and global quality of life. Principal component analysis and exploratory factor analysis (with different rotation and factor selection methods) and hierarchical cluster analysis (with different linkage and similarity measures) were used on a data set of 1562 advanced cancer patients who completed the European Organization for the Research and Treatment of Cancer Quality of Life Questionnaire-Core 30. Four clusters consistently formed for many of the methods and cancer sites: tense-worry-irritable-depressed (emotional cluster), fatigue-pain, nausea-vomiting, and concentration-memory (cognitive cluster). The emotional cluster was a stronger predictor of overall quality of life than the other clusters. Fatigue-pain was a stronger predictor of overall health than the other clusters. The cognitive cluster and fatigue-pain predicted physical functioning, role functioning, and social functioning. The four identified symptom clusters were consistent across statistical methods and cancer types, although there were some noteworthy differences. Statistical derivation of symptom clusters is in need of greater methodological guidance. A psychosocial pathway in the management of symptom clusters may improve quality of life. Biological mechanisms underpinning symptom clusters need to be delineated by future research. A framework for evidence-based screening, assessment, treatment, and follow-up of symptom clusters in advanced cancer is essential. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
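The sensitivity to linkage choice noted above can be seen even on a tiny example. The following is a minimal sketch of agglomerative clustering with a selectable linkage rule; the one-dimensional scores are invented and the function name `agglomerate` is illustrative, not taken from the study.

```python
def agglomerate(items, distance, n_clusters, linkage="average"):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns clusters as lists of indices into items.
    """
    clusters = [[i] for i in range(len(items))]

    def cluster_distance(a, b):
        pairwise = [distance(items[i], items[j]) for i in a for j in b]
        if linkage == "single":
            return min(pairwise)
        if linkage == "complete":
            return max(pairwise)
        if linkage == "average":
            return sum(pairwise) / len(pairwise)
        raise ValueError(f"unknown linkage: {linkage}")

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
                   key=lambda xy: cluster_distance(clusters[xy[0]],
                                                   clusters[xy[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical one-dimensional scores with steadily widening gaps.
scores = [0.0, 1.0, 2.2, 3.6, 5.2]

def gap(x, y):
    return abs(x - y)

single = agglomerate(scores, gap, n_clusters=2, linkage="single")
complete = agglomerate(scores, gap, n_clusters=2, linkage="complete")
```

Here single linkage chains the first four scores together and leaves the last one alone, while complete linkage splits the same data into a group of two and a group of three, which is why comparing solutions across linkage rules is a reasonable consistency check.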
Ahmad, Tariq; Desai, Nihar; Wilson, Francis; Schulte, Phillip; Dunning, Allison; Jacoby, Daniel; Allen, Larry; Fiuzat, Mona; Rogers, Joseph; Felker, G Michael; O'Connor, Christopher; Patel, Chetan B
2016-01-01
Classification of acute decompensated heart failure (ADHF) is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies. To derive cluster analysis-based groupings for patients hospitalized with ADHF, and compare their prognostic performance to hemodynamic classifications derived at the bedside. We performed a cluster analysis on baseline clinical variables and PAC measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry). We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data. We identified four advanced HF clusters: 1) male Caucasians with ischemic cardiomyopathy, multiple comorbidities, lowest B-type natriuretic peptide (BNP) levels; 2) females with non-ischemic cardiomyopathy, few comorbidities, most favorable hemodynamics; 3) young African American males with non-ischemic cardiomyopathy, most adverse hemodynamics, advanced disease; and 4) older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70). For all adverse clinical outcomes, Cluster 4 had the highest risk, and Cluster 2, the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not. By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernable relationship to hemodynamic profiles, but distinct associations with adverse outcomes. 
Our analysis suggests that ADHF classification using simultaneous considerations of etiology, comorbid conditions, and biomarker levels, may be superior to bedside classifications.
Jurencák, Roman; Fritzler, Marvin; Tyrrell, Pascal; Hiraki, Linda; Benseler, Susanne; Silverman, Earl
2009-02-01
(1) To evaluate the spectrum of serum autoantibodies in pediatric-onset systemic lupus erythematosus (pSLE) with a focus on ethnic differences; (2) using cluster analysis, to identify patients with similar autoantibody patterns and to determine their clinical associations. A single-center cohort study of all patients with newly diagnosed pSLE seen over an 8-year period was performed. Ethnicity, clinical, and serological data were prospectively collected from 156/169 patients (92%). The frequencies of 10 selected autoantibodies among ethnic groups were compared. Cluster analysis identified groups of patients with similar autoantibody profiles. Associations of these groups with clinical and laboratory features of pSLE were examined. Among our 5 ethnic groups, there were differences only in the prevalence of anti-U1RNP and anti-Sm antibodies, which occurred more frequently in non-Caucasian patients (p < 0.0001, p < 0.01, respectively). Cluster analysis revealed 3 autoantibody clusters. Cluster 1 consisted of anti-dsDNA antibodies. Cluster 2 consisted of anti-dsDNA, antichromatin, antiribosomal P, anti-U1RNP, anti-Sm, anti-Ro and anti-La autoantibody. Cluster 3 consisted of anti-dsDNA, anti-RNP, and anti-Sm autoantibody. The highest proportion of Caucasians was in cluster 1 (p < 0.05), which was characterized by a mild disease with infrequent major organ involvement compared to cluster 2, which had the highest frequency of nephritis, renal failure, serositis, and hemolytic anemia, or cluster 3, which was characterized by frequent neuropsychiatric disease and nephritis. We observed ethnic differences in autoantibody profiles in pSLE. Autoantibodies tended to cluster together and these clusters were associated with different clinical courses.
Identifying Subgroups of Tinnitus Using Novel Resting State fMRI Biomarkers and Cluster Analysis
2017-10-13
Award number: W81XWH-15-2-0032. Distribution unlimited. Abstract: The subject of the project is FY14 PRMRP Topic Area – Tinnitus. The broad goal is…
Finch, Caroline F; Stephan, Karen; Shee, Anna Wong; Hill, Keith; Haines, Terry P; Clemson, Lindy; Day, Lesley
2015-01-01
Background There has been limited research investigating the relationship between injurious falls and hospital resource use. The aims of this study were to identify clusters of community-dwelling older people in the general population who are at increased risk of being admitted to hospital following a fall and how those clusters differed in their use of hospital resources. Methods Analysis of routinely collected hospital admissions data relating to 45 374 fall-related admissions in Victorian community-dwelling older adults aged ≥65 years that occurred during 2008/2009 to 2010/2011. Fall-related admission episodes were identified based on being admitted from a private residence to hospital with a principal diagnosis of injury (International Classification of Diseases (ICD)-10-AM codes S00 to T75) and having a first external cause of a fall (ICD-10-AM codes W00 to W19). A cluster analysis was performed to identify homogeneous groups using demographic details of patients and information on the presence of comorbidities. Hospital length of stay (LOS) was compared across clusters using competing risks regression. Results Clusters based on area of residence, demographic factors (age, gender, marital status, country of birth) and the presence of comorbidities were identified. Clusters representing hospitalised fallers with comorbidities were associated with longer LOS compared with other cluster groups. Clusters delineated by demographic factors were also associated with increased LOS. Conclusions All patients with comorbidity, and older women without comorbidities, stay in hospital longer following a fall and hence consume a disproportionate share of hospital resources. These findings have important implications for the targeting of falls prevention interventions for community-dwelling older people. PMID:25618735
Glatman-Freedman, Aharona; Kaufman, Zalman; Kopel, Eran; Bassal, Ravit; Taran, Diana; Valinsky, Lea; Agmon, Vered; Shpriz, Manor; Cohen, Daniel; Anis, Emilia; Shohat, Tamy
2016-08-01
To enhance timely surveillance of bacterial enteric pathogens, space-time cluster analysis was introduced in Israel in May 2013. Stool isolation data of Salmonella, Shigella, and Campylobacter from patients of a large Health Maintenance Organization were analyzed weekly by ArcGIS and SaTScan, and cluster results were sent promptly to local departments of health (LDOHs). During eighteen months, we identified 52 Shigella sonnei clusters, two Salmonella clusters, and no Campylobacter clusters. S. sonnei clusters lasted from one to 33 days and included three to 30 individuals. Thirty-one (60%) of the S. sonnei clusters were known to LDOHs prior to cluster analysis. Clusters not previously known by the LDOHs prompted epidemiologic investigations. In 31 of the 37 (84%) confirmed clusters, educational institutes (nursery schools, kindergartens, and a primary school) were involved. Cluster analysis demonstrated capability to complement enteric disease surveillance. Scaling up the system can further enhance timely detection and control of outbreaks. Copyright © 2016 The British Infection Association. Published by Elsevier Ltd. All rights reserved.
An effective fuzzy kernel clustering analysis approach for gene expression data.
Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao
2015-01-01
Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering methods to microarray gene expression data is the choice of parameters such as the cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies the desired cluster number and obtains more stable results for gene expression data. First, to optimize characteristic differences and estimate the optimal cluster number, a Gaussian kernel function is introduced to improve the spectrum analysis method (SAM). By combining subtractive clustering with the max-min distance mean, a maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of the improved SAM (ISAM) and MDM are given, and their superiority and stability are illustrated through experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and the UCI database show that the proposed algorithms are feasible for cluster analysis, and that their clustering accuracy is higher than that of the other related clustering algorithms.
van der Molen, Thys; Fletcher, Monica; Price, David
Asthma is a highly heterogeneous disease that can be classified into different clinical phenotypes, and treatment may be tailored accordingly. However, factors beyond purely clinical traits, such as patient attitudes and behaviors, can also have a marked impact on treatment outcomes. The objective of this study was to further analyze data from the REcognise Asthma and LInk to Symptoms and Experience (REALISE) Europe survey, to identify distinct patient groups sharing common attitudes toward asthma and its management. Factor analysis of respondent data (N = 7,930) from the REALISE Europe survey consolidated the 34 attitudinal variables provided by the study population into a set of 8 summary factors. Cluster analyses were used to identify patient clusters that showed similar attitudes and behaviors toward each of the 8 summary factors. Five distinct patient clusters were identified and named according to the key characteristics comprising that cluster: "Confident and self-managing," "Confident and accepting of their asthma," "Confident but dependent on others," "Concerned but confident in their health care professional (HCP)," and "Not confident in themselves or their HCP." Clusters showed clear variability in attributes such as degree of confidence in managing their asthma, use of reliever and preventer medication, and level of asthma control. The 5 patient clusters identified in this analysis displayed distinctly different personal attitudes that would require different approaches in the consultation room certainly for asthma but probably also for other chronic diseases. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Firdausiah Mansur, Andi Besse; Yusof, Norazah
2013-01-01
Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on an e-learning system. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…
Selemetas, Nikolaos; Phelan, Paul; O'Kiely, Padraig; de Waal, Theo
2015-03-19
Fasciolosis caused by Fasciola hepatica is a widespread parasitic disease in cattle farms. The aim of this study was to detect clusters of fasciolosis in dairy cow herds in Munster Province, Ireland and to identify significant climatic and environmental predictors of the exposure risk. In total, 1,292 dairy herds across Munster were sampled in September 2012, each providing a single bulk tank milk (BTM) sample. Analysis of the samples by an in-house antibody-detection enzyme-linked immunosorbent assay (ELISA) showed that 65% of the dairy herds (n = 842) had been exposed to F. hepatica. Using the Getis-Ord Gi* statistic, 16 high-risk and 24 low-risk (P <0.01) clusters of fasciolosis were identified. The spatial distribution of high-risk clusters was more dispersed and mainly located in the northern and western regions of Munster compared to the low-risk clusters that were mostly concentrated in the southern and eastern regions. The most significant classes of variables that could reflect the difference between high-risk and low-risk clusters were the total number of wet-days and rain-days, rainfall, the normalized difference vegetation index (NDVI), temperature and soil type. There was a larger proportion of well-drained soils among the low-risk clusters, whereas poorly drained soils were more common among the high-risk clusters. These results stress the role of precipitation, grazing, temperature and drainage on the life cycle of F. hepatica in the temperate Irish climate. The findings of this study highlight the importance of cluster analysis for identifying significant differences in climatic and environmental variables between high-risk and low-risk clusters of fasciolosis in Irish dairy herds.
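The Getis-Ord Gi* statistic mentioned in this abstract can be illustrated with a minimal sketch. The code below uses the commonly cited z-score form of Gi*; the six herd values and the binary "self plus adjacent neighbour" weights are hypothetical and are not taken from the study, which does not describe its weighting scheme here.

```python
import math

def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values  : list of n observations (e.g. a herd-level exposure measure)
    weights : n x n spatial weight matrix; weights[i][i] is non-zero because
              Gi* (unlike Gi) includes the focal location itself
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation (population form, as in the usual Gi* formula)
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical example: six herds along a line, three with high values
# followed by three with low values.
herd_values = [0.9, 0.8, 0.85, 0.1, 0.2, 0.15]
n_herds = len(herd_values)
# Assumed weighting: 1 for the herd itself and its immediate neighbours.
neighbour_weights = [
    [1.0 if abs(i - j) <= 1 else 0.0 for j in range(n_herds)]
    for i in range(n_herds)
]
gi_scores = getis_ord_gi_star(herd_values, neighbour_weights)
```

Large positive scores point to a local concentration of high values (a candidate high-risk cluster) and large negative scores to a concentration of low values; the significance threshold is then chosen from the normal distribution.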
Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S
2017-02-01
Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially-MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. 
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
Pérez-Rodrigo, Carmen; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M.; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier
2015-01-01
Weight gain has been associated with behaviors related to diet, sedentary lifestyle, and physical activity. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, and sleep time in Spanish children and adolescents and whether the identified clusters could be associated with overweight. Analysis was based on a subsample (n = 415) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, and sleep time. Logistic regression analysis was used to explore the association between the cluster solutions and overweight. Factor analysis identified four dietary patterns, one reflecting a profile closer to the traditional Mediterranean diet. Dietary patterns, physical activity behaviors, sedentary behaviors and sleep time on weekdays in Spanish children and adolescents clustered into two different groups: a low physical activity-poorer diet lifestyle pattern, which included a higher proportion of girls, and a high physical activity, low sedentary behavior, longer sleep duration, healthier diet lifestyle pattern. Although increased risk of being overweight was not significant, the Prevalence Ratios (PRs) for the low physical activity-poorer diet lifestyle pattern were >1 in children and in adolescents. The healthier lifestyle pattern included lower proportions of children and adolescents from low socioeconomic status backgrounds. PMID:26729155
Identification of uncommon objects in containers
Bremer, Peer-Timo; Kim, Hyojin; Thiagarajan, Jayaraman J.
2017-09-12
A system for identifying in an image an object that is commonly found in a collection of images and for identifying a portion of an image that represents an object based on a consensus analysis of segmentations of the image. The system collects images of containers that contain objects for generating a collection of common objects within the containers. To process the images, the system generates a segmentation of each image. The image analysis system may also generate multiple segmentations for each image by introducing variations in the selection of voxels to be merged into a segment. The system then generates clusters of the segments based on similarity among the segments. Each cluster represents a common object found in the containers. Once the clustering is complete, the system may be used to identify common objects in images of new containers based on similarity between segments of images and the clusters.
Segmentation and clustering as complementary sources of information
NASA Astrophysics Data System (ADS)
Dale, Michael B.; Allison, Lloyd; Dale, Patricia E. R.
2007-03-01
This paper examines the effects of using a segmentation method to identify change-points or edges in vegetation. It identifies coherence (spatial or temporal) in place of unconstrained clustering. The segmentation method involves change-point detection along a sequence of observations so that each cluster formed is composed of adjacent samples; this is a form of constrained clustering. The protocol identifies one or more models, one for each section identified, and the quality of each is assessed using a minimum message length criterion, which provides a rational basis for selecting an appropriate model. Although the segmentation is less efficient than clustering, it does provide other information because it incorporates textural similarity as well as homogeneity. In addition it can be useful in determining various scales of variation that may apply to the data, providing a general method of small-scale pattern analysis.
Stewart C. Sanderson; Jeffrey E. Ott; E. Durant McArthur; Kimball T. Harper
2006-01-01
This paper presents a new clustering program named RCLUS that was developed for species (R-mode) analysis of plant community data. RCLUS identifies clusters of co-occurring species that meet a user-specified cutoff level of positive association with each other. The "strict affinity" clustering algorithm in RCLUS builds clusters of species whose pairwise...
A spatial cluster analysis of tractor overturns in Kentucky from 1960 to 2002
Saman, D.M.; Cole, H.P.; Odoi, A.; Myers, M.L.; Carey, D.I.; Westneat, S.C.
2012-01-01
Background: Agricultural tractor overturns without rollover protective structures are the leading cause of farm fatalities in the United States. To our knowledge, no studies have incorporated the spatial scan statistic in identifying high-risk areas for tractor overturns. The aim of this study was to determine whether tractor overturns cluster in certain parts of Kentucky and identify factors associated with tractor overturns. Methods: A spatial statistical analysis using Kulldorff's spatial scan statistic was performed to identify county clusters at greatest risk for tractor overturns. A regression analysis was then performed to identify factors associated with tractor overturns. Results: The spatial analysis revealed a cluster of higher than expected tractor overturns in four counties in northern Kentucky (RR = 2.55) and 10 counties in eastern Kentucky (RR = 1.97). Higher rates of tractor overturns were associated with steeper average percent slope of pasture land by county (p = 0.0002) and a greater percent of total tractors with less than 40 horsepower by county (p<0.0001). Conclusions: This study reveals that geographic hotspots of tractor overturns exist in Kentucky and identifies factors associated with overturns. This study provides policymakers a guide to targeted county-level interventions (e.g., roll-over protective structures promotion interventions) with the intention of reducing tractor overturns in the highest risk counties in Kentucky. © 2012 Saman et al.
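As an illustration of how a spatial scan statistic ranks candidate areas, the sketch below computes the Poisson log-likelihood ratio and the relative risk for each candidate zone. It is a simplified example under stated assumptions: the county names, case counts and populations are hypothetical, each candidate zone is a single county rather than a circular window of neighbouring counties, and the Monte Carlo significance test that a full scan performs is omitted.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio of a zone under the Poisson scan model.

    Returns 0.0 unless the zone holds more cases than expected, because
    only high-rate clusters are of interest in this sketch.
    """
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / (total_cases - expected_in))
    return llr

def relative_risk(cases_in, expected_in, total_cases):
    """Risk inside the zone divided by the risk outside it."""
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside

# Hypothetical counties: name -> (observed overturns, population at risk)
counties = {"A": (30, 1000), "B": (10, 1000), "C": (12, 1000), "D": (8, 1000)}
total_cases = sum(c for c, _ in counties.values())
total_pop = sum(p for _, p in counties.values())

llr_by_county = {}
for name, (cases, pop) in counties.items():
    # Expected count under the null hypothesis of uniform risk
    expected = total_cases * pop / total_pop
    llr_by_county[name] = poisson_llr(cases, expected, total_cases)

most_likely_cluster = max(llr_by_county, key=llr_by_county.get)
```

The zone with the largest ratio is the most likely cluster; in this toy data that is county "A", whose relative risk works out to 3.0.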
Cluster analysis and prediction of treatment outcomes for chronic rhinosinusitis.
Soler, Zachary M; Hyer, J Madison; Rudmik, Luke; Ramakrishnan, Viswanathan; Smith, Timothy L; Schlosser, Rodney J
2016-04-01
Current clinical classifications of chronic rhinosinusitis (CRS) have weak prognostic utility regarding treatment outcomes. Simplified discriminant analysis based on unsupervised clustering has identified novel phenotypic subgroups of CRS, but prognostic utility is unknown. We sought to determine whether discriminant analysis allows prognostication in patients choosing surgery versus continued medical management. A multi-institutional prospective study of patients with CRS in whom initial medical therapy failed who then self-selected continued medical management or surgical treatment was used to separate patients into 5 clusters based on a previously described discriminant analysis using total Sino-Nasal Outcome Test-22 (SNOT-22) score, age, and missed productivity. Patients completed the SNOT-22 at baseline and for 18 months of follow-up. Baseline demographic and objective measures included olfactory testing, computed tomography, and endoscopy scoring. SNOT-22 outcomes for surgical versus continued medical treatment were compared across clusters. Data were available on 690 patients. Baseline differences in demographics, comorbidities, objective disease measures, and patient-reported outcomes were similar to previous clustering reports. Three of 5 clusters identified by means of discriminant analysis had improved SNOT-22 outcomes with surgical intervention when compared with continued medical management (surgery was a mean of 21.2 points better across these 3 clusters at 6 months, P < .05). These differences were sustained at 18 months of follow-up. Two of 5 clusters had similar outcomes when comparing surgery with continued medical management. A simplified discriminant analysis based on 3 common clinical variables is able to cluster patients and provide prognostic information regarding surgical treatment versus continued medical management in patients with CRS. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. 
All rights reserved.
Analysis of Tropical Cyclone Tracks in the North Indian Ocean
NASA Astrophysics Data System (ADS)
Patwardhan, A.; Paliwal, M.; Mohapatra, M.
2011-12-01
Cyclones are regarded as one of the most dangerous meteorological phenomena of the tropical region. The probability of landfall of a tropical cyclone depends on its movement (trajectory). Analysis of trajectories of tropical cyclones could be useful for identifying potentially predictable characteristics. There is a long history of analysis of tropical cyclone tracks. A common approach is to use different clustering techniques to group the cyclone tracks on the basis of certain characteristics. Various clustering methods have been used to study tropical cyclones in different ocean basins, such as the western North Pacific Ocean (Elsner and Liu, 2003; Camargo et al., 2007) and the North Atlantic Ocean (Elsner, 2003; Gaffney et al. 2007; Nakamura et al., 2009). In this study, tropical cyclone tracks in the North Indian Ocean basin for the period 1961-2010 have been analyzed and grouped into clusters based on their spatial characteristics. A tropical cyclone trajectory is approximated as an open curve and described by its first two moments. The resulting clusters have different centroid locations and also differently shaped variance ellipses. These track characteristics are then used in standard clustering algorithms, which allows the whole track shape, length, and location to be incorporated into the clustering methodology. The resulting clusters have different genesis locations and trajectory shapes. We have also examined characteristics such as life span, maximum sustained wind speed, landfall, and seasonality, many of which are significantly different across the identified clusters. The clustering approach groups cyclones with higher maximum wind speed and the longest life span into one cluster. Another cluster includes short duration cyclonic events that are mostly deep depressions and significant for rainfall over Eastern and Central India. The clustering approach is likely to prove useful for analysis of events of significance with regard to impacts.
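The idea of describing a track by its first two moments can be sketched as follows. This is a minimal illustration, assuming a track is given as a list of (longitude, latitude) fixes with equal weight; the function name and the sample track are hypothetical, and the abstract does not specify how the authors weighted or normalised the points.

```python
def track_moments(track):
    """Summarise a track by its centroid and its variance/covariance terms.

    track : list of (lon, lat) positions
    Returns (mean_lon, mean_lat, var_lon, var_lat, cov_lon_lat); the last
    three values define the variance ellipse of the track.
    """
    n = len(track)
    mean_lon = sum(p[0] for p in track) / n
    mean_lat = sum(p[1] for p in track) / n
    var_lon = sum((p[0] - mean_lon) ** 2 for p in track) / n
    var_lat = sum((p[1] - mean_lat) ** 2 for p in track) / n
    cov = sum((p[0] - mean_lon) * (p[1] - mean_lat) for p in track) / n
    return (mean_lon, mean_lat, var_lon, var_lat, cov)

# Hypothetical track moving north-westward across the Bay of Bengal
example_track = [(88.0, 12.0), (86.0, 14.0), (84.0, 16.0)]
features = track_moments(example_track)
```

Each track is reduced to a five-number feature vector of fixed length, regardless of how many fixes it contains, so tracks of different durations can be passed to a standard clustering algorithm such as k-means.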
Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T
2010-01-01
Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
Sensory Clusters of Toddlers with Autism Spectrum Disorders: Differences in Affective Symptoms
ERIC Educational Resources Information Center
Ben-Sasson, A.; Cermak, S. A.; Orsmond, G. I.; Tager-Flusberg, H.; Kadlec, M. B.; Carter, A. S.
2008-01-01
Background: Individuals with autism spectrum disorders (ASDs) show variability in their sensory behaviors. In this study we identified clusters of toddlers with ASDs who shared sensory profiles and examined differences in affective symptoms across these clusters. Method: Using cluster analysis 170 toddlers with ASDs were grouped based on parent…
Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki
2017-09-23
This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS scores for 128 patients were used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significantly higher baseline pain intensity, a higher degree of angular instability, and a higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.
Fens, Niki; van Rossum, Annelot G J; Zanen, Pieter; van Ginneken, Bram; van Klaveren, Rob J; Zwinderman, Aeilko H; Sterk, Peter J
2013-06-01
Classification of COPD is currently based on the presence and severity of airways obstruction. However, this may not fully reflect the phenotypic heterogeneity of COPD in the (ex-) smoking community. We hypothesized that factor analysis followed by cluster analysis of functional, clinical, radiological and exhaled breath metabolomic features identifies subphenotypes of COPD in a community-based population of heavy (ex-) smokers. Adults between 50-75 years with a smoking history of at least 15 pack-years derived from a random population-based survey as part of the NELSON study underwent detailed assessment of pulmonary function, chest CT scanning, questionnaires and exhaled breath molecular profiling using an electronic nose. Factor and cluster analyses were performed on the subgroup of subjects fulfilling the GOLD criteria for COPD (post-BD FEV1/FVC < 0.70). Three hundred subjects were recruited, of which 157 fulfilled the criteria for COPD and were included in the factor and cluster analysis. Four clusters were identified: cluster 1 (n = 35; 22%): mild COPD, limited symptoms and good quality of life. Cluster 2 (n = 48; 31%): low lung function, combined emphysema and chronic bronchitis and a distinct breath molecular profile. Cluster 3 (n = 60; 38%): emphysema predominant COPD with preserved lung function. Cluster 4 (n = 14; 9%): highly symptomatic COPD with mildly impaired lung function. In a leave-one-out validation analysis an accuracy of 97.4% was reached. This unbiased taxonomy for mild to moderate COPD reinforces clusters found in previous studies and thereby allows better phenotyping of COPD in the general (ex-) smoking population.
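The leave-one-out validation mentioned in this abstract can be sketched in a few lines. The abstract does not say which classifier was used to reassign subjects to clusters, so the nearest-centroid rule below is an assumption chosen for brevity, and the two-feature data set is hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the cluster whose training centroid is closest."""
    sums, counts = {}, {}
    for feats, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, value in enumerate(feats):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [a / counts[label] for a in acc]
        dist = sum((c - v) ** 2 for c, v in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(xs, ys):
    """Hold out each subject once, refit on the rest, and score the hits."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Hypothetical subjects: two features each, with their cluster labels
subject_features = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
                    [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
cluster_labels = [1, 1, 1, 2, 2, 2]
loo_accuracy = leave_one_out_accuracy(subject_features, cluster_labels)
```

An accuracy close to 1 indicates that cluster membership is stable when any single subject is removed, which is the sense in which a figure such as 97.4% supports the robustness of a cluster solution.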
Machine-learned cluster identification in high-dimensional data.
Ultsch, Alfred; Lötsch, Jörn
2017-02-01
High-dimensional biomedical data are frequently clustered to identify subgroup structures pointing at distinct disease subtypes. It is crucial that the used cluster algorithm works correctly. However, by imposing a predefined shape on the clusters, classical algorithms occasionally suggest a cluster structure in homogenously distributed data or assign data points to incorrect clusters. We analyzed whether this can be avoided by using emergent self-organizing feature maps (ESOM). Data sets with different degrees of complexity were submitted to ESOM analysis with large numbers of neurons, using an interactive R-based bioinformatics tool. On top of the trained ESOM the distance structure in the high dimensional feature space was visualized in the form of a so-called U-matrix. Clustering results were compared with those provided by classical common cluster algorithms including single linkage, Ward and k-means. Ward clustering imposed cluster structures on cluster-less "golf ball", "cuboid" and "S-shaped" data sets that contained no structure at all (random data). Ward clustering also imposed structures on permuted real world data sets. By contrast, the ESOM/U-matrix approach correctly found that these data contain no cluster structure. However, ESOM/U-matrix was correct in identifying clusters in biomedical data truly containing subgroups. It was always correct in cluster structure identification in further canonical artificial data. Using intentionally simple data sets, it is shown that popular clustering algorithms typically used for biomedical data sets may fail to cluster data correctly, suggesting that they are also likely to perform erroneously on high dimensional biomedical data. The present analyses emphasized that generally established classical hierarchical clustering algorithms carry a considerable tendency to produce erroneous results. 
By contrast, unsupervised machine-learned analysis of cluster structures, applied using the ESOM/U-matrix method, is a viable, unbiased method to identify true clusters in the high-dimensional space of complex data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Savage, J. C.; Simpson, R. W.
2013-09-01
The deformation across the Sierra Nevada Block, the Walker Lane Belt, and the Central Nevada Seismic Belt (CNSB) between 38.5°N and 40.5°N has been analyzed by clustering GPS velocities to identify coherent blocks. Cluster analysis determines the number of clusters required and assigns the GPS stations to the proper clusters. The clusters are shown on a fault map by symbols located at the positions of the GPS stations, each symbol representing the cluster to which the velocity of that GPS station belongs. Fault systems that separate the clusters are readily identified on such a map. Four significant clusters are identified. Those clusters are strips separated by (from west to east) the Mohawk Valley-Genoa fault system, the Pyramid Lake-Wassuk fault system, and the Central Nevada Seismic Belt. The strain rates within the westernmost three clusters approximate simple right-lateral shear (~13 nstrain/a) across vertical planes roughly parallel to the cluster boundaries. Clustering does not recognize the longitudinal segmentation of the Walker Lane Belt into domains dominated by either northwesterly trending, right-lateral faults or northeasterly trending, left-lateral faults.
Structure-sequence based analysis for identification of conserved regions in proteins
Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth
2013-05-28
Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.
A Typology of Burnout in Professional Counselors
ERIC Educational Resources Information Center
Lee, Sang Min; Cho, Seong Ho; Kissinger, Daniel; Ogle, Nick T.
2010-01-01
The authors used a cluster analysis procedure and the Counselor Burnout Inventory (S. M. Lee et al., 2007) to identify professional counselors' burnout types. Three clusters were identified: well-adjusted, persevering, and disconnected counselors. The results also indicated that counselors' job satisfaction and self-esteem were good discriminators…
MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence
Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta
2014-01-01
Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; the clinical and validity scales of the MMPI-2 were used as predictors in a hierarchical cluster analysis. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499
Ortholog-based screening and identification of genes related to intracellular survival.
Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin
2018-04-20
Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.
Sherman, Recinda L; Henry, Kevin A; Tannenbaum, Stacey L; Feaster, Daniel J; Kobetz, Erin; Lee, David J
2014-03-20
Epidemiologists are gradually incorporating spatial analysis into health-related research as geocoded cases of disease become widely available and health-focused geospatial computer applications are developed. One health-focused application of spatial analysis is cluster detection. Using cluster detection to identify geographic areas with high-risk populations and then screening those populations for disease can improve cancer control. SaTScan is a free cluster-detection software application used by epidemiologists around the world to describe spatial clusters of infectious and chronic disease, as well as disease vectors and risk factors. The objectives of this article are to describe how spatial analysis can be used in cancer control to detect geographic areas in need of colorectal cancer screening intervention, identify issues commonly encountered by SaTScan users, detail how to select the appropriate methods for using SaTScan, and explain how method selection can affect results. As an example, we used various methods to detect areas in Florida where the population is at high risk for late-stage diagnosis of colorectal cancer. We found that much of our analysis was underpowered and that no single method detected all clusters of statistical or public health significance. However, all methods detected 1 area as high risk; this area is potentially a priority area for a screening intervention. Cluster detection can be incorporated into routine public health operations, but the challenge is to identify areas in which the burden of disease can be alleviated through public health intervention. Reliance on SaTScan's default settings does not always produce pertinent results.
Geographic atrophy phenotype identification by cluster analysis.
Monés, Jordi; Biarnés, Marc
2018-03-01
To identify ocular phenotypes in patients with geographic atrophy secondary to age-related macular degeneration (GA) using a data-driven cluster analysis. This was a retrospective analysis of data from a prospective, natural history study of patients with GA who were followed for ≥6 months. Cluster analysis was used to identify subgroups within the population based on the presence of several phenotypic features: soft drusen, reticular pseudodrusen (RPD), primary foveal atrophy, increased fundus autofluorescence (FAF), greyish FAF appearance and subfoveal choroidal thickness (SFCT). A comparison of features between the subgroups was conducted, and a qualitative description of the new phenotypes was proposed. The atrophy growth rate between phenotypes was then compared. Data were analysed from 77 eyes of 77 patients with GA. Cluster analysis identified three groups: phenotype 1 was characterised by high soft drusen load, foveal atrophy and slow growth; phenotype 3 showed high RPD load, extrafoveal and greyish FAF appearance and thin SFCT; the characteristics of phenotype 2 were midway between phenotypes 1 and 3. Phenotypes differed in all measured features (p≤0.013), with decreases in the presence of soft drusen, foveal atrophy and SFCT seen from phenotypes 1 to 3 and corresponding increases in high RPD load, high FAF and greyish FAF appearance. Atrophy growth rate differed between phenotypes 1, 2 and 3 (0.63, 1.91 and 1.73 mm²/year, respectively, p=0.0005). Cluster analysis identified three distinct phenotypes in GA. One of them showed a particularly slow growth pattern. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Clustering approaches to identifying gene expression patterns from DNA microarray data.
Do, Jin Hwan; Choi, Dong-Kug
2008-04-30
The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
Atomistic cluster alignment method for local order mining in liquids and glasses
NASA Astrophysics Data System (ADS)
Fang, X. W.; Wang, C. Z.; Yao, Y. X.; Ding, Z. J.; Ho, K. M.
2010-11-01
An atomistic cluster alignment method is developed to identify and characterize the local atomic structural order in liquids and glasses. With the “order mining” idea for structurally disordered systems, the method can detect the presence of any type of local order in the system and can quantify the structural similarity between a given set of templates and the aligned clusters in a systematic and unbiased manner. Moreover, population analysis can also be carried out for various types of clusters in the system. The advantages of the method in comparison with other previously developed analysis methods are illustrated by performing the structural analysis for four prototype systems (i.e., pure Al, pure Zr, Zr35Cu65, and Zr36Ni64). The results show that the cluster alignment method can identify various types of short-range orders (SROs) in these systems correctly while some of these SROs are difficult to capture by most of the currently available analysis methods (e.g., Voronoi tessellation method). Such a full three-dimensional atomistic analysis method is generic and can be applied to describe the magnitude and nature of noncrystalline ordering in many disordered systems.
Martínez-García, Carlos Galdino; Ugoretz, Sarah Janes; Arriaga-Jordán, Carlos Manuel; Wattiaux, Michel André
2015-02-01
This study explored whether technology adoption and changes in management practices were associated with farm structure, household, and farmer characteristics, and sought to identify processes that may foster productivity and sustainability of small-scale dairy farming in the central highlands of Mexico. Factor analysis of survey data from 44 smallholders identified three factors (related to farm size, farmer's engagement, and household structure) that explained 70% of cumulative variance. The subsequent hierarchical cluster analysis yielded three clusters. Cluster 1 included the most senior farmers with fewest years of education but greatest years of experience. Cluster 2 included farmers who reported access to extension, cooperative services, and more management changes. Cluster 2 obtained 25% and 35% more milk than farmers in clusters 1 and 3, respectively. Cluster 3 included the youngest farmers, with most years of education and greatest availability of family labor. Access to a network and membership in a community of peers appeared as important contributors to success. Smallholders gravitated towards easy-to-implement technologies that have immediate benefits. Nonusers of high-investment technologies found them unaffordable because of cost, insufficient farm size, and lack of knowledge or reliable electricity. Multivariate analysis may be a useful tool in planning extension activities and organizing channels of communication to effectively target farmers with varying needs, constraints, and motivations for change, and in identifying farmers who may exemplify models of change for others who manage farms that are structurally similar but performing at a lower level.
Proposed shade guide for human facial skin and lip: a pilot study.
Wee, Alvin G; Beatty, Mark W; Gozalo-Diaz, David J; Kim-Pusateri, Seungyee; Marx, David B
2013-08-01
Currently, no commercially available facial shade guide exists in the United States for the fabrication of facial prostheses. The purpose of this study was to measure facial skin and lip color in a human population sample stratified by age, gender, and race. Clustering analysis was used to determine optimal color coordinates for a proposed facial shade guide. Participants (n=119) were recruited from 4 racial/ethnic groups, 5 age groups, and both genders. Reflectance measurements of participants' noses and lower lips were made by using a spectroradiometer and xenon arc lamp with a 45/0 optical configuration. Repeated measures ANOVA (α=.05) was used to identify skin and lip color differences resulting from race, age, gender, and location, and a hierarchical clustering analysis was used to identify clusters of skin colors. Significant contributors to L*a*b* facial color were race and facial location (P<.01). b* affected all factors (P<.05). Age affected only b* (P<.001), while gender affected only L* (P<.05) and b* (P<.05). Analyses identified 5 clusters of skin color. The study showed that skin color caused by age and gender primarily occurred within the yellow-blue axis. A significant lightness difference between gender groups was also found. Clustering analysis identified 5 distinct skin shade tabs. Copyright © 2013 The Editorial Council of the Journal of Prosthetic Dentistry. Published by Mosby, Inc. All rights reserved.
A Cluster Analytic Study of Clinical Orientations among Chemical Dependency Counselors.
ERIC Educational Resources Information Center
Thombs, Dennis L.; Osborn, Cynthia J.
2001-01-01
Three distinct clinical orientations were identified in a sample of chemical dependency counselors (N=406). Based on cluster analysis, the largest group, identified and labeled as "uniform counselors," endorsed a simple, moral-disease model with little interest in psychosocial interventions. (Contains 50 references and 4 tables.) (GCP)
Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q
2015-07-01
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J
2018-05-01
The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following: eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
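As a rough illustration of single-linkage clustering at fixed SNP thresholds, the sketch below groups isolates so that every member of a cluster lies within the threshold of at least one other member. It is not the pipeline used in the study: the function name `single_linkage_clusters`, the isolate labels and the distance values are all hypothetical, chosen only to show how the 10, 5 and 0 SNP levels nest.

```python
from itertools import combinations


def single_linkage_clusters(distances, threshold):
    """Group isolates so each is within `threshold` SNPs of at least one
    other member of its cluster (single linkage).

    `distances` maps frozenset({a, b}) -> pairwise SNP distance.
    """
    isolates = sorted({name for pair in distances for name in pair})
    parent = {name: name for name in isolates}

    def find(x):
        # Follow parent pointers to the cluster representative,
        # shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if distances[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in isolates:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical pairwise SNP distances between four isolates (not study data).
snp = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 7,
    frozenset(("A", "D")): 40,
    frozenset(("B", "C")): 7,
    frozenset(("B", "D")): 41,
    frozenset(("C", "D")): 38,
}

for t in (10, 5, 0):
    print(t, single_linkage_clusters(snp, t))
```

Tightening the threshold can only split clusters, never merge them, which is why the three levels form a nested hierarchy that can be checked against exposure data at each step.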
Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael
2018-04-27
Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.
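The effect of the linkage choice can be seen on a toy example. The sketch below is a naive agglomerative procedure written for illustration only (Ward's linkage and partitioning around medoids are not implemented); the function name `agglomerate` and the `sites` coordinates are hypothetical and unrelated to the refuge data.

```python
import math


def agglomerate(points, n_clusters, linkage="average"):
    """Naive agglomerative clustering.

    Returns clusters as sorted lists of point indices.
    """
    def cluster_distance(c1, c2):
        pairwise = [math.dist(points[i], points[j]) for i in c1 for j in c2]
        if linkage == "single":
            return min(pairwise)
        if linkage == "complete":
            return max(pairwise)
        if linkage == "average":
            return sum(pairwise) / len(pairwise)
        raise ValueError(f"unknown linkage: {linkage}")

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the two closest clusters under the chosen linkage and merge them.
        a, b = min(
            ((x, y) for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda xy: cluster_distance(clusters[xy[0]], clusters[xy[1]]),
        )
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(clusters)


# Hypothetical sites along a transect: a chain of five points plus one outlier.
sites = [(0.0, 0.0), (1.0, 0.0), (2.1, 0.0), (3.3, 0.0), (4.6, 0.0), (7.0, 0.0)]

for method in ("single", "complete", "average"):
    print(method, agglomerate(sites, 2, method))
```

On this toy input, single and average linkage isolate the outlying site as its own cluster, while complete linkage produces two more evenly sized groups. Which behaviour counts as "best" depends on the criterion applied, which the abstract does not specify.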
Performance analysis of clustering techniques over microarray data: A case study
NASA Astrophysics Data System (ADS)
Dash, Rasmita; Misra, Bijan Bihari
2018-03-01
Handling big data is one of the major issues in the field of statistical data analysis. In such investigations, cluster analysis plays a vital role in dealing with large-scale data. There are many clustering techniques, each with a different approach to cluster analysis, but which approach suits a particular dataset is difficult to predict. To deal with this problem, a grading approach is introduced over many clustering techniques to identify a stable technique. However, the grading approach depends on the characteristics of the dataset as well as on the validity indices, so a two-stage grading approach is implemented. In this study the grading approach is implemented over five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test.
Supervised group Lasso with applications to microarray data analysis
Ma, Shuangge; Song, Xiao; Huang, Jian
2007-01-01
Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
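The cluster-selection step can be illustrated with the block soft-thresholding operation that underlies group-Lasso penalties in general: each cluster's coefficient vector is shrunk as a unit, so a weak cluster is dropped entirely while a strong one is kept and scaled. This is a minimal sketch of that generic operation, not the authors' algorithm; the function name `group_soft_threshold`, the coefficient values and the penalty `lam` are hypothetical, and any per-cluster weighting the paper may use is omitted.

```python
import math


def group_soft_threshold(coefs, groups, lam):
    """Shrink each group's coefficient vector toward zero as a block.

    coefs  : list of floats, one per gene
    groups : list of cluster labels, one per gene
    lam    : penalty strength (hypothetical tuning value)
    """
    shrunk = list(coefs)
    for g in set(groups):
        idx = [i for i, label in enumerate(groups) if label == g]
        norm = math.sqrt(sum(coefs[i] ** 2 for i in idx))
        # Groups whose norm is at most lam are zeroed entirely;
        # the rest are scaled down by a common factor.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in idx:
            shrunk[i] = scale * coefs[i]
    return shrunk


# Hypothetical coefficients for four genes in two clusters (not study data).
coefs = [3.0, 4.0, 0.3, -0.4]
groups = [0, 0, 1, 1]
print(group_soft_threshold(coefs, groups, lam=1.0))
```

Here cluster 0 (norm 5) survives with its coefficients scaled by roughly 0.8, while cluster 1 (norm 0.5) is removed as a whole, which is the sense in which the second step selects clusters rather than individual genes.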
Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma.
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D; Ober, Carole; Nicolae, Dan L; Barnes, Kathleen C; London, Stephanie J; Gilliland, Frank; Weiss, Scott T; Raby, Benjamin A; Cohn, Lauren; Chupp, Geoffrey L
2015-05-15
The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10⁻⁶) and hospitalization (P = 0.01), respectively. There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.
Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren
2015-01-01
Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10⁻⁶) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605
McGuire, Joseph F; Nyirabahizi, Epiphanie; Kircanski, Katharina; Piacentini, John; Peterson, Alan L; Woods, Douglas W; Wilhelm, Sabine; Walkup, John T; Scahill, Lawrence
2013-12-30
Cluster analytic methods have examined the symptom presentation of chronic tic disorders (CTDs), with limited agreement across studies. The present study investigated patterns, clinical correlates, and treatment outcome of tic symptoms. A total of 239 youth and adults with CTDs completed a battery of assessments at baseline to determine diagnoses, tic severity, and clinical characteristics. Participants were randomly assigned to receive either a comprehensive behavioral intervention for tics (CBIT) or psychoeducation and supportive therapy (PST). A cluster analysis was conducted on the baseline Yale Global Tic Severity Scale (YGTSS) symptom checklist to identify the constellations of tic symptoms. Four tic clusters were identified: Impulse Control and Complex Phonic Tics; Complex Motor Tics; Simple Head Motor/Vocal Tics; and Primarily Simple Motor Tics. Frequencies of tic symptoms showed few differences across youth and adults. Tic clusters had small associations with clinical characteristics and showed no associations with the presence of coexisting psychiatric conditions. Cluster membership scores did not predict treatment response to CBIT or tic severity reductions. Tic symptoms distinctly cluster with little difference across youth and adults, or coexisting conditions. This study, which is the first to examine tic clusters and response to treatment, suggested that tic symptom profiles respond equally well to CBIT. ClinicalTrials.gov identifiers: NCT00218777; NCT00231985. © 2013 Elsevier Ireland Ltd. All rights reserved.
Toward An Understanding of Cluster Evolution: A Deep X-Ray Selected Cluster Catalog from ROSAT
NASA Technical Reports Server (NTRS)
Jones, Christine; Oliversen, Ronald (Technical Monitor)
2002-01-01
In the past year, we have focussed on studying individual clusters found in this sample with Chandra, as well as using Chandra to measure the luminosity-temperature relation for a sample of distant clusters identified through the ROSAT study, and finally we are continuing our study of fossil groups. For the luminosity-temperature study, we compared a sample of nearby clusters with a sample of distant clusters and, for the first time, measured a significant change in the relation as a function of redshift (Vikhlinin et al. in final preparation for submission to ApJ). We also used our ROSAT analysis to select and propose for Chandra observations of individual clusters. We are now analyzing the Chandra observations of the distant cluster A520, which appears to have undergone a recent merger. Finally, we have completed the analysis of the fossil groups identified in ROSAT observations. In the past few months, we have derived X-ray fluxes and luminosities as well as X-ray extents for an initial sample of 89 objects. Based on the X-ray extents and the lack of bright galaxies, we have identified 16 fossil groups. We are comparing their X-ray and optical properties with those of optically rich groups. A paper is being readied for submission (Jones, Forman, and Vikhlinin in preparation).
Rayward, Anna T; Duncan, Mitch J; Brown, Wendy J; Plotnikoff, Ronald C; Burton, Nicola W
2017-08-01
This study aimed to identify how different patterns of physical activity, sleep duration and sleep quality cluster together, and to examine how the identified clusters differ in terms of socio-demographic and health characteristics. Participants were adults from Brisbane, Australia, aged 42-72 years who reported their physical activity, sleep duration, sleep quality, socio-demographic and health characteristics in 2011 (n=5854). Two-step Cluster Analyses were used to identify clusters. Cluster differences in socio-demographic and health characteristics were examined using chi square tests (p<0.05). Four clusters were identified: 'Poor Sleepers' (31.2%), 'Moderate Sleepers' (30.7%), 'Mixed Sleepers/Highly Active' (20.5%), and 'Excellent Sleepers/Mixed Activity' (17.6%). The 'Poor Sleepers' cluster had the highest proportion of participants with less-than-recommended sleep duration and poor sleep quality, had the poorest health characteristics and a high proportion of participants with low physical activity. Physical activity, sleep duration and sleep quality cluster together in distinct patterns and clusters of poor behaviours are associated with poor health status. Multiple health behaviour change interventions which target both physical activity and sleep should be prioritised to improve health outcomes in mid-aged adults. Copyright © 2017 Elsevier B.V. All rights reserved.
Novel approach to classifying patients with pulmonary arterial hypertension using cluster analysis.
Parikh, Kishan S; Rao, Youlan; Ahmad, Tariq; Shen, Kai; Felker, G Michael; Rajagopal, Sudarshan
2017-01-01
Pulmonary arterial hypertension (PAH) patients have distinct disease courses and responses to treatment, but current diagnostic and treatment schemes provide limited insight. We aimed to see if cluster analysis could distinguish clinical phenotypes in PAH. An unbiased cluster analysis was performed on 17 baseline clinical variables of PAH patients from the FREEDOM-M, FREEDOM-C, and FREEDOM-C2 randomized trials of oral treprostinil versus placebo. Participants were either treatment-naïve (FREEDOM-M) or on background therapy (FREEDOM-C, FREEDOM-C2). We tested for association of clusters with outcomes and interaction with respect to treatment. Primary outcome was 6-minute walking distance (6MWD) change. We included 966 participants with 12-week (FREEDOM-M) or 16-week (FREEDOM-C and FREEDOM-C2) follow-up. Four patient clusters were identified. Compared with Clusters 1 (n = 131) and 2 (n = 496), Clusters 3 (n = 246) and 4 (n = 93) patients were older, heavier, had worse baseline functional class, 6MWD, Borg Dyspnea Index, and fewer years since PAH diagnosis. Clusters also differed by PAH etiology and background therapies, but not gender or race. Mean treatment effect of oral treprostinil differed across Clusters 1-4, increasing in a monotonic fashion (Cluster 1: 10.9 m; Cluster 2: 13.0 m; Cluster 3: 25.0 m; Cluster 4: 50.9 m; interaction P value = 0.048). We identified four distinct clusters of PAH patients based on common patient characteristics. Patients who were older, diagnosed with PAH for a shorter period, and had worse baseline symptoms and exercise capacity had the greatest response to oral treprostinil treatment.
Nursing home care quality: a cluster analysis.
Grøndahl, Vigdis Abrahamsen; Fagerli, Liv Berit
2017-02-13
Purpose The purpose of this paper is to explore potential differences in how nursing home residents rate care quality and to explore cluster characteristics. Design/methodology/approach A cross-sectional design was used, with one questionnaire including questions from quality from patients' perspective and Big Five personality traits, together with questions related to socio-demographic aspects and health condition. Residents (n=103) from four Norwegian nursing homes participated (74.1 per cent response rate). Hierarchical cluster analysis identified clusters with respect to care quality perceptions. χ² tests and one-way between-groups ANOVA were performed to characterise the clusters (p<0.05). Findings Two clusters were identified; Cluster 1 residents (28.2 per cent) had the best care quality perceptions and Cluster 2 (67.0 per cent) had the worst perceptions. The clusters were statistically significant and characterised by personal-related conditions: gender, psychological well-being, preferences, admission, satisfaction with staying in the nursing home, emotional stability and agreeableness, and by external objective care conditions: healthcare personnel and registered nurses. Research limitations/implications Residents assessed as having no cognitive impairments were included, thus excluding the largest group. By choosing questionnaire design and structured interviews, the number able to participate may increase. Practical implications Findings may provide healthcare personnel and managers with increased knowledge on which to develop strategies to improve specific care quality perceptions. Originality/value Cluster analysis can be an effective tool for differentiating between nursing home residents' care quality perceptions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ang; Yu, Heng; Tozzi, Paolo
2016-04-10
We search for bulk motions in the intracluster medium (ICM) of massive clusters showing evidence of an ongoing or recent major merger with spatially resolved spectroscopy in Chandra CCD data. We identify a sample of six merging clusters with >150 ks Chandra exposure in the redshift range 0.1 < z < 0.3. By performing X-ray spectral analysis of projected ICM regions selected according to their surface brightness, we obtain the projected redshift maps for all of these clusters. After performing a robust analysis of the statistical and systematic uncertainties in the measured X-ray redshift z_X, we check whether or not the global z_X distribution differs from that expected when the ICM is at rest. We find evidence of significant bulk motions at more than 3σ in A2142 and A115, and less than 2σ in A2034 and A520. Focusing on single regions, we identify significant localized velocity differences in all of the merger clusters. We also perform the same analysis on two relaxed clusters with no signatures of recent mergers, finding no signs of bulk motions, as expected. Our results indicate that deep Chandra CCD data enable us to identify the presence of bulk motions at the level of v_BM > 1000 km s⁻¹ in the ICM of massive merging clusters at 0.1 < z < 0.3. Although the CCD spectral resolution is not sufficient for a detailed analysis of the ICM dynamics, Chandra CCD data constitute a key diagnostic tool complementing X-ray bolometers on board future X-ray missions.
Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study.
Meghani, Salimah H; Knafl, George J
2017-02-10
To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of cluster membership. This was a 3-month prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out of pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. The analyses found 4 unique clusters: Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For two smaller subsets, the main underlying concern was the type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%), or the type of analgesic side effects (cluster 4, 21%). About one in four made trade-offs based on multiple concerns simultaneously including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis, to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only one belief, i.e., that pain medications can mask changes in health or keep you from knowing what is going on in your body, was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)]. 
Most patients appear to be driven by a single salient concern in using analgesia for cancer pain. Addressing these concerns, perhaps through real time clinical assessments, may improve patients' analgesic adherence patterns and cancer pain outcomes.
Sun Protection Belief Clusters: Analysis of Amazon Mechanical Turk Data.
Santiago-Rivas, Marimer; Schnur, Julie B; Jandorf, Lina
2016-12-01
This study aimed (i) to determine whether people could be differentiated on the basis of their sun protection belief profiles and individual characteristics and (ii) explore the use of a crowdsourcing web service for the assessment of sun protection beliefs. A sample of 500 adults completed an online survey of sun protection belief items using Amazon Mechanical Turk. A two-phased cluster analysis (i.e., hierarchical and non-hierarchical K-means) was utilized to determine clusters of sun protection barriers and facilitators. Results yielded three distinct clusters of sun protection barriers and three distinct clusters of sun protection facilitators. Significant associations between gender, age, sun sensitivity, and cluster membership were identified. Results also showed an association between barrier and facilitator cluster membership. The results of this study provided a potential alternative approach to developing future sun protection promotion initiatives in the population. Findings add to our knowledge regarding individuals who support, oppose, or are ambivalent toward sun protection and inform intervention research by identifying distinct subtypes that may best benefit from (or have a higher need for) skin cancer prevention efforts.
Ogden, Lorraine G; Stroebele, Nanette; Wyatt, Holly R; Catenacci, Victoria A; Peters, John C; Stuht, Jennifer; Wing, Rena R; Hill, James O
2012-10-01
The National Weight Control Registry (NWCR) is the largest ongoing study of individuals successful at maintaining weight loss; the registry enrolls individuals maintaining a weight loss of at least 13.6 kg (30 lb) for a minimum of 1 year. The current report uses multivariate latent class cluster analysis to identify unique clusters of individuals within the NWCR that have distinct experiences, strategies, and attitudes with respect to weight loss and weight loss maintenance. The cluster analysis considers weight and health history, weight control behaviors and strategies, effort and satisfaction with maintaining weight, and psychological and demographic characteristics. The analysis includes 2,228 participants enrolled between 1998 and 2002. Cluster 1 (50.5%) represents a weight-stable, healthy, exercise conscious group who are very satisfied with their current weight. Cluster 2 (26.9%) has continuously struggled with weight since childhood; they rely on the greatest number of resources and strategies to lose and maintain weight, and report higher levels of stress and depression. Cluster 3 (12.7%) represents a group successful at weight reduction on the first attempt; they were least likely to be overweight as children, are maintaining the longest duration of weight loss, and report the least difficulty maintaining weight. Cluster 4 (9.9%) represents a group less likely to use exercise to control weight; they tend to be older, eat fewer meals, and report more health problems. Further exploration of the unique characteristics of these clusters could be useful for tailoring future weight loss and weight maintenance programs to the specific characteristics of an individual.
Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong
2013-07-01
A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer.
Robertson, A Gordon; Kim, Jaegil; Al-Ahmadie, Hikmat; Bellmunt, Joaquim; Guo, Guangwu; Cherniack, Andrew D; Hinoue, Toshinori; Laird, Peter W; Hoadley, Katherine A; Akbani, Rehan; Castro, Mauro A A; Gibb, Ewan A; Kanchi, Rupa S; Gordenin, Dmitry A; Shukla, Sachet A; Sanchez-Vega, Francisco; Hansel, Donna E; Czerniak, Bogdan A; Reuter, Victor E; Su, Xiaoping; de Sa Carvalho, Benilton; Chagas, Vinicius S; Mungall, Karen L; Sadeghi, Sara; Pedamallu, Chandra Sekhar; Lu, Yiling; Klimczak, Leszek J; Zhang, Jiexin; Choo, Caleb; Ojesina, Akinyemi I; Bullman, Susan; Leraas, Kristen M; Lichtenberg, Tara M; Wu, Catherine J; Schultz, Nicholaus; Getz, Gad; Meyerson, Matthew; Mills, Gordon B; McConkey, David J; Weinstein, John N; Kwiatkowski, David J; Lerner, Seth P
2017-10-19
We report a comprehensive analysis of 412 muscle-invasive bladder cancers characterized by multiple TCGA analytical platforms. Fifty-eight genes were significantly mutated, and the overall mutational load was associated with APOBEC-signature mutagenesis. Clustering by mutation signature identified a high-mutation subset with 75% 5-year survival. mRNA expression clustering refined prior clustering analyses and identified a poor-survival "neuronal" subtype in which the majority of tumors lacked small cell or neuroendocrine histology. Clustering by mRNA, long non-coding RNA (lncRNA), and miRNA expression converged to identify subsets with differential epithelial-mesenchymal transition status, carcinoma in situ scores, histologic features, and survival. Our analyses identified 5 expression subtypes that may stratify response to different treatments. Copyright © 2017 Elsevier Inc. All rights reserved.
Modest validity and fair reproducibility of dietary patterns derived by cluster analysis.
Funtikova, Anna N; Benítez-Arciniega, Alejandra A; Fitó, Montserrat; Schröder, Helmut
2015-03-01
Cluster analysis is widely used to analyze dietary patterns. We aimed to analyze the validity and reproducibility of the dietary patterns defined by cluster analysis derived from a food frequency questionnaire (FFQ). We hypothesized that the dietary patterns derived by cluster analysis have fair to modest reproducibility and validity. Dietary data were collected from 107 individuals from a population-based survey, by an FFQ at baseline (FFQ1) and after 1 year (FFQ2), and by twelve 24-hour dietary recalls (24-HDR). Repeatability and validity were measured by comparing clusters obtained by the FFQ1 and FFQ2 and by the FFQ2 and 24-HDR (reference method), respectively. Cluster analysis identified a "fruits & vegetables" and a "meat" pattern in each dietary data source. Cluster membership was concordant for 66.7% of participants in FFQ1 and FFQ2 (reproducibility), and for 67.0% in FFQ2 and 24-HDR (validity). Spearman correlation analysis showed reasonable reproducibility, especially in the "fruits & vegetables" pattern, and lower validity, also especially in the "fruits & vegetables" pattern. The κ statistic revealed fair validity and reproducibility of clusters. Our findings indicate a reasonable reproducibility and fair to modest validity of dietary patterns derived by cluster analysis. Copyright © 2015 Elsevier Inc. All rights reserved.
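As a rough illustration of the agreement measures reported in the abstract above, the sketch below computes the share of participants assigned to the same pattern by two methods, together with Cohen's κ. The membership lists are hypothetical and are not the study's data; the sketch also assumes that cluster labels from the two sources have already been matched by pattern name, which the abstract does not describe.

```python
from collections import Counter


def concordance_and_kappa(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Share of participants placed in the same pattern by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined with a single label
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical pattern memberships for six participants (illustration only).
ffq1 = ["fv", "fv", "meat", "meat", "fv", "meat"]
ffq2 = ["fv", "meat", "meat", "meat", "fv", "fv"]

observed, kappa = concordance_and_kappa(ffq1, ffq2)
print(f"{observed:.3f} {kappa:.3f}")  # prints "0.667 0.333"
```

With two equally sized patterns, chance agreement is 0.5, so a concordance near two thirds corresponds to a κ of about one third.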
Genetically distinct genogroup IV norovirus strains identified in wastewater.
Kitajima, Masaaki; Rachmadi, Andri T; Iker, Brandon C; Haramoto, Eiji; Gerba, Charles P
2016-12-01
We investigated the prevalence and genetic diversity of genogroup IV norovirus (GIV NoV) strains in wastewater in Arizona, United States, over a 13-month period. Among 50 wastewater samples tested, GIV NoVs were identified in 13 (26%) of the samples. A total of 47 different GIV NoV strains were identified, which were classified into two genetically distinct clusters: the GIV.1 human cluster and a unique genetic cluster closely related to strains previously identified in Japanese wastewater. The results provide additional evidence of the considerable genetic diversity among GIV NoV strains through the analysis of wastewater containing virus strains shed from all populations.
NASA Astrophysics Data System (ADS)
Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan
2017-12-01
Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
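The combination the abstract above recommends, a centered log-ratio transformation (CLT) followed by k-means, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the concentrations are hypothetical, all parts are assumed strictly positive (real geochemical data with zeros or below-detection values would need treatment first), and the starting centers are supplied by hand rather than chosen by any procedure from the paper.

```python
import math


def clr(row):
    """Centered log-ratio transform of one strictly positive composition."""
    if any(v <= 0 for v in row):
        raise ValueError("CLR requires strictly positive parts")
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]


def kmeans(points, centers, iterations=50):
    """Plain Lloyd iteration starting from caller-supplied centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical concentrations of three elements in four samples.
# Rows 0/1 and rows 2/3 share the same proportions at different totals.
samples = [[10, 30, 60], [20, 60, 120], [60, 30, 10], [120, 60, 20]]

transformed = [clr(row) for row in samples]
labels, _ = kmeans(transformed, centers=[transformed[0], transformed[2]])
print(labels)  # prints [0, 0, 1, 1]
```

Because the CLR depends only on ratios between parts, samples with the same proportions but different totals coincide after the transform, which is one reason log-ratio methods are preferred for compositional data.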
An Enhanced K-Means Algorithm for Water Quality Analysis of The Haihe River in China.
Zou, Hui; Zou, Zhihong; Wang, Xiaojing
2015-11-12
The increase and the complexity of data caused by the uncertain environment is today's reality. In order to identify water quality effectively and reliably, this paper presents a modified fast clustering algorithm for water quality analysis. The algorithm has adopted a varying weights K-means cluster algorithm to analyze water monitoring data. The varying weights scheme was the best weighting indicator selected by a modified indicator weight self-adjustment algorithm based on K-means, which is named MIWAS-K-means. The new clustering algorithm avoids the margin of the iteration not being calculated in some cases. With the fast clustering analysis, we can identify the quality of water samples. The algorithm is applied in water quality analysis of the Haihe River (China) data obtained by the monitoring network over a period of eight years (2006-2013) with four indicators at seven different sites (2078 samples). Both the theoretical and simulated results demonstrate that the algorithm is efficient and reliable for water quality analysis of the Haihe River. In addition, the algorithm can be applied to more complex data matrices with high dimensionality.
Epidemiologic Surveillance of Teenage Birth Rates in the United States, 2006-2012.
Amin, Raid; Decesare, Julie Zemaitis; Hans, Jennifer; Roussos-Ross, Kay
2017-06-01
To investigate the geographic variation in the average teenage birth rates by county in the contiguous United States. Data from the National Center for Health Statistics were used in this retrospective cohort to count the total number of live births to females aged 15-19 years by county between 2006 and 2012. Software for disease surveillance and spatial cluster analysis was used to identify clusters of high or low teenage births in counties or areas of greater than 100,000 teenage females. The analysis was then adjusted for percentage of poverty and high school diploma achievement. The unadjusted analysis identified the top 10 clusters of teenage births. The cluster with the highest rate was a city and the surrounding 40 counties, demonstrating an average teen birth rate of 67 per 1,000 females in the age range, 87% higher than the rate in the contiguous United States. Adjustments for poverty rates and high school diploma achievement shifted the top clusters to other areas. Despite an overall national decline in the teenage birth rate, clusters of elevated teenage birth rates remain. These clusters are not random and remain higher than expected when adjusted for poverty and education. This data set provides a framework to focus targeted interventions to reduce teenage birth rates in this high-risk population.
Thaler, Nicholas S; Terranova, Jennifer; Turner, Alisa; Mayfield, Joan; Allen, Daniel N
2015-01-01
Recent studies have examined heterogeneous neuropsychological outcomes in childhood traumatic brain injury (TBI) using cluster analysis. These studies have identified homogeneous subgroups based on tests of IQ, memory, and other cognitive abilities that show some degree of association with specific cognitive, emotional, and behavioral outcomes, and have demonstrated that the clusters derived for children with TBI are different from those observed in normal populations. However, the extent to which these subgroups are stable across abilities has not been examined, and this has significant implications for the generalizability and clinical utility of TBI clusters. The current study addressed this by comparing IQ and memory profiles of 137 children who sustained moderate-to-severe TBI. Cluster analysis of IQ and memory scores indicated that a four-cluster solution was optimal for the IQ scores and a five-cluster solution was optimal for the memory scores. Three clusters on each battery differed primarily by level of performance, while the others had pattern variations. Cross-plotting the clusters across respective IQ and memory test scores indicated that clusters defined by level were generally stable, while clusters defined by pattern differed. Notably, children with slower processing speed exhibited low-average to below-average performance on memory indexes. These results provide some support for the stability of previously identified memory and IQ clusters and provide information about the relationship between IQ and memory in children with TBI.
2013-01-01
Background A general trend towards positive patient-reported evaluations of hospitals could be taken as a sign that most patients form a homogeneous, reasonably pleased group, and consequently that there is little need for quality improvement. The objective of this study was to explore this assumption by identifying and statistically validating clusters of patients based on their evaluation of outcomes related to overall satisfaction, malpractice and benefit of treatment. Methods Data were collected using a national patient-experience survey of 61 hospitals in the 4 health regions in Norway during spring 2011. Postal questionnaires were mailed to 23,420 patients after their discharge from hospital. Cluster analysis was performed to identify response clusters of patients, based on their responses to single items about overall patient satisfaction, benefit of treatment and perception of malpractice. Results Cluster analysis identified six response groups, including one cluster with systematically poorer evaluation across outcomes (18.5% of patients) and one small outlier group (5.3%) with very poor scores across all outcomes. One-way ANOVA with post-hoc tests showed that most differences between the six response groups on the three outcome items were significant. The response groups were significantly associated with nine patient-experience indicators (p < 0.001), and all groups were significantly different from each of the other groups on a majority of the patient-experience indicators. Clusters were significantly associated with age, education, self-perceived health, gender, and the tendency to write open comments in the questionnaire. Conclusions The study identified five response clusters with distinct patient-reported outcome scores, in addition to a heterogeneous outlier group with very poor scores across all outcomes. 
The outlier group and the cluster with systematically poorer evaluation across outcomes comprised almost one-quarter of all patients, clearly demonstrating the need to tailor quality initiatives and improve patient-perceived quality in hospitals. More research on patient clustering in patient evaluation is needed, as well as standardization of methodology to increase comparability across studies. PMID:23433450
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hu, Lin; Maroudas, Dimitrios, E-mail: maroudas@ecs.umass.edu; Hammond, Karl D.
We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (He_n, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He_4 and He_5 clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen recycling and tritium retention in fusion tokamaks.
Cluster analysis of sputum cytokine-high profiles reveals diversity in T(h)2-high asthma patients.
Seys, Sven F; Scheers, Hans; Van den Brande, Paul; Marijsse, Gudrun; Dilissen, Ellen; Van Den Bergh, Annelies; Goeminne, Pieter C; Hellings, Peter W; Ceuppens, Jan L; Dupont, Lieven J; Bullens, Dominique M A
2017-02-23
Asthma is characterized by a heterogeneous inflammatory profile and can be subdivided into T(h)2-high and T(h)2-low airway inflammation. Profiling of a broader panel of airway cytokines in large unselected patient cohorts is lacking. Patients (n = 205) were defined as being "cytokine-low/high" if sputum mRNA expression of a particular cytokine was outside the respective 10th/90th percentile range of the control group (n = 80). Unsupervised hierarchical clustering was used to determine clusters based on sputum cytokine profiles. Half of the patients (n = 108; 52.6%) had a classical T(h)2-high ("IL-4-, IL-5- and/or IL-13-high") sputum cytokine profile. Unsupervised cluster analysis revealed 5 clusters. Patients with an "IL-4- and/or IL-13-high" pattern surprisingly did not cluster but were equally distributed among the 5 clusters. Patients with an "IL-5-, IL-17A-/F- and IL-25-high" profile were restricted to cluster 1 (n = 24) with increased sputum eosinophil as well as neutrophil counts and poor lung function parameters at baseline and 2 years later. Four other clusters were identified: "IL-5-high or IL-10-high" (n = 16), "IL-6-high" (n = 8), "IL-22-high" (n = 25). Cluster 5 (n = 132) consists of patients without a "cytokine-high" pattern or patients with only high IL-4 and/or IL-13. We identified 5 unique asthma molecular phenotypes by biological clustering. Type 2 cytokines cluster with non-type 2 cytokines in 4 out of 5 clusters. Unsupervised analysis thus does not support a priori type 2 versus non-type 2 molecular phenotypes. www.clinicaltrials.gov NCT01224938. Registered 18 October 2010.
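The "cytokine-low/high" rule described in the abstract above can be sketched as a percentile threshold against a control group. This is a minimal sketch under assumptions: the control values and patient values are hypothetical, the same control set is reused for every cytokine purely for brevity, and the percentile interpolation (`method="inclusive"`) is a choice made here, since the abstract does not state how percentiles were computed.

```python
from statistics import quantiles


def cytokine_status(patient_value, control_values):
    """Classify one measurement against the controls' 10th/90th percentiles."""
    deciles = quantiles(control_values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    if patient_value > p90:
        return "high"
    if patient_value < p10:
        return "low"
    return "normal"


def is_th2_high(profile):
    """T(h)2-high as quoted in the abstract: IL-4-, IL-5- and/or IL-13-high."""
    return any(profile.get(name) == "high" for name in ("IL-4", "IL-5", "IL-13"))


# Hypothetical control expression values (arbitrary units): 0, 10, ..., 100.
controls = list(range(0, 101, 10))

# Hypothetical patient measurements, classified against the same controls.
patient_values = {"IL-4": 50, "IL-5": 95, "IL-17A": 5}
profile = {name: cytokine_status(v, controls) for name, v in patient_values.items()}

print(profile)               # prints {'IL-4': 'normal', 'IL-5': 'high', 'IL-17A': 'low'}
print(is_th2_high(profile))  # prints True
```

The resulting high/normal/low profiles are the kind of input on which an unsupervised hierarchical clustering, as used in the study, could then operate.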
Cellucci, Tania; Tyrrell, Pascal N; Twilt, Marinka; Sheikh, Shehla; Benseler, Susanne M
2014-03-01
To identify distinct clusters of children with inflammatory brain diseases based on clinical, laboratory, and imaging features at presentation, to assess which features contribute strongly to the development of clusters, and to compare additional features between the identified clusters. A single-center cohort study was performed with children who had been diagnosed as having an inflammatory brain disease between June 1, 1989 and December 31, 2010. Demographic, clinical, laboratory, neuroimaging, and histologic data at diagnosis were collected. K-means cluster analysis was performed to identify clusters of patients based on their presenting features. Associations between the clusters and patient variables, such as diagnoses, were determined. A total of 147 children (50% female; median age 8.8 years) were identified: 105 with primary central nervous system (CNS) vasculitis, 11 with secondary CNS vasculitis, 8 with neuronal antibody syndromes, 6 with postinfectious syndromes, and 17 with other inflammatory brain diseases. Three distinct clusters were identified. Paresis and speech deficits were the most common presenting features in cluster 1. Children in cluster 2 were likely to present with behavior changes, cognitive dysfunction, and seizures, while those in cluster 3 experienced ataxia, vision abnormalities, and seizures. Lesions seen on T2/fluid-attenuated inversion recovery sequences of magnetic resonance imaging were common in all clusters, but unilateral ischemic lesions were more prominent in cluster 1. The clusters were associated with specific diagnoses and diagnostic test results. Children with inflammatory brain diseases presented with distinct phenotypical patterns that are associated with specific diagnoses. This information may inform the development of a diagnostic classification of childhood inflammatory brain diseases and suggest that specific pathways of diagnostic evaluation are warranted. Copyright © 2014 by the American College of Rheumatology.
Spatio-Temporal Analysis of Smear-Positive Tuberculosis in the Sidama Zone, Southern Ethiopia
Dangisso, Mesay Hailu; Datiko, Daniel Gemechu; Lindtjørn, Bernt
2015-01-01
Background Tuberculosis (TB) is a disease of public health concern, with a varying distribution across settings depending on socio-economic status, HIV burden, availability and performance of the health system. Ethiopia is a country with a high burden of TB, with regional variations in TB case notification rates (CNRs). However, TB program reports are often compiled and reported at higher administrative units that do not show the burden at lower units, so there is limited information about the spatial distribution of the disease. We therefore aim to assess the spatial distribution and presence of the spatio-temporal clustering of the disease in different geographic settings over 10 years in the Sidama Zone in southern Ethiopia. Methods Retrospective space-time and spatial analyses were carried out at the kebele level (the lowest administrative unit within a district) to identify spatial and space-time clusters of smear-positive pulmonary TB (PTB). Scan statistics, Global Moran’s I, and Getis-Ord (Gi*) statistics were all used to help analyze the spatial distribution and clusters of the disease across settings. Results A total of 22,545 smear-positive PTB cases notified over 10 years were used for spatial analysis. In a purely spatial analysis, we identified the most likely cluster of smear-positive PTB in 192 kebeles in eight districts (RR = 2, p<0.001), with 12,155 observed and 8,668 expected cases. The Gi* statistic also identified the clusters in the same areas, and the spatial clusters showed stability in most areas in each year during the study period. The space-time analysis also detected the most likely cluster in 193 kebeles in the same eight districts (RR = 1.92, p<0.001), with 7,584 observed and 4,738 expected cases in 2003-2012. Conclusion The study found variations in CNRs and significant spatio-temporal clusters of smear-positive PTB in the Sidama Zone. 
The findings can be used to guide TB control programs to devise effective TB control strategies for the geographic areas characterized by the highest CNRs. Further studies are required to understand the factors associated with clustering based on individual level locations and investigation of cases. PMID:26030162
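Of the statistics named in the abstract above, Global Moran's I is the simplest to write out. The sketch below is illustrative only: the notification rates and the chain-shaped neighbour matrix are hypothetical, binary adjacency weights are an assumption made here, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    variance_sum = sum(d * d for d in dev)
    if total_weight == 0 or variance_sum == 0:
        raise ValueError("weights and values must both vary")
    # Weighted sum of cross-products of deviations between neighbouring areas.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * (cross / variance_sum)


# Four hypothetical areas arranged in a line; 1 marks adjacent areas.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([10, 10, 1, 1], chain)    # similar rates sit side by side
alternating = morans_i([10, 1, 10, 1], chain)  # high and low rates alternate
print(round(clustered, 3), round(alternating, 3))  # prints "0.333 -1.0"
```

Positive values indicate that neighbouring areas tend to have similar rates, while negative values indicate a checkerboard-like pattern.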
Abramyan, Tigran M; Snyder, James A; Thyparambil, Aby A; Stuart, Steven J; Latour, Robert A
2016-08-05
Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
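The general idea of choosing the number of clusters and the linkage rule by a validation score, as the abstract above describes, can be sketched as follows. The abstract does not name its three validation techniques or its algorithms, so the mean silhouette width and the single/complete linkages used here are stand-ins chosen for illustration, and the two-dimensional points are hypothetical rather than orientation or conformation data.

```python
import math


def agglomerate(points, k, linkage=max):
    """Merge clusters bottom-up until k remain (max: complete, min: single)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: three tight groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

results = {
    (name, k): mean_silhouette(points, agglomerate(points, k, linkage))
    for name, linkage in (("complete", max), ("single", min))
    for k in (2, 3, 4)
}
best_method, best_k = max(results, key=results.get)
print(best_k)  # prints 3
```

On data this well separated both linkages agree, so the choice of method is a tie; on real trajectories the scores would generally differ between algorithms.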
Identifying Peer Institutions Using Cluster Analysis
ERIC Educational Resources Information Center
Boronico, Jess; Choksi, Shail S.
2012-01-01
The New York Institute of Technology's (NYIT) School of Management (SOM) wishes to develop a list of peer institutions for the purpose of benchmarking and monitoring/improving performance against other business schools. The procedure utilizes relevant criteria for the purpose of establishing this peer group by way of a cluster analysis. The…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Belles, Randy J.; Omitaomu, Olufemi A.
2014-09-01
Geographic information systems (GIS) technology was applied to analyze federal energy demand across the contiguous US. Several federal energy clusters were previously identified, including Hampton Roads, Virginia, which was subsequently studied in detail. This study provides an analysis of three additional diverse federal energy clusters. The analysis shows that there are potential sites in various federal energy clusters that could be evaluated further for placement of an integral pressurized-water reactor (iPWR) to support meeting federal clean energy goals.
Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.
Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço
2017-11-01
The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help us interpret the clustering results. The data were best described by two clusters, which may reflect the approximate number of drug sources supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples. © 2017 American Academy of Forensic Sciences.
Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun
2017-12-01
Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. 
Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Dumuid, Dorothea; Olds, T; Lewis, L K; Martin-Fernández, J A; Barreira, T; Broyles, S; Chaput, J-P; Fogelholm, M; Hu, G; Kuriyan, R; Kurpad, A; Lambert, E V; Maia, J; Matsudo, V; Onywera, V O; Sarmiento, O L; Standage, M; Tremblay, M S; Tudor-Locke, C; Zhao, P; Katzmarzyk, P; Gillison, F; Maher, C
2018-02-01
The relationship between children's adiposity and lifestyle behaviour patterns is an area of growing interest. The objectives of this study are to identify clusters of children based on lifestyle behaviours and compare children's adiposity among clusters. Cross-sectional data from the International Study of Childhood Obesity, Lifestyle and the Environment were used. The participants were children (9-11 years) from 12 nations (n = 5710). 24-h accelerometry and self-reported diet and screen time were clustering input variables. Objectively measured adiposity indicators were waist-to-height ratio, percent body fat and body mass index z-scores. Sex-stratified analyses were performed on the global sample and repeated on a site-wise basis. Cluster analysis (using isometric log ratios for compositional data) was used to identify common lifestyle behaviour patterns. Site representation and adiposity were compared across clusters using linear models. Four clusters emerged: (1) Junk Food Screenies, (2) Actives, (3) Sitters and (4) All-Rounders. Countries were represented differently among clusters. Chinese children were over-represented in Sitters and Colombian children in Actives. Adiposity varied across clusters, being highest in Sitters and lowest in Actives. Children from different sites clustered into groups of similar lifestyle behaviours. Cluster membership was linked with differing adiposity. Findings support the implementation of activity interventions in all countries, targeting both physical activity and sedentary time. © 2016 World Obesity Federation.
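The isometric log-ratio step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common "pivot coordinate" form of the transform and uses made-up hours for a 24-h day; the abstract does not say which ilr basis the authors used, and the function names are hypothetical.

```python
import math

def closure(parts):
    """Rescale a composition so that its parts sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def ilr_pivot(parts):
    """Isometric log-ratio (pivot) coordinates of a D-part composition.

    Coordinate i contrasts part i with the geometric mean of the parts
    that follow it. All parts must be strictly positive.
    """
    logs = [math.log(v) for v in closure(parts)]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        mean_rest = sum(rest) / len(rest)  # log of the geometric mean
        scale = math.sqrt(len(rest) / (len(rest) + 1))
        coords.append(scale * (logs[i] - mean_rest))
    return coords

# Hypothetical day: sleep, sedentary, light activity, vigorous activity (hours).
day = [9.5, 8.0, 5.5, 1.0]
print(ilr_pivot(day))  # three unconstrained coordinates for four parts
```

The transformed coordinates are no longer constrained to sum to a constant, which is why distance-based clustering is usually applied to them rather than to the raw proportions.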
Hummel, Michelle; Wood, Nathan J.; Schweikert, Amy; Stacey, Mark T.; Jones, Jeanne; Barnard, Patrick L.; Erikson, Li H.
2018-01-01
Sea level is projected to rise over the coming decades, further increasing the extent of flooding hazards in coastal communities. Efforts to address potential impacts from climate-driven coastal hazards have called for collaboration among communities to strengthen the application of best practices. However, communities currently lack practical tools for identifying potential partner communities based on similar hazard exposure characteristics. This study uses statistical cluster analysis to identify similarities in community exposure to flooding hazards for a suite of sea level rise and storm scenarios. We demonstrate this approach using 63 jurisdictions in the San Francisco Bay region of California (USA) and compare 21 distinct exposure variables related to residents, employees, and structures for six hazard scenario combinations of sea level rise and storms. Results indicate that cluster analysis can provide an effective mechanism for identifying community groupings. Cluster compositions changed based on the selected societal variables and sea level rise scenarios, suggesting that a community could participate in multiple networks to target specific issues or policy interventions. The proposed clustering approach can serve as a data-driven foundation to help communities identify other communities with similar adaptation challenges and to enhance regional efforts that aim to facilitate adaptation planning and investment prioritization.
Spatiotemporal Analysis of the Ebola Hemorrhagic Fever in West Africa in 2014
NASA Astrophysics Data System (ADS)
Xu, M.; Cao, C. X.; Guo, H. F.
2017-09-01
Ebola hemorrhagic fever (EHF) is an acute, highly contagious hemorrhagic disease caused by the Ebola virus. This paper aimed to explore the possible clustering areas of EHF cases in West Africa in 2014 and to identify endemic areas and their trends by means of space-time analysis. We mapped the distribution of EHF incidence and explored statistically significant spatial, temporal and space-time disease clusters. We used hotspot analysis to find the spatial clustering pattern on the basis of the actual outbreak cases. Spatial-temporal cluster analysis was used to analyze the spatial or temporal aggregation of the disease and to examine whether its distribution is statistically significant. Local clusters were investigated using Kulldorff's scan statistic approach. The results reveal that the epidemic was mainly concentrated in the western part of Africa near the North Atlantic, with an obvious regional distribution. For the current epidemic, we found areas of high incidence of Ebola virus disease (EVD) by means of spatial cluster analysis.
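As a rough illustration of the scan statistic named in this abstract, the sketch below scores candidate windows with the Poisson log-likelihood ratio commonly used in Kulldorff-style scans and picks the highest-scoring one. The window names and counts are invented, the function names are hypothetical, and the Monte Carlo step that would give a p-value is left out.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate window.

    Returns 0 when the window has no excess of cases. expected_in must
    be positive and scaled so expected counts sum to total_cases.
    """
    c, e, n = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the window.
    outside = (n - c) * math.log((n - c) / (n - e)) if n > c else 0.0
    return inside + outside

def most_likely_cluster(windows, total_cases, total_population):
    """Return (name, score) of the window with the largest ratio.

    windows is a list of (name, cases, population) tuples; expected
    cases are taken proportional to population (an assumption).
    """
    best = None
    for name, cases, population in windows:
        expected = total_cases * population / total_population
        score = poisson_llr(cases, expected, total_cases)
        if best is None or score > best[1]:
            best = (name, score)
    return best

# Hypothetical windows: (name, observed cases, population at risk).
windows = [("A", 20, 100), ("B", 12, 100), ("C", 68, 800)]
print(most_likely_cluster(windows, total_cases=100, total_population=1000))
```

In a full analysis the same maximum would be recomputed on many randomized data sets to judge whether the observed value is unusually large.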
A comparison of heuristic and model-based clustering methods for dietary pattern analysis.
Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia
2016-02-01
Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.
Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.
Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio
Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Technical and biological reproducibility ranged between 96.8-99.4% and 47.6-94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable.
2012-01-01
Background Although knowledge on single health-related behaviors and their association with health parameters is available, research on multiple health-related behaviors is needed to understand the interactions among these behaviors. The aims of the study were (a) to identify typical health-related behavior patterns in German adolescents focusing on physical activity, media use and dietary behavior; (b) to describe the socio-demographic correlates of the identified clusters and (c) to study their association with overweight. Methods Within the framework of the German Health Interview and Examination Survey for Children and Adolescents (KiGGS) and the “Motorik-Modul” (MoMo), 1,643 German adolescents (11–17 years) completed a questionnaire assessing the amount and type of weekly physical activity in sports clubs and during leisure time, weekly use of television, computer and console games and the frequency and amount of food consumption. From this data the three indices ‘physical activity’, ‘media use’ and ‘healthy nutrition’ were derived and included in a cluster analysis conducted with Ward’s Method and K-means analysis. Chi-square tests were performed to identify socio-demographic correlates of the clusters as well as their association with overweight. Results Four stable clusters representing typical health-related behavior patterns were identified: Cluster 1 (16.2%)—high scores in physical activity index and average scores in media use index and healthy nutrition index; cluster 2 (34.6%)—high healthy nutrition score and below average scores in the other two indices; cluster 3 (18.4%)—low physical activity score, low healthy nutrition score and very high media use score; cluster 4 (30.5%)—below average scores on all three indices. Boys were overrepresented in the clusters 1 and 3, and the relative number of adolescents with low socio-economic status as well as overweight was significantly higher than average in cluster 3. 
Conclusions Meaningful and stable clusters of health-related behavior were identified. These results confirm findings of another youth study hence supporting the assumption that these clusters represent typical behavior patterns of adolescents. These results are particularly relevant for the characterization of target groups for primary prevention of lifestyle diseases. PMID:23273134
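The clustering of the three behaviour indices can be pictured with a small sketch of the k-means step alone. It assumes the indices are standardized first and that starting centers are supplied from elsewhere (for instance from a Ward's-method solution); the abstract does not spell out how the two methods were combined, and the data and function names below are hypothetical.

```python
import math

def zscores(column):
    """Standardize one variable to mean 0 and sample SD 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm from given starting centers.

    Returns (labels, centers). A cluster that loses all its members
    keeps its previous center.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers

# Hypothetical raw index scores for six adolescents.
activity = zscores([1, 2, 1, 8, 9, 8])
media = zscores([9, 8, 9, 2, 1, 2])
points = list(zip(activity, media))
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
```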
Hale, Corinne R; Casey, Joseph E; Ricciardi, Philip W R
2014-02-01
Wechsler Intelligence Test for Children-IV core subtest scores of 472 children were cluster analyzed to determine if reliable and valid subgroups would emerge. Three subgroups were identified. Clusters were reliable across different stages of the analysis as well as across algorithms and samples. With respect to external validity, the Globally Low cluster differed from the other two clusters on Wechsler Individual Achievement Test-II Word Reading, Numerical Operations, and Spelling subtests, whereas the latter two clusters did not differ from one another. The clusters derived have been identified in studies using previous WISC editions. Clusters characterized by poor performance on subtests historically associated with the VIQ (i.e., VCI + WMI) and PIQ (i.e., POI + PSI) did not emerge, nor did a cluster characterized by low scores on PRI subtests. Picture Concepts represented the highest subtest score in every cluster, failing to vary in a predictable manner with the other PRI subtests.
Leech, Rebecca M; McNaughton, Sarah A; Timperio, Anna
2014-01-22
Diet, physical activity (PA) and sedentary behavior are important, yet modifiable, determinants of obesity. Recent research into the clustering of these behaviors suggests that children and adolescents have multiple obesogenic risk factors. This paper reviews studies using empirical, data-driven methodologies, such as cluster analysis (CA) and latent class analysis (LCA), to identify clustering patterns of diet, PA and sedentary behavior among children or adolescents and their associations with socio-demographic indicators, and overweight and obesity. A literature search of electronic databases was undertaken to identify studies which have used data-driven methodologies to investigate the clustering of diet, PA and sedentary behavior among children and adolescents aged 5-18 years old. Eighteen studies (62% of potential studies) were identified that met the inclusion criteria, of which eight examined the clustering of PA and sedentary behavior and eight examined diet, PA and sedentary behavior. Studies were mostly cross-sectional and conducted in older children and adolescents (≥ 9 years). Findings from the review suggest that obesogenic cluster patterns are complex with a mixed PA/sedentary behavior cluster observed most frequently, but healthy and unhealthy patterning of all three behaviors was also reported. Cluster membership was found to differ according to age, gender and socio-economic status (SES). The tendency for older children/adolescents, particularly females, to comprise clusters defined by low PA was the most robust finding. Findings to support an association between obesogenic cluster patterns and overweight and obesity were inconclusive, with longitudinal research in this area limited. Diet, PA and sedentary behavior cluster together in complex ways that are not well understood. Further research, particularly in younger children, is needed to understand how cluster membership differs according to socio-demographic profile. 
Longitudinal research is also essential to establish how different cluster patterns track over time and their influence on the development of overweight and obesity. PMID:24450617
ERIC Educational Resources Information Center
Luo, Wen; Hughes, Jan N.; Liew, Jeffrey; Kwok, Oiman
2009-01-01
Based on a sample of 480 academically at-risk first graders, we used a cluster analysis involving multimethod assessment (i.e., teacher-report, peer-evaluation, and self-report) of behavioral and psychological engagement to identify subtypes of academic engagement. Four theoretically and practically meaningful clusters were identified and labeled…
Denoth, Francesca; Scalese, Marco; Siciliano, Valeria; Di Renzo, Laura; De Lorenzo, Antonino; Molinaro, Sabrina
2016-06-01
(a) To identify clusters of eating patterns among the Italian population aged 15-64 years, focusing on typical Mediterranean diet (Med-diet) items consumption; (b) to examine the distribution of eating habits, as identified clusters, among age classes and genders; (c) evaluate the impact of: belonging to a specific eating cluster, level of physical activity (PA), sociocultural and psychological factors, as elements determining weight abnormalities. Data for this cross-sectional study were collected using self-reporting questionnaires administered to a sample of 33,127 subjects participating in the Italian population survey on alcohol and other drugs (IPSAD(®)2011). The cluster analysis was performed on a subsample (n = 5278 subjects) which provided information on eating habits, and adapted to identify categories of eating patterns. Stepwise multinomial regression analysis was performed to evaluate the associations between weight categories and eating clusters, adjusted for the following background variables: PA levels, sociocultural and psychological factors. Three clusters were identified: "Mediterranean-like", "Western-like" and "low fruit/vegetables". Frequent consumption of Med-diet patterns was more common among females and elderly. The relationship between overweight/obesity and male gender, educational level, PA, depression and eating disorders (p < 0.05) was confirmed. Belonging to a cluster other than "Mediterranean-like" was significantly associated with obesity. The low consumption of Med-diet patterns among youth, and the frequent association of sociocultural, psychological issues and inappropriate lifestyle with overweight/obesity, highlight the need for an interdisciplinary approach including market policies, to promote a wider awareness of the Mediterranean eating habit benefits in combination with an appropriate lifestyle.
Statistical Significance for Hierarchical Clustering
Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.
2017-01-01
Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990
ERIC Educational Resources Information Center
Heffel, Carly J.; Riggs, Shelley A.; Ruiz, John M.; Ruggles, Mark
2015-01-01
Although suicide clusters have been identified in many populations, research exploring the role of online communication in the aftermath of a suicide cluster is extremely limited. This study used the Consensual Qualitative Research method to analyze interviews with ten high school students 1 year after a suicide cluster in a small suburban school…
ERIC Educational Resources Information Center
Scharfenberg, Franz-Josef; Bogner, Franz X.
2013-01-01
This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and…
[Spatial analysis of syphilis and gonorrhea infections in a Public Health Service in Madrid].
Wijers, Irene G M; Sánchez Gómez, Amaya; Taveira Jiménez, Jose Antonio
2017-06-21
Sexually transmitted diseases are a significant public health problem. Within the Madrid Autonomous Region, the districts with the highest syphilis and gonorrhea incidences are part of the same Public Health Service (Servicio de Salud Pública del Área 7, SSPA 7). The objective of this study was to identify, by spatial analysis, clusters of syphilis and gonorrhea infections in this SSPA in Madrid. All confirmed syphilis and gonorrhea cases registered in SSPA 7 in Madrid were selected. Moran's I was calculated in order to identify the existence of spatial autocorrelation and a cluster analysis was performed. Clusters and cumulative incidences (CI) per health zone were mapped. The district with most cases was Centro (CI: 67.5 and 160.7 per 100,000 inhabitants for syphilis and gonorrhea, respectively) with the highest CI (120.0 and 322.6 per 100,000 inhabitants) in the Justicia health zone. 91.6% of all syphilis cases and 89.6% of gonorrhea cases were among men who have sex with men (MSM). Moran's I was 0.54 and 0.55 (p=0.001) for syphilis and gonorrhea, respectively. For syphilis, a cluster was identified including the six health zones of the Centro district, with a relative risk (RR) of 6.66 (p=0.001). For gonorrhea, a cluster was found including the Centro district, three health zones of the Chamberí district and one of Latina (RR 5.05; p=0.001). Centro was the district with most cases of syphilis and gonorrhea and the most affected population was MSM. For both infections, clusters were found with an important overlap. By identifying the most vulnerable health zones and populations, these results can help to design public health measures for preventing sexually transmitted diseases.
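For readers unfamiliar with Moran's I, the following minimal sketch computes the global statistic from a list of zone values and a spatial weights matrix. The four zones and their adjacency are invented for illustration, the function name is hypothetical, and the permutation test that would yield a p-value is not shown.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed in n zones.

    weights[i][j] is the spatial weight between zones i and j
    (for example 1 for neighbours, 0 otherwise, 0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / w_sum) * cross / denom

# Hypothetical incidences in four zones arranged in a row (1-2, 2-3, 3-4 adjacent).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1.0, 2.0, 3.0, 4.0], adjacency))
```

Values near zero suggest no spatial pattern, positive values suggest that similar incidences sit next to each other, and negative values suggest alternation.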
Identification of Urban Leprosy Clusters
Paschoal, José Antonio Armani; Paschoal, Vania Del'Arco; Nardi, Susilene Maria Tonelli; Rosa, Patrícia Sammarco; Ismael, Manuela Gallo y Sanches; Sichieri, Eduvaldo Paulo
2013-01-01
Overpopulation of urban areas results from constant migrations that cause disordered urban growth, constituting clusters defined as sets of people or activities concentrated in relatively small physical spaces that often involve precarious conditions. Aim. Using residential grouping, the aim was to identify possible clusters of individuals in São José do Rio Preto, Sao Paulo, Brazil, who have or have had leprosy. Methods. A population-based, descriptive, ecological study using the MapInfo and CrimeStat techniques, geoprocessing, and space-time analysis evaluated the location of 425 people treated for leprosy between 1998 and 2010. Clusters were defined as concentrations of at least 8 people with leprosy; a distance of up to 300 meters between residences was adopted. Additionally, the year of starting treatment and the clinical forms of the disease were analyzed. Results. Ninety-eight (23.1%) of 425 geocoded cases were located within one of ten clusters identified in this study, and 129 cases (30.3%) were in the region of a second-order cluster, an area considered of high risk for the disease. Conclusion. This study identified ten clusters of leprosy cases in the city and identified an area of high risk for the appearance of new cases of the disease. PMID:24288467
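The cluster definition given in this abstract (at least 8 cases, residences up to 300 meters apart) can be read in more than one way. The sketch below shows one simple reading: link any two residences within the distance limit, then keep linked groups that reach the minimum size. This is an illustrative assumption, not the CrimeStat routine the authors ran, and it expects coordinates already projected to meters; the function name is hypothetical.

```python
import math

def threshold_clusters(points, max_dist=300.0, min_size=8):
    """Group points linked by chains of distances <= max_dist.

    Returns a list of clusters, each a sorted list of point indices,
    keeping only groups with at least min_size members.
    """
    n = len(points)
    seen = [False] * n
    clusters = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:  # depth-first walk over linked residences
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and math.dist(points[i], points[j]) <= max_dist:
                    seen[j] = True
                    stack.append(j)
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters

# Hypothetical residences: eight along one street, two far away.
homes = [(i * 100.0, 0.0) for i in range(8)] + [(5000.0, 0.0), (5100.0, 0.0)]
print(threshold_clusters(homes))
```

Chained linking can produce elongated groups, so a stricter reading (every pair within the limit) would give smaller clusters.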
Borri, Marco; Schmidt, Maria A; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M; Partridge, Mike; Bhide, Shreerang A; Nutting, Christopher M; Harrington, Kevin J; Newbold, Katie L; Leach, Martin O
2015-01-01
To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
Characterization of the CPAP-treated patient population in Catalonia
Gavaldá, Ricard; Teixidó, Ivan; Woehrle, Holger; Rué, Montserrat; Solsona, Francesc; Escarrabill, Joan; Colls, Cristina; García-Altés, Anna; de Batlle, Jordi; Sánchez de-la-Torre, Manuel
2017-01-01
There are different phenotypes of obstructive sleep apnoea (OSA), many of which have not been characterised. Identification of these different phenotypes is important in defining prognosis and guiding the therapeutic strategy. The aim of this study was to characterise the entire population of continuous positive airway pressure (CPAP)-treated patients in Catalonia and identify specific patient profiles using cluster analysis. A total of 72,217 CPAP-treated patients who contacted the Catalan Health System (CatSalut) during the years 2012 and 2013 were included. Six clusters were identified, classified as “Neoplastic patients” (Cluster 1, 10.4%), “Metabolic syndrome patients” (Cluster 2, 27.7%), “Asthmatic patients” (Cluster 3, 5.8%), “Musculoskeletal and joint disorder patients” (Cluster 4, 10.3%), “Patients with few comorbidities” (Cluster 5, 35.6%) and “Oldest and cardiac disease patients” (Cluster 6, 10.2%). Healthcare facility use and mortality were highest in patients from Cluster 1 and 6. Conversely, patients in Clusters 2 and 4 had low morbidity, mortality and healthcare resource use. Our findings highlight the heterogeneity of CPAP-treated patients, and suggest that OSA is associated with a different prognosis in the clusters identified. These results suggest the need for a comprehensive and individualised approach to CPAP treatment of OSA. PMID:28934303
Language Learner Motivational Types: A Cluster Analysis Study
ERIC Educational Resources Information Center
Papi, Mostafa; Teimouri, Yasser
2014-01-01
The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…
Automatic script identification from images using cluster-based templates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hochberg, J.; Kerns, L.; Kelly, P.
We have developed a technique for automatically identifying the script used to generate a document that is stored electronically in bit image form. Our approach differs from previous work in that the distinctions among scripts are discovered by an automatic learning procedure, without any hands-on analysis. We first develop a set of representative symbols (templates) for each script in our database (Cyrillic, Roman, etc.). We do this by identifying all textual symbols in a set of training documents, scaling each symbol to a fixed size, clustering similar symbols, pruning minor clusters, and finding each cluster's centroid. To identify a new document's script, we identify and scale a subset of symbols from the document and compare them to the templates for each script. We choose the script whose templates provide the best match. Our current system distinguishes among the Armenian, Burmese, Chinese, Cyrillic, Ethiopic, Greek, Hebrew, Japanese, Korean, Roman, and Thai scripts with over 90% accuracy.
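The matching stage described in this abstract can be sketched as nearest-template scoring. The example assumes symbols are already scaled to a fixed size and flattened to equal-length pixel lists, uses squared Euclidean distance as the match measure, and averages each symbol's best match per script; none of these choices is stated in the abstract, and the tiny 4-pixel "symbols" and script names are made up.

```python
def centroid(symbols):
    """Pixel-wise mean of a cluster of equally sized symbols."""
    n = len(symbols)
    return [sum(px) / n for px in zip(*symbols)]

def distance(a, b):
    """Squared Euclidean distance between two pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify_script(doc_symbols, templates):
    """Pick the script whose templates match the document best.

    templates maps a script name to a list of template vectors.
    Each document symbol is matched to its closest template, and the
    script with the lowest mean distance wins.
    """
    def score(script):
        best = [min(distance(s, t) for t in templates[script]) for s in doc_symbols]
        return sum(best) / len(best)
    return min(templates, key=score)

# Hypothetical training clusters, one template per script.
templates = {
    "script_A": [centroid([[1, 1, 0, 0], [1, 1, 0, 0]])],
    "script_B": [centroid([[0, 0, 1, 1], [0, 1, 1, 1]])],
}
print(identify_script([[1, 1, 0, 0], [1, 0, 0, 0]], templates))
```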
USDA-ARS's Scientific Manuscript database
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...
Ajayi, Alex A; Syed, Moin
2014-10-01
This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust its members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications on psychological adjustment. Copyright © 2014 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Lee, Yii-Ching; Huang, Shian-Chang; Huang, Chih-Hsuan; Wu, Hsin-Hung
2016-01-01
This study uses kernel k-means cluster analysis to identify medical staff with high burnout. The data, collected from October to November 2014, are from the emotional exhaustion dimension of the Chinese version of the Safety Attitudes Questionnaire in a regional teaching hospital in Taiwan. A total of 680 valid questionnaires were obtained, covering the entire staff, including physicians, nurses, technicians, pharmacists, medical administrators, and respiratory therapists. The results show that the kernel k-means method generates 8 clusters. Employees in clusters 1, 4, and 5 are in relatively good condition, whereas employees in clusters 2, 3, 6, 7, and 8 need to be closely monitored because they have a relatively higher degree of burnout. When employees with a higher degree of burnout are identified, hospital management can take action to improve resilience, reduce potential medical errors, and, eventually, enhance patient safety. This study also suggests that hospital management needs to keep track of medical staff's fatigue conditions and provide timely assistance for burnout recovery through employee assistance programs, mindfulness-based stress reduction programs, positivity currency buildup, and forming appreciative inquiry groups. PMID:27895218
Investigation of Spatial and Temporal Trends in Water Quality in Daya Bay, South China Sea
Wu, Mei-Lin; Wang, You-Shao; Dong, Jun-De; Sun, Cui-Ci; Wang, Yu-Tu; Sun, Fu-Lin; Cheng, Hao
2011-01-01
The objective is to identify the spatial and temporal variability of the hydrochemical quality of the water column in a subtropical coastal system, Daya Bay, China. Water samples were collected in four seasons at 12 monitoring sites. The Southeast Asian monsoons, northeasterly from October to the following April and southwesterly from May to September, also have an important influence on water quality in Daya Bay. In the spatial pattern, two groups were identified with the help of multidimensional scaling analysis and cluster analysis. Cluster I consisted of the sites S3, S8, S10 and S11 in the west and north coastal parts of Daya Bay. Cluster I is mainly related to anthropogenic activities such as fish-farming. Cluster II consisted of the rest of the stations in the center, east and south parts of Daya Bay. Cluster II is mainly related to seawater exchange from South China Sea. PMID:21776234
Breast cancer and symptom clusters during radiotherapy.
Matthews, Ellyn E; Schmiege, Sarah J; Cook, Paul F; Sousa, Karen H
2012-01-01
Symptom cluster assessment shifts the clinical focus from a specific symptom to the patient's experience as a whole. Few studies have examined breast cancer symptom clusters during treatment, and fewer studies have addressed symptom clusters during radiation therapy (RT). The theoretical underpinning of this study is the Symptoms Experience Model. Research is needed to identify antecedents and consequences of cancer-related symptom clusters. The present study was intended to determine the clustering of symptoms during RT in women with breast cancer and significant correlations among the symptoms, individual characteristics, and mood. This was a secondary analysis of data from a descriptive correlational study of 93 women at weeks 3 to 7 of RT at centers in the mid-Atlantic region of the United States, who completed the Symptom Distress Scale, the subscales of the Positive and Negative Affect Scale, the Life Orientation Test, and the Self-transcendence Scale. Confirmatory factor analysis revealed symptoms grouped into 3 distinct clusters: pain-insomnia-fatigue, cognitive disturbance-outlook, and gastrointestinal. The pain-insomnia-fatigue and cognitive disturbance-outlook clusters were associated with individual characteristics, optimism, self-transcendence, and positive and negative mood. The gastrointestinal cluster correlated significantly only with positive mood. This study provides insight into symptoms that group together and the relationship of symptom clusters to antecedents and mood. These findings underscore the need to define and standardize the measurement of symptom clusters and understand variability in concurrent symptoms. Attention to symptom clusters shifts the clinical focus from a specific symptom to the patient's experience as a whole and helps identify the most effective interventions.
Hendricks, Brian; Mark-Carew, Miguella
2017-02-01
Lyme disease is the most commonly reported vectorborne disease in the United States. The objective of our study was to identify patterns of Lyme disease reporting after multistate inclusion to mitigate potential border effects. County-level human Lyme disease surveillance data were obtained from Kentucky, Maryland, Ohio, Pennsylvania, Virginia, and West Virginia state health departments. Rate smoothing and Local Moran's I were performed to identify clusters of reporting activity and identify spatial outliers. A logistic generalized estimating equation was performed to identify significant associations in disease clustering over time. Resulting analyses identified statistically significant (P=0.05) clusters of high reporting activity and trends over time. High reporting activity aggregated near border counties in high incidence states, while low reporting aggregated near shared county borders in non-high incidence states. Findings highlight the need for exploratory surveillance approaches to describe the extent to which state level reporting affects accurate estimation of Lyme disease progression. Copyright © 2017 Elsevier Ltd. All rights reserved.
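For readers unfamiliar with the Local Moran's I statistic mentioned above, a minimal sketch of the standard Anselin form is given here. It assumes a binary county adjacency matrix that is row-standardized inside the function; the function name `local_morans_i` is hypothetical, and the rate smoothing and significance testing (usually done by permutation) used in the study are not reproduced.

```python
import numpy as np

def local_morans_i(values, weights):
    """Local Moran's I for each area: I_i = (z_i / m2) * sum_j w_ij * z_j.

    values  : 1-D array of rates, one per area
    weights : square adjacency matrix (row-standardized below)
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)
    # Row-standardize; areas without neighbours keep an all-zero row.
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    z = x - x.mean()                  # deviations from the overall mean
    m2 = (z ** 2).sum() / len(x)      # second moment of the deviations
    lag = w @ z                       # average deviation among neighbours
    return z * lag / m2
```

Positive values indicate an area that resembles its neighbours (high-high or low-low clusters), and negative values indicate spatial outliers. For six areas in a line with rates 10, 10, 10, 1, 1, 1, the two end areas score 1 and the two areas at the high/low boundary score 0.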
Gene duplications in prokaryotes can be associated with environmental adaptation
2010-01-01
Background Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. 
Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment. PMID:20961426
Gene duplications in prokaryotes can be associated with environmental adaptation.
Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn
2010-10-20
Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. 
Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment.
Blanco, Mario R.; Martin, Joshua S.; Kahlscheuer, Matthew L.; Krishnan, Ramya; Abelson, John; Laederach, Alain; Walter, Nils G.
2016-01-01
The spliceosome is the dynamic RNA-protein machine responsible for faithfully splicing introns from precursor messenger RNAs (pre-mRNAs). Many of the dynamic processes required for the proper assembly, catalytic activation, and disassembly of the spliceosome as it acts on its pre-mRNA substrate remain poorly understood, a challenge that persists for many biomolecular machines. Here, we developed a fluorescence-based Single Molecule Cluster Analysis (SiMCAn) tool to dissect the manifold conformational dynamics of a pre-mRNA through the splicing cycle. By clustering common dynamic behaviors derived from selectively blocked splicing reactions, SiMCAn was able to identify signature conformations and dynamic behaviors of multiple ATP-dependent intermediates. In addition, it identified a conformation adopted late in splicing by a 3′ splice site mutant, invoking a mechanism for substrate proofreading. SiMCAn presents a novel framework for interpreting complex single molecule behaviors that should prove widely useful for the comprehensive analysis of a plethora of dynamic cellular machines. PMID:26414013
Analysis of risk factors for cluster behavior of dental implant failures.
Chrcanovic, Bruno Ramos; Kisch, Jenö; Albrektsson, Tomas; Wennerberg, Ann
2017-08-01
Some studies indicated that implant failures are commonly concentrated in few patients. To identify and analyze cluster behavior of dental implant failures among subjects of a retrospective study. This retrospective study included patients receiving at least three implants only. Patients presenting at least three implant failures were classified as presenting a cluster behavior. Univariate and multivariate logistic regression models and generalized estimating equations analysis evaluated the effect of explanatory variables on the cluster behavior. There were 1406 patients with three or more implants (8337 implants, 592 failures). Sixty-seven (4.77%) patients presented cluster behavior, with 56.8% of all implant failures. The intake of antidepressants and bruxism were identified as potential negative factors exerting a statistically significant influence on a cluster behavior at the patient-level. The negative factors at the implant-level were turned implants, short implants, poor bone quality, age of the patient, the intake of medicaments to reduce the acid gastric production, smoking, and bruxism. A cluster pattern among patients with implant failure is highly probable. Factors of interest as predictors for implant failures could be a number of systemic and local factors, although a direct causal relationship cannot be ascertained. © 2017 Wiley Periodicals, Inc.
Passion and intrinsic motivation in digital gaming.
Wang, Chee Keng John; Khoo, Angeline; Liu, Woon Chia; Divaharan, Shanti
2008-02-01
Digital gaming is fast becoming a favorite activity all over the world. Yet very few studies have examined the underlying motivational processes involved in digital gaming. One motivational force that receives little attention in psychology is passion, which could help us understand the motivation of gamers. The purpose of the present study was to identify subgroups of young people with distinctive passion profiles on self-determined regulations, flow dispositions, affect, and engagement time in gaming. One hundred fifty-five students from two secondary schools in Singapore participated in the survey. There were 134 males and 8 females (13 unspecified). The participants completed a questionnaire to measure harmonious passion (HP), obsessive passion (OP), perceived locus of causality, disposition flow, positive and negative affects, and engagement time in gaming. Cluster analysis found three clusters with distinct passion profiles. The first cluster had an average HP/OP profile, the second cluster had a low HP/OP profile, and the third cluster had a high HP/OP profile. The three clusters displayed different levels of cognitive, affective, and behavioral outcomes. Cluster analysis, as this study shows, is useful in identifying groups of gamers with different passion profiles. It has helped us gain a deeper understanding of motivation in digital gaming.
Goad, David M; Zhu, Chuanmei; Kellogg, Elizabeth A
2017-10-01
CLV3/ESR (CLE) proteins are important signaling peptides in plants. The short CLE peptide (12-13 amino acids) is cleaved from a larger pre-propeptide and functions as an extracellular ligand. The CLE family is large and has resisted attempts at classification because the CLE domain is too short for reliable phylogenetic analysis and the pre-propeptide is too variable. We used a model-based search for CLE domains from 57 plant genomes and used the entire pre-propeptide for comprehensive clustering analysis. In total, 1628 CLE genes were identified in land plants, with none recognizable from green algae. These CLEs form 12 groups within which CLE domains are largely conserved and pre-propeptides can be aligned. Most clusters contain sequences from monocots, eudicots and Amborella trichopoda, with sequences from Picea abies, Selaginella moellendorffii and Physcomitrella patens scattered in some clusters. We easily identified previously known clusters involved in vascular differentiation and nodulation. In addition, we found a number of discrete groups whose function remains poorly characterized. Available data indicate that CLE proteins within a cluster are likely to share function, whereas those from different clusters play at least partially different roles. Our analysis provides a foundation for future evolutionary and functional studies. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Chalmet, Kristen; Staelens, Delfien; Blot, Stijn; Dinakis, Sylvie; Pelgrom, Jolanda; Plum, Jean; Vogelaers, Dirk; Vandekerckhove, Linos; Verhofstede, Chris
2010-09-07
The number of HIV-1 infected individuals in the Western world continues to rise. More in-depth understanding of regional HIV-1 epidemics is necessary for the optimal design and adequate use of future prevention strategies. The use of a combination of phylogenetic analysis of HIV sequences, with data on patients' demographics, infection route, clinical information and laboratory results, will allow a better characterization of individuals responsible for local transmission. Baseline HIV-1 pol sequences, obtained through routine drug-resistance testing, from 506 patients, newly diagnosed between 2001 and 2009, were used to construct phylogenetic trees and identify transmission-clusters. Patients' demographics, laboratory and clinical data, were retrieved anonymously. Statistical analysis was performed to identify subtype-specific and transmission-cluster-specific characteristics. Multivariate analysis showed significant differences between the 59.7% of individuals with subtype B infection and the 40.3% non-B infected individuals, with regard to route of transmission, origin, infection with Chlamydia (p = 0.01) and infection with Hepatitis C virus (p = 0.017). More and larger transmission-clusters were identified among the subtype B infections (p < 0.001). Overall, in multivariate analysis, clustering was significantly associated with Caucasian origin, infection through homosexual contact and younger age (all p < 0.001). Bivariate analysis additionally showed a correlation between clustering and syphilis (p < 0.001), higher CD4 counts (p = 0.002), Chlamydia infection (p = 0.013) and primary HIV (p = 0.017). Combination of phylogenetics with demographic information, laboratory and clinical data, revealed that HIV-1 subtype B infected Caucasian men-who-have-sex-with-men with high prevalence of sexually transmitted diseases, account for the majority of local HIV-transmissions. 
This finding elucidates observed epidemiological trends through molecular analysis, and justifies sustained focus in prevention on this high risk group.
Use of multivariate statistics to identify unreliable data obtained using CASA.
Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón
2013-06-01
In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatment or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering realized on two first principal components produced three clusters. Cluster 1 contained evaluations of the two first samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
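The standardize, reduce, then cluster workflow described above can be sketched as follows. This is a rough illustration only: Ward linkage is an assumption (the abstract does not state the linkage), the function name `pca_then_cluster` is hypothetical, and the Chernoff-face step is omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def pca_then_cluster(data, n_components=2, n_clusters=3):
    """Standardize columns, project onto leading principal components,
    then cut a hierarchical clustering tree into n_clusters groups."""
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # column-wise standardization
    u, s, _ = np.linalg.svd(z, full_matrices=False)    # PCA via SVD
    scores = u[:, :n_components] * s[:n_components]    # principal component scores
    explained = (s ** 2) / np.sum(s ** 2)              # variance ratio per component
    tree = linkage(scores, method="ward")              # assumed linkage choice
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, explained[:n_components], labels
```

The sum of the returned variance ratios corresponds to the kind of figure the abstract reports for principal components 1 and 2, and the labels give the cluster membership of each individual measurement.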
Kurukulaaratchy, Ramesh J; Zhang, Hongmei; Patil, Veeresh; Raza, Abid; Karmaus, Wilfried; Ewart, Susan; Arshad, S Hasan
2015-01-01
Rhinitis affects many young adults and often shows comorbidity with asthma. We hypothesized that young adult rhinitis, like asthma, exhibits clinical heterogeneity identifiable by means of cluster analysis. Participants in the Isle of Wight birth cohort (n = 1456) were assessed at 1, 2, 4, 10, and 18 years of age. Cluster analysis was performed on those with rhinitis at age 18 years (n = 468) by using 13 variables defining clinical characteristics. Four clusters were identified. Patients in cluster 1 (n = 128 [27.4%]; ie, moderate childhood-onset rhinitis) had high atopy and eczema prevalence and high total IgE levels but low asthma prevalence. They showed the best lung function at 18 years of age, with normal fraction of exhaled nitric oxide (Feno), low bronchial hyperresponsiveness (BHR), and low bronchodilator reversibility (BDR) but high rhinitis symptoms and treatment. Patients in cluster 2 (n = 199 [42.5%]; ie, mild-adolescence-onset female rhinitis) had the lowest prevalence of comorbid atopy, asthma, and eczema. They had normal lung function and low BHR, BDR, Feno values, and total IgE levels plus low rhinitis symptoms, severity, and treatment. Patients in cluster 3 (n = 59 [12.6%]; ie, severe earliest-onset rhinitis with asthma) had the youngest rhinitis onset plus the highest comorbid asthma (of simultaneous onset) and atopy. They showed the most obstructed lung function with high BHR, BDR, and Feno values plus high rhinitis symptoms, severity, and treatment. Patients in cluster 4 (n = 82 [17.5%]; ie, moderate childhood-onset male rhinitis with asthma) had high atopy, intermediate asthma, and low eczema. They had impaired lung function with high Feno values and total IgE levels but intermediate BHR and BDR. They had moderate rhinitis symptoms. Clinically distinctive adolescent rhinitis clusters are apparent with varying sex and asthma associations plus differing rhinitis severity and treatment needs. 
Copyright © 2014 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Farsadnia, F.; Rostami Kamrood, M.; Moghaddam Nia, A.; Modarres, R.; Bray, M. T.; Han, D.; Sadatinejad, J.
2014-02-01
One of the several methods in estimating flood quantiles in ungauged or data-scarce watersheds is regional frequency analysis. Amongst the approaches to regional frequency analysis, different clustering techniques have been proposed to determine hydrologically homogeneous regions in the literature. Recently, Self-Organization feature Map (SOM), a modern hydroinformatic tool, has been applied in several studies for clustering watersheds. However, further studies are still needed with SOM on the interpretation of SOM output map for identifying hydrologically homogeneous regions. In this study, two-level SOM and three clustering methods (fuzzy c-mean, K-mean, and Ward's Agglomerative hierarchical clustering) are applied in an effort to identify hydrologically homogeneous regions in Mazandaran province watersheds in the north of Iran, and their results are compared with each other. Firstly the SOM is used to form a two-dimensional feature map. Next, the output nodes of the SOM are clustered by using unified distance matrix algorithm and three clustering methods to form regions for flood frequency analysis. The heterogeneity test indicates the four regions achieved by the two-level SOM and Ward approach after adjustments are sufficiently homogeneous. The results suggest that the combination of SOM and Ward is much better than the combination of either SOM and FCM or SOM and K-mean.
The PhytoClust tool for metabolic gene clusters discovery in plant genomes
Fuchs, Lisa-Maria
2017-01-01
Abstract The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. PMID:28486689
The PhytoClust tool for metabolic gene clusters discovery in plant genomes.
Töpfer, Nadine; Fuchs, Lisa-Maria; Aharoni, Asaph
2017-07-07
The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
A cross-species bi-clustering approach to identifying conserved co-regulated genes.
Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo
2016-06-15
A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regularized genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective to deal with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, thus difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. Efficient optimization algorithm has been developed with convergence analysis. 
This approach was first validated on synthetic data and compared to the two-step method and several recent joint clustering methods. We then applied this approach to two real world datasets of gene expression during the pre-implantation embryonic development of the human and mouse. Co-regulated genes consistent between the human and mouse were identified, offering insights into conserved functions, as well as similarities and differences in genome activation timing between the human and mouse embryos. The R package containing the implementation of the proposed method in C++ is available at https://github.com/JavonSun/mvbc.git and also at the R platform https://www.r-project.org/. Contact: jinbo@engr.uconn.edu. © The Author 2016. Published by Oxford University Press.
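The idea of a joint sparse rank-one factorization with a gene indicator shared across species can be illustrated with a simplified sketch. Everything below is an assumption made for illustration: it uses a plain alternating soft-thresholding heuristic rather than the authors' optimization algorithm, it omits the denoising step, and the names `soft_threshold`, `joint_sparse_rank_one`, `lam_u`, and `lam_v` are hypothetical and unrelated to the mvbc package.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink entries toward zero; entries with |a| <= lam become exactly zero."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def joint_sparse_rank_one(mats, lam_u=1.0, lam_v=0.1, iters=50):
    """Approximate each matrix X_k (genes x stages_k) by u v_k^T.

    u   : shared sparse gene vector (nonzero entries mark the gene cluster)
    v_k : sparse stage vector per species (nonzero entries mark active stages)
    """
    mats = [np.asarray(m, dtype=float) for m in mats]
    # Initialize u from the leading left singular vector of the stacked data.
    u = np.linalg.svd(np.hstack(mats), full_matrices=False)[0][:, 0]
    vs = []
    for _ in range(iters):
        # Per-species stage vectors given the shared gene vector.
        vs = [soft_threshold(m.T @ u, lam_v) for m in mats]
        # Shared gene vector given all stage vectors.
        u_new = soft_threshold(sum(m @ v for m, v in zip(mats, vs)), lam_u)
        norm = np.linalg.norm(u_new)
        if norm == 0:          # penalty too strong: no genes selected
            return u_new, vs
        u = u_new / norm
    return u, vs
```

Because the same `u` multiplies every species' matrix, a gene is selected only when its expression pattern is supported across species, which is the property the two-step approach described above lacks.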
Jonsson, Anders; Bonander, Carl; Nilson, Finn; Huss, Fredrik
2017-09-01
Residential fires represent the largest category of fatal fires in Sweden. The purpose of this study was to describe the epidemiology of fatal residential fires in Sweden and to identify clusters of events. Data were collected from a database that combines information on fatal fires with data from forensic examinations and the Swedish Cause of Death-register. Mortality rates were calculated for different strata using population statistics and rescue service turnout reports. Cluster analysis was performed using multiple correspondence analysis with agglomerative hierarchical clustering. Male sex, old age, smoking, and alcohol were identified as risk factors, and the most common primary injury diagnosis was exposure to toxic gases. Compared to non-fatal fires, fatal residential fires more often originated in the bedroom, were more often caused by smoking, and were more likely to occur at night. Six clusters were identified. The first two clusters were both smoking-related, but were separated into (1) fatalities that often involved elderly people, usually female, whose clothes were ignited (17% of the sample), and (2) middle-aged (45-64 years old), often intoxicated men, where the fire usually originated in furniture (30%). Other clusters that were identified in the analysis were related to (3) fires caused by technical fault, started in electrical installations in single houses (13%), (4) cooking appliances left on (8%), (5) events with unknown cause, room and object of origin (25%), and (6) deliberately set fires (7%). Fatal residential fires were unevenly distributed in the Swedish population. To further reduce the incidence of fire mortality, specialized prevention efforts that focus on the different needs of each cluster are required. Cooperation between various societal functions, e.g. rescue services, elderly care, psychiatric clinics and other social services, with an application of both human and technological interventions, should reduce residential fire mortality in Sweden. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Chen, Jin; Roth, Robert E; Naito, Adam T; Lengerich, Eugene J; MacEachren, Alan M
2008-01-01
Background Kulldorff's spatial scan statistic and its software implementation – SaTScan – are widely used for detecting and evaluating geographic clusters. However, two issues make using the method and interpreting its results non-trivial: (1) the method lacks cartographic support for understanding the clusters in geographic context and (2) results from the method are sensitive to parameter choices related to cluster scaling (abbreviated as scaling parameters), but the system provides no direct support for making these choices. We employ both established and novel geovisual analytics methods to address these issues and to enhance the interpretation of SaTScan results. We demonstrate our geovisual analytics approach in a case study analysis of cervical cancer mortality in the U.S. Results We address the first issue by providing an interactive visual interface to support the interpretation of SaTScan results. Our research to address the second issue prompted a broader discussion about the sensitivity of SaTScan results to parameter choices. Sensitivity has two components: (1) the method can identify clusters that, while being statistically significant, have heterogeneous contents comprised of both high-risk and low-risk locations and (2) the method can identify clusters that are unstable in location and size as the spatial scan scaling parameter is varied. To investigate cluster result stability, we conducted multiple SaTScan runs with systematically selected parameters. The results, when scanning a large spatial dataset (e.g., U.S. data aggregated by county), demonstrate that no single spatial scan scaling value is known to be optimal to identify clusters that exist at different scales; instead, multiple scans that vary the parameters are necessary. We introduce a novel method of measuring and visualizing reliability that facilitates identification of homogeneous clusters that are stable across analysis scales. 
Finally, we propose a logical approach to proceed through the analysis of SaTScan results. Conclusion The geovisual analytics approach described in this manuscript facilitates the interpretation of spatial cluster detection methods by providing cartographic representation of SaTScan results and by providing visualization methods and tools that support selection of SaTScan parameters. Our methods distinguish between heterogeneous and homogeneous clusters and assess the stability of clusters across analytic scales. Method We analyzed the cervical cancer mortality data for the United States aggregated by county between 2000 and 2004. We ran SaTScan on the dataset fifty times with different parameter choices. Our geovisual analytics approach couples SaTScan with our visual analytic platform, allowing users to interactively explore and compare SaTScan results produced by different parameter choices. The Standardized Mortality Ratio and reliability scores are visualized for all the counties to identify stable, homogeneous clusters. We evaluated our analysis result by comparing it to that produced by other independent techniques including the Empirical Bayes Smoothing and Kafadar spatial smoother methods. The geovisual analytics approach introduced here is developed and implemented in our Java-based Visual Inquiry Toolkit. PMID:18992163
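The abstract does not give a formula for the reliability score, so the following is only one plausible reading: a county's reliability is the fraction of scan runs in which it falls inside a reported cluster. The function name and the input layout (one set of county identifiers per run) are assumptions for illustration; the sketch does not call SaTScan.

```python
def reliability_scores(locations, scan_results):
    """Fraction of runs in which each location was inside a reported cluster.

    scan_results: one set of location identifiers per scan run (hypothetical
    layout; real scan output would need to be parsed into this form first).
    """
    total_runs = len(scan_results)
    return {
        loc: sum(loc in flagged for flagged in scan_results) / total_runs
        for loc in locations
    }

# Four hypothetical runs with different scaling parameters.
runs = [{"A", "B"}, {"A"}, {"A", "C"}, {"A", "B"}]
scores = reliability_scores(["A", "B", "C"], runs)
# scores == {"A": 1.0, "B": 0.5, "C": 0.25}
```

Under this reading, a county scoring near 1.0 is flagged regardless of the scaling parameter, while a low score marks a location whose cluster membership depends on the parameter choice.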
Bennett, Robert M; Russell, Jon; Cappelleri, Joseph C; Bushmakin, Andrew G; Zlateva, Gergana; Sadosky, Alesia
2010-06-28
The purpose of this study was to determine whether some of the clinical features of fibromyalgia (FM) that patients would like to see improved aggregate into definable clusters. Seven hundred and eighty-eight patients with clinically confirmed FM and baseline pain ≥40 mm on a 100 mm visual analogue scale ranked 5 FM clinical features that the subjects would most like to see improved after treatment (one for each priority quintile) from a list of 20 developed during focus groups. For each subject, clinical features were transformed into vectors with rankings assigned values 1-5 (lowest to highest ranking). Logistic analysis was used to create a distance matrix and hierarchical cluster analysis was applied to identify cluster structure. The frequency of cluster selection was determined, and cluster importance was ranked using cluster scores derived from rankings of the clinical features. Multidimensional scaling was used to visualize and conceptualize cluster relationships. Six clinical feature clusters were identified and named based on their key characteristics. In order of selection frequency, the clusters were Pain (90%; 4 clinical features), Fatigue (89%; 4 clinical features), Domestic (42%; 4 clinical features), Impairment (29%; 3 functions), Affective (21%; 3 clinical features), and Social (9%; 2 functional). The "Pain Cluster" was ranked of greatest importance by 54% of subjects, followed by Fatigue, which was given the highest ranking by 28% of subjects. Multidimensional scaling mapped these clusters to two dimensions: Status (bounded by Physical and Emotional domains), and Setting (bounded by Individual and Group interactions). Common clinical features of FM could be grouped into 6 clusters (Pain, Fatigue, Domestic, Impairment, Affective, and Social) based on patient perception of relevance to treatment. 
Furthermore, these 6 clusters could be charted in the 2 dimensions of Status and Setting, thus providing a unique perspective for interpretation of FM symptomatology.
An Enhanced K-Means Algorithm for Water Quality Analysis of The Haihe River in China
Zou, Hui; Zou, Zhihong; Wang, Xiaojing
2015-01-01
Growth in the volume and complexity of data caused by an uncertain environment is today’s reality. In order to identify water quality effectively and reliably, this paper presents a modified fast clustering algorithm for water quality analysis. The algorithm adopts a varying-weights K-means clustering algorithm to analyze water monitoring data. The varying weights scheme was the best weighting indicator selected by a modified indicator weight self-adjustment algorithm based on K-means, which is named MIWAS-K-means. The new clustering algorithm avoids the margin of the iteration not being calculated in some cases. With the fast clustering analysis, we can identify the quality of water samples. The algorithm is applied in water quality analysis of the Haihe River (China) data obtained by the monitoring network over a period of eight years (2006–2013) with four indicators at seven different sites (2078 samples). Both the theoretical and simulated results demonstrate that the algorithm is efficient and reliable for water quality analysis of the Haihe River. In addition, the algorithm can be applied to more complex data matrices with high dimensionality. PMID:26569283
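To make the role of indicator weights concrete, here is a minimal sketch of K-means with a weighted squared distance. It is not MIWAS-K-means: the self-adjustment of weights is not described in the abstract, so the weights here are fixed inputs, and the function name and toy data are assumptions.

```python
def weighted_kmeans(points, weights, centroids, iterations=20):
    """K-means in which each indicator (column) has a fixed weight in the distance."""
    def dist(p, c):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, c))

    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid under the weighted distance.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Two indicators per sample; the weights decide which indicator drives the grouping.
samples = [[1.0, 10.0], [1.2, 0.0], [5.0, 10.2], [5.1, 0.1]]
by_first, _ = weighted_kmeans(samples, [1.0, 0.0], [samples[0], samples[2]])
by_second, _ = weighted_kmeans(samples, [0.0, 1.0], [samples[0], samples[1]])
```

The same four samples split differently depending on which indicator carries the weight, which is why choosing the weights matters for the resulting water quality classes.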
Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole
2010-06-09
Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability.
Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury
2010-01-01
Background Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability. PMID:20534130
McGuire, Joseph F.; Nyirabahizi, Epiphanie; Kircanski, Katharina; Piacentini, John; Peterson, Alan L.; Woods, Douglas W.; Wilhelm, Sabine; Walkup, John T.; Scahill, Lawrence
2013-01-01
Cluster analytic methods have examined the symptom presentation of chronic tic disorders (CTDs), with limited agreement across studies. The present study investigated patterns, clinical correlates, and treatment outcome of tic symptoms. 239 youth and adults with CTDs completed a battery of assessments at baseline to determine diagnoses, tic severity, and clinical characteristics. Participants were randomly assigned to receive either a comprehensive behavioral intervention for tics (CBIT) or psychoeducation and supportive therapy (PST). A cluster analysis was conducted on the baseline Yale Global Tic Severity Scale (YGTSS) symptom checklist to identify the constellations of tic symptoms. Four tic clusters were identified: Impulse Control and Complex Phonic Tics; Complex Motor Tics; Simple Head Motor/Vocal Tics; and Primarily Simple Motor Tics. Frequencies of tic symptoms showed few differences across youth and adults. Tic clusters had small associations with clinical characteristics and showed no associations with the presence of coexisting psychiatric conditions. Cluster membership scores did not predict treatment response to CBIT or tic severity reductions. Tic symptoms distinctly cluster with few differences across youth and adults, or coexisting conditions. This study, which is the first to examine tic clusters in relation to treatment, suggested that tic symptom profiles respond equally well to CBIT. PMID:24144615
Ren, Hongyan; Tang, Ping; Zhao, Qinghua; Ren, Guosheng
2017-08-23
To identify symptom distress and clusters in patients 3 months after radical cystectomy and to explore their potential predictors. A cross-sectional design was used to investigate 99 bladder cancer patients 3 months after radical cystectomy. Data were collected by demographic and disease characteristic questionnaires, the symptom experience scale of the M.D. Anderson symptom inventory, two additional symptoms specific to radical cystectomy, and the functional assessment of cancer therapy questionnaire. A factor analysis, stepwise regression, and correlation analysis were applied. Three symptom clusters were identified: fatigue-malaise, gastrointestinal, and psycho-urinary. Age, complication severity, albumin post-surgery (negative), orthotopic neobladder reconstruction, adjuvant chemotherapy and American Society of Anesthesiologists (ASA) scores were significant predictors of fatigue-malaise. Adjuvant chemotherapy, orthotopic neobladder reconstruction, female gender, ASA scores and albumin (negative) were significant predictors of gastrointestinal symptoms. Being unmarried, having a higher educational level and complication severity were significant predictors of psycho-urinary symptoms. The correlations between clusters and for each cluster with quality of life were significant, with the highest correlation observed between the psycho-urinary cluster and quality of life. Bladder cancer patients experience concurrent symptoms that appear to cluster and are significantly correlated with quality of life. Moreover, symptom clusters may be predicted by certain demographic and clinical characteristics.
Spatial pattern recognition of seismic events in South West Colombia
NASA Astrophysics Data System (ADS)
Benítez, Hernán D.; Flórez, Juan F.; Duque, Diana P.; Benavides, Alberto; Lucía Baquero, Olga; Quintero, Jiber
2013-09-01
Recognition of seismogenic zones in geographical regions supports seismic hazard studies. This recognition is usually based on visual, qualitative and subjective analysis of data. Spatial pattern recognition provides a well-founded means to obtain relevant information from large amounts of data. The purpose of this work is to identify and classify spatial patterns in instrumental data of the South West Colombian seismic database. In this research, clustering tendency analysis validates whether the seismic database possesses a clustering structure. A non-supervised fuzzy clustering algorithm creates groups of seismic events. Given the sensitivity of fuzzy clustering algorithms to initial centroid positions, we propose a methodology to initialize centroids that generates stable partitions with respect to centroid initialization. As a result of this work, a public software tool provides the user with the routines developed for the clustering methodology. The analysis of the seismogenic zones obtained reveals meaningful spatial patterns in South-West Colombia. The clustering analysis provides a quantitative location and dispersion of seismogenic zones that facilitates seismological interpretations of seismic activities in South West Colombia.
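A minimal sketch of fuzzy c-means with a deterministic seeding step shows why initialization affects the stability of the partition. The abstract does not describe the authors' initialization methodology, so farthest-point seeding is used here only as a stand-in; the function names and toy coordinates are assumptions.

```python
import math

def farthest_point_init(points, c):
    # Deterministic seeding (a stand-in, not the authors' method): start from the
    # first point, then repeatedly add the point farthest from the chosen seeds.
    seeds = [points[0]]
    while len(seeds) < c:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return [list(s) for s in seeds]

def memberships(points, centroids, m):
    # Standard fuzzy c-means membership: u_ik is proportional to d_ik^(-2/(m-1)).
    rows = []
    for p in points:
        inv = [max(math.dist(p, v), 1e-12) ** (-2.0 / (m - 1.0)) for v in centroids]
        total = sum(inv)
        rows.append([x / total for x in inv])
    return rows

def fuzzy_c_means(points, c, m=2.0, iterations=100, tol=1e-9):
    centroids = farthest_point_init(points, c)
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships(points, centroids, m)
        new = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            total = sum(w)
            new.append([sum(wi * p[j] for wi, p in zip(w, points)) / total
                        for j in range(dims)])
        shift = max(math.dist(a, b) for a, b in zip(centroids, new))
        centroids = new
        if shift < tol:
            break
    return memberships(points, centroids, m), centroids

# Hypothetical epicentre coordinates forming two groups.
events = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
u, centres = fuzzy_c_means(events, c=2)
hard_labels = [row.index(max(row)) for row in u]
```

Because the seeding is deterministic, repeated runs on the same events return the same partition, which is the property the abstract attributes to its initialization methodology.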
Chin, John J; Kim, Anna J; Takahashi, Lois; Wiebe, Douglas J
2015-01-01
Social determinants of health may be substantially affected by spatial factors, which together may explain the persistence of health inequities. Clustering of possible sources of negative health and social outcomes points to a spatial focus for future interventions. We analyzed the spatial clustering of sex work businesses in Southern California to examine where and why they cluster. We explored economic and legal factors as possible explanations of clustering. We manually coded data from a website used by paying members to post reviews of female massage parlor workers. We identified clusters of sexually oriented massage parlor businesses using spatial autocorrelation tests. We conducted spatial regression using census tract data to identify predictors of clustering. A total of 889 venues were identified. Clusters of tracts having higher-than-expected numbers of sexually oriented massage parlors ("hot spots") were located outside downtowns. These hot spots were characterized by a higher proportion of adult males, a higher proportion of households below the federal poverty level, and a smaller average household size. Sexually oriented massage parlors in Los Angeles and Orange counties cluster in particular neighborhoods. More research is needed to ascertain the causal factors of such clusters and how interventions can be designed to leverage these spatial factors.
Maternal Styles of Talking about Child Feeding across Sociodemographic Groups
Pesch, Megan H.; Harrell, Kristina J.; Kaciroti, Niko; Rosenblum, Kate; Lumeng, Julie C.
2011-01-01
This study sought to identify maternal styles of talking about child feeding from a semi-structured interview and to evaluate associated maternal and child characteristics. Mothers of preschool-aged children (n = 133) of diverse race/ethnicity and socioeconomic status (SES) (45 lower SES black, 29 lower SES white, 32 lower SES Hispanic, 15 middle to upper SES white, 12 middle to upper SES Asian) participated in a semi-structured interview about feeding. Interviews were audio-taped and transcribed. Themes were identified, and individual interviews were coded within these themes: authority (high/low), confidence (confident/conflicted/unopinionated), and investment (deep/mild/removed). Demographic characteristics were collected and a subset of children had measured weights and heights. Cluster analysis was used to identify narrative styles. Participant characteristics were compared across clusters using Fisher’s exact test and analysis of variance. Six narrative styles were identified: Easy-Going, Practical No-Nonsense, Disengaged, Effortful No-Nonsense, Indulgent Worry, and Conflicted Control. Cluster membership differed significantly based on maternal demographic group (P < .001) and child weight status (P < .05). More than half (60%) of children of mothers in the Conflicted Control cluster were obese. Maternal styles of talking about feeding are associated with maternal and child characteristics. PMID:22117662
NASA Astrophysics Data System (ADS)
Chuan, Zun Liang; Ismail, Noriszura; Shinyie, Wendy Ling; Lit Ken, Tan; Fam, Soo-Fen; Senawi, Azlyna; Yusoff, Wan Nur Syahidah Wan
2018-04-01
Due to the limited availability of historical precipitation records, agglomerative hierarchical clustering algorithms are widely used to extrapolate information from gauged to ungauged precipitation catchments, yielding more reliable projections of extreme hydro-meteorological events such as extreme precipitation events. However, accurately identifying the optimum number of homogeneous precipitation catchments from the dendrogram produced by agglomerative hierarchical algorithms is very subjective. The main objective of this study is to propose an efficient regionalized algorithm to identify the homogeneous precipitation catchments for non-stationary precipitation time series. The homogeneous precipitation catchments are identified using an average linkage hierarchical clustering algorithm combined with multi-scale bootstrap resampling, with the uncentered correlation coefficient as the similarity measure. The regionalized homogeneous precipitation is consolidated using the K-sample Anderson-Darling non-parametric test. The results show that the proposed regionalized algorithm performs better than the agglomerative hierarchical clustering algorithms proposed in previous studies.
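The clustering core named in the abstract, average linkage with an uncentered correlation similarity, can be sketched as follows. The multi-scale bootstrap resampling and the Anderson-Darling consolidation step are left out, and the function names, the fixed target number of clusters, and the toy series are assumptions for illustration.

```python
import math
from itertools import combinations

def uncentered_distance(x, y):
    # 1 minus the uncentered correlation (cosine of the angle between the series).
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return 1.0 - dot / norm

def average_linkage(series, n_clusters):
    """Naive agglomerative clustering; merges until n_clusters groups remain."""
    clusters = [[i] for i in range(len(series))]
    d = {(i, j): uncentered_distance(series[i], series[j])
         for i, j in combinations(range(len(series)), 2)}

    def linkage(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(d[min(i, j), max(i, j)] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # safe because a < b
    return sorted(clusters)

# Hypothetical precipitation series for four stations.
stations = [
    [1, 2, 3, 4],
    [2, 4, 6, 8.1],
    [4, 3, 2, 1],
    [8, 6, 4, 2.1],
]
regions = average_linkage(stations, n_clusters=2)
```

Because the uncentered correlation ignores overall magnitude, stations 0 and 1 group together even though one records about twice as much precipitation as the other.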
Health and disease phenotyping in old age using a cluster network analysis.
Valenzuela, Jesus Felix; Monterola, Christopher; Tong, Victor Joo Chuan; Ng, Tze Pin; Larbi, Anis
2017-11-15
Human ageing is a complex trait that involves the synergistic action of numerous biological processes that interact to form a complex network. Here we performed a network analysis to examine the interrelationships between physiological and psychological functions, disease, disability, quality of life, lifestyle and behavioural risk factors for ageing in a cohort of 3,270 subjects aged ≥55 years. We considered associations between numerical and categorical descriptors using effect-size measures for each variable pair and identified clusters of variables from the resulting pairwise effect-size network and minimum spanning tree. We show, by way of a correspondence analysis between the two sets of clusters, that they correspond to coarse-grained and fine-grained structure of the network relationships. The clusters obtained from the minimum spanning tree mapped to various conceptual domains and corresponded to physiological and syndromic states. Hierarchical ordering of these clusters identified six common themes based on interactions with physiological systems and common underlying substrates of age-associated morbidity and disease chronicity, functional disability, and quality of life. These findings provide a starting point for in-depth analyses of ageing that incorporate immunologic, metabolomic and proteomic biomarkers, and ultimately offer low-level-based typologies of healthy and unhealthy ageing.
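One common way to obtain clusters from a minimum spanning tree is to build the tree with Kruskal's algorithm and then remove its longest edges. Whether the authors extracted clusters this way is not stated in the abstract, so the cutting rule, the use of a distance such as one minus effect size, and all names below are assumptions.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True

def minimum_spanning_tree(n, edges):
    # Kruskal's algorithm; edges are (distance, u, v). Result is sorted by distance.
    ds = DisjointSet(n)
    return [(w, u, v) for w, u, v in sorted(edges) if ds.union(u, v)]

def mst_clusters(n, edges, n_clusters):
    tree = minimum_spanning_tree(n, edges)
    # Drop the (n_clusters - 1) longest tree edges; the remaining components are clusters.
    kept = tree[: max(len(tree) - (n_clusters - 1), 0)]
    ds = DisjointSet(n)
    for _, u, v in kept:
        ds.union(u, v)
    groups = {}
    for i in range(n):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical distances between five variables (small = strongly associated).
pairs = [(0.1, 0, 1), (0.2, 1, 2), (0.7, 0, 2),
         (0.15, 3, 4), (0.9, 2, 3), (0.95, 0, 4)]
variable_clusters = mst_clusters(5, pairs, n_clusters=2)
# variable_clusters == [[0, 1, 2], [3, 4]]
```

Cutting more edges gives a finer partition, which loosely parallels the coarse-grained and fine-grained structure the abstract describes.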
Stathakis, D. G.; Pentz, E. S.; Freeman, M. E.; Kullman, J.; Hankins, G. R.; Pearlson, N. J.; Wright, TRF.
1995-01-01
We report the complete molecular organization of the Dopa decarboxylase gene cluster. Mutagenesis screens recovered 77 new Df(2L)TW130 recessive lethal mutations. These new alleles combined with 263 previously isolated mutations in the cluster to define 18 essential genes. In addition, seven new deficiencies were isolated and characterized. Deficiency mapping, restriction fragment length polymorphism (RFLP) analysis and P-element-mediated germline transformation experiments determined the gene order for all 18 loci. Genomic and cDNA restriction endonuclease mapping, Northern blot analysis and DNA sequencing provided information on exact gene location, mRNA size and transcriptional direction for most of these loci. In addition, this analysis identified two transcription units that had not previously been identified by extensive mutagenesis screening. Most of the loci are contained within two dense subclusters. We discuss the effectiveness of mutagens and strategies used in our screens, the variable mutability of loci within the genome of Drosophila melanogaster, the cytological and molecular organization of the Ddc gene cluster, the validity of the one band-one gene hypothesis and a possible purpose for the clustering of genes in the Ddc region. PMID:8647399
Predicting the points of interaction of small molecules in the NF-κB pathway
2011-01-01
Background The similarity property principle has been used extensively in drug discovery to identify small compounds that interact with specific drug targets. Here we show it can be applied to identify the interactions of small molecules within the NF-κB signalling pathway. Results Clusters that contain compounds with a predominant interaction within the pathway were created, which were then used to predict the interaction of compounds not included in the clustering analysis. Conclusions The technique successfully predicted the points of interactions of compounds that are known to interact with the NF-κB pathway. The method was also shown to be successful when compounds for which the interaction points were unknown were included in the clustering analysis. PMID:21342508
USDA-ARS's Scientific Manuscript database
The mechanisms, as well as the genetics, underlying the bioavailability and metabolism of carotenoids in humans remain unclear. The individual temporal response of plasma carotenoids was analyzed in adults who consumed carotenoid-containing juices on a controlled-diet study using cluster analysis. Treatmen...
Cluster Analysis of Assessment in Anatomy and Physiology for Health Science Undergraduates
ERIC Educational Resources Information Center
Brown, Stephen; White, Sue; Power, Nicola
2016-01-01
Academic content common to health science programs is often taught to a mixed group of students; however, content assessment may be consistent for each discipline. This study used a retrospective cluster analysis on such a group, first to identify high and low achieving students, and second, to determine the distribution of students within…
Lipoprotein lipase S447X variant associated with VLDL, LDL and HDL diameter clustering in the MetS
USDA-ARS's Scientific Manuscript database
Previous analysis clustered 1,238 individuals from the general population Genetics of Lipid Lowering Drugs Network (GOLDN) study by the size of their fasting very low-density, low-density and high-density lipoproteins (VLDL, LDL, HDL) using latent class analysis. From two of the eight identified gro...
Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska
NASA Technical Reports Server (NTRS)
Lent, P. C. (Principal Investigator)
1976-01-01
The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.
ERIC Educational Resources Information Center
Cuccaro, Michael L.; Tuchman, Roberto F.; Hamilton, Kara L.; Wright, Harry H.; Abramson, Ruth K.; Haines, Jonathan L.; Gilbert, John R.; Pericak-Vance, Margaret
2012-01-01
Epilepsy co-occurs frequently in autism spectrum disorders (ASD). Understanding this co-occurrence requires a better understanding of the ASD-epilepsy phenotype (or phenotypes). To address this, we conducted latent class cluster analysis (LCCA) on an ASD dataset (N = 577) which included 64 individuals with epilepsy. We identified a 5-cluster…
Lewis, Daniel R.; Olex, Amy L.; Lundy, Stacey R.; Turkett, William H.; Fetrow, Jacquelyn S.; Muday, Gloria K.
2013-01-01
To identify gene products that participate in auxin-dependent lateral root formation, a high temporal resolution, genome-wide transcript abundance analysis was performed with auxin-treated Arabidopsis thaliana roots. Data analysis identified 1246 transcripts that were consistently regulated by indole-3-acetic acid (IAA), partitioning into 60 clusters with distinct response kinetics. We identified rapidly induced clusters containing auxin-response functional annotations and clusters exhibiting delayed induction linked to cell division temporally correlated with lateral root induction. Several clusters were enriched with genes encoding proteins involved in cell wall modification, opening the possibility for understanding mechanistic details of cell structural changes that result in root formation following auxin treatment. Mutants with insertions in 72 genes annotated with a cell wall remodeling function were examined for alterations in IAA-regulated root growth and development. This reverse-genetic screen yielded eight mutants with root phenotypes. Detailed characterization of seedlings with mutations in CELLULASE3/GLYCOSYLHYDROLASE9B3 and LEUCINE RICH EXTENSIN2, genes not normally linked to auxin response, revealed defects in the early and late stages of lateral root development, respectively. The genes identified here using kinetic insight into expression changes lay the foundation for mechanistic understanding of auxin-mediated cell wall remodeling as an essential feature of lateral root development. PMID:24045021
Sensory Clusters of Adults with and without Autism Spectrum Conditions
ERIC Educational Resources Information Center
Elwin, Marie; Schröder, Agneta; Ek, Lena; Wallsten, Tuula; Kjellin, Lars
2017-01-01
We identified clusters of atypical sensory functioning in adults with ASC by hierarchical cluster analysis. A new scale for commonly self-reported sensory reactivity was used as a measure. In a low frequency group (n = 37), all subscale scores were relatively low, in particular atypical sensory/motor reactivity. In the intermediate group (n = 17)…
An Empirical Taxonomy of Youths' Fears: Cluster Analysis of the American Fear Survey Schedule
ERIC Educational Resources Information Center
Burnham, Joy J.; Schaefer, Barbara A.; Giesen, Judy
2006-01-01
Fear profiles among children and adolescents were explored using the Fear Survey Schedule for Children-American version (FSSC-AM; J.J. Burnham, 1995, 2005). Eight cluster profiles were identified via multistage Euclidean grouping and supported by homogeneity coefficients and replication. Four clusters reflected overall level of fears (i.e., very…
Borri, Marco; Schmidt, Maria A.; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M.; Partridge, Mike; Bhide, Shreerang A.; Nutting, Christopher M.; Harrington, Kevin J.; Newbold, Katie L.; Leach, Martin O.
2015-01-01
Purpose To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. Material and Methods The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. Results The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. Conclusion The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes. PMID:26398888
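The partition-and-validate step described above (k = 2, 3 or 4, with cluster validation choosing k) can be sketched as follows. This is an illustrative sketch only: the abstract does not name the clustering algorithm or the validation index, so k-means and the mean silhouette width are assumptions, as are the farthest-point initialisation and the toy two-feature voxel values.

```python
from math import dist

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point initialisation
    (an assumption made here for reproducibility)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical voxels: (perfusion-like value, diffusion-like value)
voxels = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0),
          (0, 10), (0, 11), (1, 10), (10, 10), (10, 11), (11, 10)]
scores = {k: mean_silhouette(voxels, kmeans(voxels, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

Any other internal validation index could be substituted for the silhouette; the point is only that k is chosen from the data rather than fixed in advance.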
Comparison of Salmonella enteritidis phage types isolated from layers and humans in Belgium in 2005.
Welby, Sarah; Imberechts, Hein; Riocreux, Flavien; Bertrand, Sophie; Dierick, Katelijne; Wildemauwe, Christa; Hooyberghs, Jozef; Van der Stede, Yves
2011-08-01
The aim of this study was to investigate the available results for Belgium of the European Union coordinated monitoring program (2004/665 EC) on Salmonella in layers in 2005, as well as the results of the monthly outbreak reports of Salmonella Enteritidis in humans in 2005, to identify a possible statistically significant trend in both populations. Separate descriptive statistics and univariate analyses were carried out, and parametric and/or non-parametric hypothesis tests were conducted. A time cluster analysis was performed for all Salmonella Enteritidis phage types (PTs) isolated. The proportions of each Salmonella Enteritidis PT in layers and in humans were compared, and the monthly distribution of the most common PT isolated in both populations was evaluated. The time cluster analysis revealed significant clusters during the months May and June for layers and May, July, August, and September for humans. PT21, the most frequently isolated PT in both populations in 2005, seemed to be responsible for these significant clusters. PT4 was the second most frequently isolated PT. No significant difference was found in the monthly trend of either PT between the two populations based on parametric and non-parametric methods. A similar monthly trend of PT distribution in humans and layers during the year 2005 was observed. The time cluster analysis and the statistical significance testing confirmed these results. Moreover, the time cluster analysis showed significant clusters during the summer, with the human clusters slightly delayed relative to those in layers. These results suggest a common link between the prevalence of Salmonella Enteritidis in layers and the occurrence of the pathogen in humans. Phage typing was confirmed to be a useful tool for identifying temporal trends.
A scoping review of spatial cluster analysis techniques for point-event data.
Fritz, Charles E; Schuurman, Nadine; Robertson, Colin; Lear, Scott
2013-05-01
Spatial cluster analysis is a uniquely interdisciplinary endeavour, and so it is important to communicate and disseminate ideas, innovations, best practices and challenges across practitioners, applied epidemiology researchers and spatial statisticians. In this research we conducted a scoping review to systematically search peer-reviewed journal databases for research that has employed spatial cluster analysis methods on individual-level, address location, or x and y coordinate derived data. To illustrate the thematic issues raised by our results, methods were tested using a dataset where known clusters existed. Point pattern methods, spatial clustering and cluster detection tests, and a locally weighted spatial regression model were most commonly used for individual-level, address location data (n = 29). The spatial scan statistic was the most popular method for address location data (n = 19). Six themes were identified relating to the application of spatial cluster analysis methods and subsequent analyses, which we recommend researchers to consider; exploratory analysis, visualization, spatial resolution, aetiology, scale and spatial weights. It is our intention that researchers seeking direction for using spatial cluster analysis methods, consider the caveats and strengths of each approach, but also explore the numerous other methods available for this type of analysis. Applied spatial epidemiology researchers and practitioners should give special consideration to applying multiple tests to a dataset. Future research should focus on developing frameworks for selecting appropriate methods and the corresponding spatial weighting schemes.
Intertumoral Heterogeneity within Medulloblastoma Subgroups.
Cavalli, Florence M G; Remke, Marc; Rampasek, Ladislav; Peacock, John; Shih, David J H; Luu, Betty; Garzia, Livia; Torchia, Jonathon; Nor, Carolina; Morrissy, A Sorana; Agnihotri, Sameer; Thompson, Yuan Yao; Kuzan-Fischer, Claudia M; Farooq, Hamza; Isaev, Keren; Daniels, Craig; Cho, Byung-Kyu; Kim, Seung-Ki; Wang, Kyu-Chang; Lee, Ji Yeoun; Grajkowska, Wieslawa A; Perek-Polnik, Marta; Vasiljevic, Alexandre; Faure-Conter, Cecile; Jouvet, Anne; Giannini, Caterina; Nageswara Rao, Amulya A; Li, Kay Ka Wai; Ng, Ho-Keung; Eberhart, Charles G; Pollack, Ian F; Hamilton, Ronald L; Gillespie, G Yancey; Olson, James M; Leary, Sarah; Weiss, William A; Lach, Boleslaw; Chambless, Lola B; Thompson, Reid C; Cooper, Michael K; Vibhakar, Rajeev; Hauser, Peter; van Veelen, Marie-Lise C; Kros, Johan M; French, Pim J; Ra, Young Shin; Kumabe, Toshihiro; López-Aguilar, Enrique; Zitterbart, Karel; Sterba, Jaroslav; Finocchiaro, Gaetano; Massimino, Maura; Van Meir, Erwin G; Osuka, Satoru; Shofuda, Tomoko; Klekner, Almos; Zollo, Massimo; Leonard, Jeffrey R; Rubin, Joshua B; Jabado, Nada; Albrecht, Steffen; Mora, Jaume; Van Meter, Timothy E; Jung, Shin; Moore, Andrew S; Hallahan, Andrew R; Chan, Jennifer A; Tirapelli, Daniela P C; Carlotti, Carlos G; Fouladi, Maryam; Pimentel, José; Faria, Claudia C; Saad, Ali G; Massimi, Luca; Liau, Linda M; Wheeler, Helen; Nakamura, Hideo; Elbabaa, Samer K; Perezpeña-Diazconti, Mario; Chico Ponce de León, Fernando; Robinson, Shenandoah; Zapotocky, Michal; Lassaletta, Alvaro; Huang, Annie; Hawkins, Cynthia E; Tabori, Uri; Bouffet, Eric; Bartels, Ute; Dirks, Peter B; Rutka, James T; Bader, Gary D; Reimand, Jüri; Goldenberg, Anna; Ramaswamy, Vijay; Taylor, Michael D
2017-06-12
While molecular subgrouping has revolutionized medulloblastoma classification, the extent of heterogeneity within subgroups is unknown. Similarity network fusion (SNF) applied to genome-wide DNA methylation and gene expression data across 763 primary samples identifies very homogeneous clusters of patients, supporting the presence of medulloblastoma subtypes. After integration of somatic copy-number alterations, and clinical features specific to each cluster, we identify 12 different subtypes of medulloblastoma. Integrative analysis using SNF further delineates group 3 from group 4 medulloblastoma, which is not as readily apparent through analyses of individual data types. Two clear subtypes of infants with Sonic Hedgehog medulloblastoma with disparate outcomes and biology are identified. Medulloblastoma subtypes identified through integrative clustering have important implications for stratification of future clinical trials. Copyright © 2017 Elsevier Inc. All rights reserved.
Analysis of ligand-protein exchange by Clustering of Ligand Diffusion Coefficient Pairs (CoLD-CoP)
NASA Astrophysics Data System (ADS)
Snyder, David A.; Chantova, Mihaela; Chaudhry, Saadia
2015-06-01
NMR spectroscopy is a powerful tool in describing protein structures and protein activity for pharmaceutical and biochemical development. This study describes a method to determine weak binding ligands in biological systems by using hierarchical diffusion coefficient clustering of multidimensional data obtained with a 400 MHz Bruker NMR. Comparison of DOSY spectra of ligands from the chemical library in the presence and absence of target proteins shows changes in translational diffusion rates of small molecules upon interaction with macromolecules. For weak binders such as compounds found in fragment libraries, changes in diffusion rates upon macromolecular binding are on the order of the precision of DOSY diffusion measurements, and identifying such subtle shifts in diffusion requires careful statistical analysis. The "CoLD-CoP" (Clustering of Ligand Diffusion Coefficient Pairs) method presented here uses SAHN clustering to identify protein-binders in a chemical library or even a not fully characterized metabolite mixture. We will show how DOSY NMR and the "CoLD-CoP" method complement each other in identifying the most suitable candidates for lysozyme and wheat germ acid phosphatase.
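SAHN (sequential, agglomerative, hierarchical, non-overlapping) clustering of diffusion coefficient pairs might look like the sketch below. The abstract does not specify the linkage rule, distance measure or stopping criterion, so average linkage, Euclidean distance and a fixed target of two clusters are all assumptions here, and the (D without protein, D with protein) values are invented for illustration.

```python
from math import dist

def sahn(points, n_clusters):
    """Average-linkage agglomerative clustering.
    Returns a list of clusters, each a list of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (x < y) and merge y into x.
        x, y = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]),
        )
        clusters[x] += clusters.pop(y)
    return clusters

# Hypothetical diffusion coefficient pairs, in arbitrary units:
# (ligand alone, ligand with protein). The last two slow down with protein.
pairs = [(5.0, 5.0), (5.2, 5.1), (4.9, 4.9), (5.1, 4.2), (5.0, 4.1)]
groups = sahn(pairs, 2)
print(groups)
```

In this toy data the ligands whose diffusion coefficient drops in the presence of protein separate from those whose coefficient is unchanged, which is the kind of grouping the method is described as looking for.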
Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat
2014-07-01
Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.
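The survival comparison above rests on the Kaplan-Meier estimator, which can be sketched in a few lines. This is a generic product-limit implementation under the usual convention that subjects censored at time t are still counted as at risk at t; the follow-up times below are invented and do not come from the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    events[i] is 1 for an observed event and 0 for a censored observation."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all observations sharing the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

# Hypothetical follow-up in years for one antibody cluster.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Running the estimator separately per cluster and reading off S(10) and S(20) would give the kind of 10- and 20-year comparison reported in the abstract.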
Ghazizadeh, Mahtab; McDonald, Anthony D; Lee, John D
2014-09-01
This study applies text mining to extract clusters of vehicle problems and associated trends from free-response data in the National Highway Traffic Safety Administration's vehicle owner's complaint database. As the automotive industry adopts new technologies, it is important to systematically assess the effect of these changes on traffic safety. Driving simulators, naturalistic driving data, and crash databases all contribute to a better understanding of how drivers respond to changing vehicle technology, but other approaches, such as automated analysis of incident reports, are needed. Free-response data from incidents representing two severity levels (fatal incidents and incidents involving injury) were analyzed using a text mining approach: latent semantic analysis (LSA). LSA and hierarchical clustering identified clusters of complaints for each severity level, which were compared and analyzed across time. Cluster analysis identified eight clusters of fatal incidents and six clusters of incidents involving injury. Comparisons showed that although the airbag clusters across the two severity levels have the same most frequent terms, the circumstances around the incidents differ. The time trends show clear increases in complaints surrounding the Ford/Firestone tire recall and the Toyota unintended acceleration recall. Increases in complaints may be partially driven by these recall announcements and the associated media attention. Text mining can reveal useful information from free-response databases that would otherwise be prohibitively time-consuming and difficult to summarize manually. Text mining can extend human analysis capabilities for large free-response databases to support earlier detection of problems and more timely safety interventions.
Deckersbach, Thilo; Peters, Amy T.; Sylvia, Louisa G.; Gold, Alexandra K.; da Silva Magalhaes, Pedro Vieira; Henry, David B.; Frank, Ellen; Otto, Michael W.; Berk, Michael; Dougherty, Darin D.; Nierenberg, Andrew A.; Miklowitz, David J.
2016-01-01
Background We sought to address how predictors and moderators of psychotherapy for bipolar depression – identified individually in prior analyses – can inform the development of a metric for prospectively classifying treatment outcome in intensive psychotherapy (IP) versus collaborative care (CC) adjunctive to pharmacotherapy in the Systematic Treatment Enhancement Program (STEP-BD) study. Methods We conducted post-hoc analyses on 135 STEP-BD participants using cluster analysis to identify subsets of participants with similar clinical profiles and investigated this combined metric as a moderator and predictor of response to IP. We used agglomerative hierarchical cluster analyses and k-means clustering to determine the content of the clinical profiles. Logistic regression and Cox proportional hazard models were used to evaluate whether the resulting clusters predicted or moderated likelihood of recovery or time until recovery. Results The cluster analysis yielded a two-cluster solution: 1) “less-recurrent/severe” and 2) “chronic/recurrent.” Rates of recovery in IP were similar for less-recurrent/severe and chronic/recurrent participants. Less-recurrent/severe patients were more likely than chronic/recurrent patients to achieve recovery in CC (p = .040, OR = 4.56). IP yielded a faster recovery for chronic/recurrent participants, whereas CC led to recovery sooner in the less-recurrent/severe cluster (p = .034, OR = 2.62). Limitations Cluster analyses require list-wise deletion of cases with missing data so we were unable to conduct analyses on all STEP-BD participants. Conclusions A well-powered, parametric approach can distinguish patients based on illness history and provide clinicians with symptom profiles of patients that confer differential prognosis in CC vs. IP. PMID:27289316
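As a reminder of what the reported odds ratios express, the sketch below computes an odds ratio and a Woolf (log-scale) 95% interval from a 2x2 table. The study's estimates come from logistic regression and Cox models, not from a raw table, and the counts used here are hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d, z=1.96):
    """2x2 table: a, b = recovered / not recovered in group 1;
    c, d = recovered / not recovered in group 2.
    Returns (OR, lower, upper) with a Woolf confidence interval."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)

# Hypothetical counts: 20 of 30 recover in one cluster, 10 of 30 in the other.
estimate, lower, upper = odds_ratio(20, 10, 10, 20)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```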
Dietary BMAA exposure in an amyotrophic lateral sclerosis cluster from southern France.
Masseret, Estelle; Banack, Sandra; Boumédiène, Farid; Abadie, Eric; Brient, Luc; Pernet, Fabrice; Juntas-Morales, Raoul; Pageot, Nicolas; Metcalf, James; Cox, Paul; Camu, William
2013-01-01
Dietary exposure to the cyanotoxin BMAA is suspected to be the cause of amyotrophic lateral sclerosis in the Western Pacific Islands. In Europe and North America, this toxin has been identified in the marine environment of amyotrophic lateral sclerosis clusters but, to date, only a few dietary exposures have been described. We aimed to identify cluster(s) of amyotrophic lateral sclerosis in the Hérault district, a coastal district of Southern France, and to search, in the identified area(s), for a potential dietary source of BMAA. A spatio-temporal cluster analysis was performed in the district, considering all incident amyotrophic lateral sclerosis cases identified from 1994 to 2009 by our expert center. We investigated the cluster area with serial collections of oysters and mussels that were subsequently analyzed blind for BMAA concentrations. We found one significant amyotrophic lateral sclerosis cluster (p = 0.0024), surrounding the Thau lagoon, the most important area of shellfish production and consumption along the French Mediterranean coast. BMAA was identified in mussels (1.8 µg/g to 6.0 µg/g) and oysters (0.6 µg/g to 1.6 µg/g). The highest concentrations of BMAA were measured during summer when the highest picocyanobacteria abundances were recorded. While it is not possible to ascertain a direct link between shellfish consumption and the existence of this ALS cluster, these results add new data to the potential association of BMAA with sporadic amyotrophic lateral sclerosis, one of the most severe neurodegenerative disorders.
OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.
Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A
2014-10-16
The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P <0.0001) among the four clusters. The four clusters divided the sample into severity levels: Cluster 1 reflects the lowest average levels across all symptoms, and cluster 4 reflects the highest average levels. Clusters 2 and 3 capture moderate symptoms levels. Clusters 2 and 3 differed mainly in profiles of anxiety and depression, with Cluster 2 having lower levels of depression and anxiety than Cluster 3, despite higher levels of pain. The results of the cluster analysis of the external sample (n = 478) looked very similar to those found in the original cluster analysis, except for a slight difference in sleep problems. This was despite having patients in the validation sample who were significantly younger (P <0.0001) and had more severe symptoms (higher FIQ-R total scores (P = 0.0004)). 
In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.
Dennis, Ann M; Hué, Stephane; Learner, Emily; Sebastian, Joseph; Miller, William C; Eron, Joseph J
2017-01-01
HIV-1 diversity is increasing in North American and European cohorts which may have public health implications. However, little is known about non-B subtype diversity in the southern United States, despite the region being the epicenter of the nation's epidemic. We characterized HIV-1 diversity and transmission clusters to identify the extent to which non-B strains are transmitted locally. We conducted cross-sectional analyses of HIV-1 partial pol sequences collected from 1997 to 2014 from adults accessing routine clinical care in North Carolina (NC). Subtypes were evaluated using COMET and phylogenetic analysis. Putative transmission clusters were identified using maximum-likelihood trees. Clusters involving non-B strains were confirmed and their dates of origin were estimated using Bayesian phylogenetics. Data were combined with demographic information collected at the time of sample collection and country of origin for a subset of patients. Among 24,972 sequences from 15,246 persons, the non-B subtype prevalence increased from 0% to 3.46% over the study period. Of 325 persons with non-B subtypes, diversity was high with over 15 pure subtypes and recombinants; subtype C (28.9%) and CRF02_AG (24.0%) were most common. While identification of transmission clusters was lower for persons with non-B versus B subtypes, several local transmission clusters (≥3 persons) involving non-B subtypes were identified and all were presumably due to heterosexual transmission. Prevalence of non-B subtype diversity remains low in NC but a statistically significant rise was identified over time which likely reflects multiple importation. However, the combined phylogenetic clustering analysis reveals evidence for local onward transmission. Detection of these non-B clusters suggests heterosexual transmission and may guide diagnostic and prevention interventions.
Kim, Kwang Hyun; Yoon, Hyun Suk; Song, Wan; Choo, Hee Jung; Yoon, Hana; Chung, Woo Sik; Sim, Bong Suk; Lee, Dong Hyeon
2017-01-01
To classify patients with orthotopic neobladder based on urodynamic parameters using cluster analysis and to characterize the voiding function of each group. From January 2012 to November 2015, 142 patients with bladder cancer underwent radical cystectomy and Studer neobladder reconstruction at our institute. Of the 142 patients, 103 with complete urodynamic data and information on urinary functional outcomes were included in this study. K-means clustering was performed with urodynamic parameters which included maximal cystometric capacity, residual volume, maximal flow rate, compliance, and detrusor pressure at maximum flow rate. Three groups emerged by cluster analysis. Urodynamic parameters and urinary function outcomes were compared between three groups. Group 1 (n = 44) had ideal urodynamic parameters with a mean maximal bladder capacity of 513.3 ml and mean residual urine volume of 33.1 ml. Group 2 (n = 42) was characterized by small bladder capacity with low compliance. Patients in group 2 had higher rates of daytime incontinence and nighttime incontinence than patients in group 1. Group 3 (n = 17) was characterized by large residual urine volume with high compliance. When we examined gender differences in urodynamics and functional outcomes, residual urine volume and the rate of daytime incontinence were only marginally significant. However, females were significantly more likely to belong to group 2 or 3 (P = 0.003). In multivariate analysis to identify factors associated with group 1 which has the most ideal urodynamic pattern, age (OR 0.95, P = 0.017) and male gender (OR 7.57, P = 0.003) were identified as significant factors. While patients with ileal neobladder present with various voiding symptoms, three urodynamic patterns were identified by cluster analysis. Approximately half of patients had ideal urodynamic parameters. The other two groups were characterized by large residual urine and small capacity bladder with low compliance. 
Young age and male gender appear to have a favorable impact on urodynamic and voiding outcomes in patients undergoing orthotopic neobladder reconstruction.
Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing
2018-04-23
Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapies cannot treat ACPs effectively. In this paper, we aimed to identify key genes for ACP early diagnosis and treatment. Datasets GSE94349 and GSE68015 were obtained from the Gene Expression Omnibus database. Consensus clustering was applied to discover the gene clusters in the expression data of GSE94349, and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built by the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and the PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. In addition, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered in the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. The PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model for training, testing, and validation data were 94, 85, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, which was consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP.
The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis and support improvements in therapy.
Wang, Lili; Palmer, Andrew J; Cocker, Fiona; Sanderson, Kristy
2017-01-09
No universally accepted definition of multimorbidity (MM) exists, and implications of different definitions have not been explored. This study examined the performance of the count and cluster definitions of multimorbidity on the sociodemographic profile and health-related quality of life (HRQoL) in a general population. Data were derived from the nationally representative 2007 Australian National Survey of Mental Health and Wellbeing (n = 8841). The HRQoL scores were measured using the Assessment of Quality of Life (AQoL-4D) instrument. The simple count (2+ & 3+ conditions) and hierarchical cluster methods were used to define/identify clusters of multimorbidity. Linear regression was used to assess the associations between HRQoL and multimorbidity as defined by the different methods. Multimorbidity defined using the count method had a prevalence of 26% (MM2+) and 10.1% (MM3+). Statistically significant clusters identified through hierarchical cluster analysis included heart or circulatory conditions (CVD)/arthritis (cluster-1, 9%) and major depressive disorder (MDD)/anxiety (cluster-2, 4%). A sensitivity analysis suggested that the stability of the clusters resulted from hierarchical clustering. The sociodemographic profiles were similar between MM2+, MM3+ and cluster-1, but were different from cluster-2. HRQoL was negatively associated with MM2+ (β: -0.18, SE: -0.01, p < 0.001), MM3+ (β: -0.23, SE: -0.02, p < 0.001), cluster-1 (β: -0.10, SE: 0.01, p < 0.001) and cluster-2 (β: -0.36, SE: 0.01, p < 0.001). Our findings confirm the existence of an inverse relationship between multimorbidity and HRQoL in the Australian population and indicate that the hierarchical clustering approach is validated when the outcome of interest is HRQoL from this head-to-head comparison. Moreover, a simple count fails to identify whether specific conditions of interest are driving poorer HRQoL. 
Researchers should exercise caution when selecting a definition of multimorbidity because it may significantly influence the study outcomes.
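The count-based definitions used above (MM2+ and MM3+) reduce to a threshold on the number of distinct conditions per respondent. The sketch below assumes each respondent is represented as a set of condition labels; the four respondents are invented and the condition names are only examples.

```python
def mm_prevalence(respondents, threshold):
    """Share of respondents reporting at least `threshold` distinct conditions."""
    hits = sum(len(set(conditions)) >= threshold for conditions in respondents)
    return hits / len(respondents)

# Hypothetical survey records: one set of conditions per respondent.
respondents = [
    {"arthritis", "CVD"},
    {"MDD", "anxiety", "asthma"},
    {"diabetes"},
    set(),
]
mm2 = mm_prevalence(respondents, 2)  # two of four respondents -> 0.5
mm3 = mm_prevalence(respondents, 3)  # one of four respondents -> 0.25
print(mm2, mm3)
```

A count of this kind treats every condition as interchangeable, which is the limitation the abstract contrasts with the cluster-based definition.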
Lee, Tai-Fen; Du, Shin-Hei; Teng, Shih-Hua; Liao, Chun-Hsing; Sheng, Wang-Hui; Teng, Lee-Jene
2014-01-01
We evaluated whether the Bruker Biotyper matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) system provides accurate species-level identifications of 147 isolates of aerobically growing Gram-positive rods (GPRs). The bacterial isolates included Nocardia (n = 74), Listeria (n = 39), Kocuria (n = 15), Rhodococcus (n = 10), Gordonia (n = 7), and Tsukamurella (n = 2) species, which had all been identified by conventional methods, molecular methods, or both. In total, 89.7% of Listeria monocytogenes, 80% of Rhodococcus species, 26.7% of Kocuria species, and 14.9% of Nocardia species (n = 11, all N. nova and N. otitidiscaviarum) were correctly identified to the species level (score values, ≥2.0). A clustering analysis of spectra generated by the Bruker Biotyper identified six clusters of Nocardia species, i.e., cluster 1 (N. cyriacigeorgica), cluster 2 (N. brasiliensis), cluster 3 (N. farcinica), cluster 4 (N. puris), cluster 5 (N. asiatica), and cluster 6 (N. beijingensis), based on the six peaks generated by ClinProTools with the genetic algorithm, i.e., m/z 2,774.477 (cluster 1), m/z 5,389.792 (cluster 2), m/z 6,505.720 (cluster 3), m/z 5,428.795 (cluster 4), m/z 6,525.326 (cluster 5), and m/z 16,085.216 (cluster 6). Two clusters of L. monocytogenes spectra were also found according to the five peaks, i.e., m/z 5,594.85, m/z 6,184.39, and m/z 11,187.31, for cluster 1 (serotype 1/2a) and m/z 5,601.21 and m/z 11,199.33 for cluster 2 (serotypes 1/2b and 4b). The Bruker Biotyper system was unable to accurately identify Nocardia (except for N. nova and N. otitidiscaviarum), Tsukamurella, or Gordonia species. Continuous expansion of the MALDI-TOF MS databases to include more GPRs is necessary. PMID:24759706
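The six Nocardia marker peaks listed above lend themselves to a simple lookup. The sketch below matches observed peaks against those m/z values within a relative tolerance; the 500 ppm tolerance and the function name are assumptions for illustration, and this is not how ClinProTools or the Biotyper software assigns clusters.

```python
# Marker peaks (m/z) and species as reported in the abstract.
NOCARDIA_MARKERS = {
    2774.477: "N. cyriacigeorgica",
    5389.792: "N. brasiliensis",
    6505.720: "N. farcinica",
    5428.795: "N. puris",
    6525.326: "N. asiatica",
    16085.216: "N. beijingensis",
}

def match_markers(peaks, ppm=500.0):
    """Return the species whose marker peak lies within `ppm` of any
    observed peak. The tolerance is a hypothetical choice."""
    hits = set()
    for mz, species in NOCARDIA_MARKERS.items():
        tolerance = mz * ppm / 1e6
        if any(abs(p - mz) <= tolerance for p in peaks):
            hits.add(species)
    return hits

# Hypothetical peak list from one spectrum.
print(match_markers([2774.9, 4000.0]))
```

The tolerance has to stay well below the spacing of the closest markers (about 20 m/z units between clusters 3 and 5), otherwise a single peak could match two species.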
Moreira, Naiara Ferraz; da Veiga, Gloria Valeria; Santaliestra-Pasías, Alba María; Androutsos, Odysseas; Cuenca-García, Magdalena; de Oliveira, Alessandra Silva Dias; Pereira, Rosangela Alves; de Moraes, Anelise Bezerra de Vasconcelos; Van den Bussche, Karen; Censi, Laura; González-Gross, Marcela; Cañada, David; Gottrand, Frederic; Kafatos, Anthony; Marcos, Ascensión; Widhalm, Kurt; Mólnar, Dénes; Moreno, Luis Alberto
2018-01-01
The objective of this study was to identify clustering patterns of four energy balance-related behaviors (EBRB): television (TV) watching, moderate and vigorous physical activity (MVPA), consumption of fruits and vegetables (F&V), and consumption of sugar-sweetened beverages (SSB), among European and Brazilian adolescents. EBRB associations with different body fat composition indicators were then evaluated. Participants included adolescents from eight European countries in the HELENA (Healthy Lifestyle in Europe by Nutrition in Adolescents) study (n = 2,057, 53.8% female; age: 12.5-17.5 years) and from the metropolitan region of Rio de Janeiro/Brazil in the ELANA study (the Adolescent Nutritional Assessment Longitudinal Study) (n = 968, 53.2% female; age: 13.5-19 years). EBRB data allowed for sex- and study-specific clusters. Associations were estimated by ANOVA and odds ratios. Five clustering patterns were identified. Four similar clusters were identified for each sex and study. Among boys, the cluster that differed between studies was characterized by high F&V consumption in the HELENA study and by high TV watching and high MVPA time in the ELANA study. Among girls, the differing clusters were characterized by high F&V consumption in both studies and, additionally, high SSB consumption in the ELANA study. Regression analysis showed that clusters characterized by high SSB consumption in European boys; high TV watching, and high TV watching plus high MVPA in Brazilian boys; and high MVPA, and high SSB and F&V consumption in Brazilian girls, were positively associated with different body fat composition indicators. Common clusters were observed in adolescents from Europe and Brazil; however, no cluster was identified as being completely healthy or unhealthy. Each cluster seems to affect body composition indicators, depending on the group. Public health actions should aim to promote adequate practices of EBRB. Copyright © 2017. Published by Elsevier Ltd.
Off-road truck-related accidents in U.S. mines
Dindarloo, Saeid R.; Pollard, Jonisha P.; Siami-Irdemoosa, Elnaz
2016-01-01
Introduction Off-road trucks are one of the major sources of equipment-related accidents in the U.S. mining industries. A systematic analysis of all off-road truck-related accidents, injuries, and illnesses, which are reported and published by the Mine Safety and Health Administration (MSHA), is expected to provide practical insights for identifying the accident patterns and trends in the available raw database. Therefore, appropriate safety management measures can be administered and implemented based on these accident patterns/trends. Methods A hybrid clustering-classification methodology using K-means clustering and gene expression programming (GEP) is proposed for the analysis of severe and non-severe off-road truck-related injuries at U.S. mines. Using the GEP sub-model, a small subset of the 36 recorded attributes was found to be correlated to the severity level. Results Given the set of specified attributes, the clustering sub-model was able to cluster the accident records into 5 distinct groups. For instance, the first cluster contained accidents related to minerals processing mills and coal preparation plants (91%). More than two-thirds of the victims in this cluster had less than 5 years of job experience. This cluster was associated with the highest percentage of severe injuries (22 severe accidents, 3.4%). Almost 50% of all accidents in this cluster occurred at stone operations. Similarly, the other four clusters were characterized to highlight important patterns that can be used to determine areas of focus for safety initiatives. Conclusions The identified clusters of accidents may play a vital role in the prevention of severe injuries in mining. Further research into the cluster attributes and identified patterns will be necessary to determine how these factors can be mitigated to reduce the risk of severe injuries. 
Practical application Analyzing injury data using data mining techniques provides some insight into attributes that are associated with high accuracies for predicting injury severity. PMID:27620937
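The K-means step of the hybrid method described above can be illustrated with a minimal sketch. The records, the two numeric attributes, and the value of k below are hypothetical; the attributes and preprocessing actually used in the study are not given in the abstract, and the GEP classification step is not shown.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct records
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical records: (years of job experience, days lost to injury)
records = [(1, 2), (2, 3), (1, 4), (20, 40), (22, 38), (21, 41)]
labels, centroids = kmeans(records, k=2)
print(labels)
```

In practice the attributes would need to be scaled and categorical fields encoded before distances are meaningful; that step is omitted here.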
Off-road truck-related accidents in U.S. mines.
Dindarloo, Saeid R; Pollard, Jonisha P; Siami-Irdemoosa, Elnaz
2016-09-01
Off-road trucks are one of the major sources of equipment-related accidents in the U.S. mining industries. A systematic analysis of all off-road truck-related accidents, injuries, and illnesses, which are reported and published by the Mine Safety and Health Administration (MSHA), is expected to provide practical insights for identifying the accident patterns and trends in the available raw database. Therefore, appropriate safety management measures can be administered and implemented based on these accident patterns/trends. A hybrid clustering-classification methodology using K-means clustering and gene expression programming (GEP) is proposed for the analysis of severe and non-severe off-road truck-related injuries at U.S. mines. Using the GEP sub-model, a small subset of the 36 recorded attributes was found to be correlated to the severity level. Given the set of specified attributes, the clustering sub-model was able to cluster the accident records into 5 distinct groups. For instance, the first cluster contained accidents related to minerals processing mills and coal preparation plants (91%). More than two-thirds of the victims in this cluster had less than 5 years of job experience. This cluster was associated with the highest percentage of severe injuries (22 severe accidents, 3.4%). Almost 50% of all accidents in this cluster occurred at stone operations. Similarly, the other four clusters were characterized to highlight important patterns that can be used to determine areas of focus for safety initiatives. The identified clusters of accidents may play a vital role in the prevention of severe injuries in mining. Further research into the cluster attributes and identified patterns will be necessary to determine how these factors can be mitigated to reduce the risk of severe injuries. Analyzing injury data using data mining techniques provides some insight into attributes that are associated with high accuracies for predicting injury severity.
Copyright © 2016 Elsevier Ltd and National Safety Council. All rights reserved.
Clermont, Gilles; Chen, Lujie; Dubrawski, Artur W.; Ren, Dianxu; Hoffman, Leslie A.; Pinsky, Michael R.; Hravnak, Marilyn
2018-01-01
Cardiorespiratory instability (CRI) in monitored step-down unit (SDU) patients has a variety of etiologies, and likely manifests in patterns of vital signs (VS) changes. We explored use of clustering techniques to identify patterns in the initial CRI epoch (CRI1; first exceedances of VS beyond stability thresholds after SDU admission) of unstable patients, and inter-cluster differences in admission characteristics and outcomes. Continuous noninvasive monitoring of heart rate (HR), respiratory rate (RR), and pulse oximetry (SpO2) were sampled at 1/20 Hz. We identified CRI1 in 165 patients, employed hierarchical and k-means clustering, tested several clustering solutions, used 10-fold cross validation to establish the best solution and assessed inter-cluster differences in admission characteristics and outcomes. Three clusters (C) were derived: C1) normal/high HR and RR, normal SpO2 (n = 30); C2) normal HR and RR, low SpO2 (n = 103); and C3) low/normal HR, low RR and normal SpO2 (n = 32). Clusters were significantly different based on age (p < 0.001; older patients in C2), number of comorbidities (p = 0.008; more C2 patients had ≥ 2) and hospital length of stay (p = 0.006; C1 patients stayed longer). There were no between-cluster differences in SDU length of stay, or mortality. Three different clusters of VS presentations for CRI1 were identified. Clusters varied on age, number of comorbidities and hospital length of stay. Future study is needed to determine if there are common physiologic underpinnings of VS clusters which might inform clinical decision-making when CRI first manifests. PMID:28229353
Kang, Kiwon; Sung, Joohon; Kim, Chang Yup
2010-01-01
We investigated the clustering of selected lifestyle factors (cigarette smoking, heavy alcohol consumption, lack of physical exercise) and identified the population characteristics associated with increasing lifestyle risks. Data on lifestyle risk factors, sociodemographic characteristics, and history of chronic diseases were obtained from 7,694 individuals ≥20 years of age who participated in the 2005 Korea National Health and Nutrition Examination Survey (KNHANES). Clustering of lifestyle risks involved the observed prevalence of multiple risks and those expected from marginal exposure prevalence of the three selected risk factors. Prevalence odds ratio was adopted as a measurement of clustering. Multiple correspondence analysis, Kendall tau correlation, Mann-Whitney analysis, and ordinal logistic regression analysis were conducted to identify variables increasing lifestyle risks. In both men and women, increased lifestyle risks were associated with clustering of: (1) cigarette smoking and excessive alcohol consumption, and (2) smoking, excessive alcohol consumption, and lack of physical exercise. Patterns of clustering for physical exercise were different from those for cigarette smoking and alcohol consumption. The increased unhealthy clustering was found among men 20-64 years of age with mild or moderate stress, and among women 35-49 years of age who were never-married, with mild stress, and increased body mass index (>30 kg/m²). Addressing a lack of physical exercise considering individual characteristics including gender, age, employment activity, and stress levels should be a focus of health promotion efforts.
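The comparison of observed and expected prevalence mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the data are hypothetical, the expected prevalence assumes the three risk factors are independent, and the summary shown is a simple observed/expected ratio rather than the prevalence odds ratio used in the study.

```python
from math import prod

def observed_expected(rows):
    """rows: list of (smoking, heavy_alcohol, no_exercise) tuples of 0/1 flags."""
    n = len(rows)
    n_factors = len(rows[0])
    # Marginal prevalence of each risk factor taken on its own.
    marginals = [sum(r[j] for r in rows) / n for j in range(n_factors)]
    # Observed prevalence of carrying all risk factors at once.
    observed = sum(1 for r in rows if all(r)) / n
    # Prevalence expected if the factors occurred independently.
    expected = prod(marginals)
    return observed, expected, observed / expected

# Hypothetical sample of 10 individuals
rows = ([(1, 1, 1)] * 3 + [(1, 0, 0)] + [(0, 1, 0)]
        + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 3)
observed, expected, ratio = observed_expected(rows)
print(observed, expected, ratio)
```

A ratio above 1 indicates that the combination occurs more often than independence would predict, which is the sense in which the risks "cluster".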
Ten-year results of a ponderosa pine progeny test in the Black Hills
Wayne D. Shepperd; Sue E. McElderry
1986-01-01
Ten-year survival and growth of seedlings from 77 parent trees from throughout the Black Hills were compared, using a cluster-analysis technique. Five clusters were identified that account for most of the variability in survival and growth of the open-pollinated families. One cluster, containing 6 families, exhibited exceptional survival and growth. Another, containing...
Identifying and Assessing Interesting Subgroups in a Heterogeneous Population.
Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi
2015-01-01
Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability--the basis of cluster generation--is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
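For readers unfamiliar with FDR control, a minimal sketch of the standard Benjamini-Hochberg step-up procedure is given below. It is a stand-in only: the abstract does not specify the improved FDR estimator, so this is not the authors' method, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values for five candidate genes within one subgroup
p = [0.001, 0.008, 0.039, 0.041, 0.60]
flags = benjamini_hochberg(p)
print(flags)
```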
Comparative analysis of prophages in Streptococcus mutans genomes
Fu, Tiwei; Fan, Xiangyu; Long, Quanxin; Deng, Wanyan; Song, Jinlin
2017-01-01
Prophages have been considered genetic units that have an intimate association with novel phenotypic properties of bacterial hosts, such as pathogenicity and genomic variation. Little is known about the genetic information of prophages in the genome of Streptococcus mutans, a major pathogen of human dental caries. In this study, we identified 35 prophage-like elements in S. mutans genomes and performed a comparative genomic analysis. Comparative genomic and phylogenetic analyses of prophage sequences revealed that the prophages could be classified into three main large clusters: Cluster A, Cluster B, and Cluster C. The S. mutans prophages in each cluster were compared. The genomic sequences of phismuN66-1, phismuNLML9-1, and phismu24-1 all shared similarities with the previously reported S. mutans phages M102, M102AD, and ϕAPCM01. The genomes were organized into seven major gene clusters according to the putative functions of the predicted open reading frames: packaging and structural modules, integrase, host lysis modules, DNA replication/recombination modules, transcriptional regulatory modules, other protein modules, and hypothetical protein modules. Moreover, an integrase gene was only identified in phismuNLML9-1 prophages. PMID:29158986
Mun, Eun-Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.
2010-01-01
Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of non-nested models using the Bayesian Information Criterion (BIC) to compare multiple models and identify the optimum number of clusters. The current study clustered 36 young men and women based on their baseline heart rate (HR) and HR variability (HRV), chronic alcohol use, and reasons for drinking. Two cluster groups were identified and labeled High Alcohol Risk and Normative groups. Compared to the Normative group, individuals in the High Alcohol Risk group had higher levels of alcohol use and more strongly endorsed disinhibition and suppression reasons for use. The High Alcohol Risk group showed significant HRV changes in response to positive and negative emotional and appetitive picture cues, compared to neutral cues. In contrast, the Normative group showed a significant HRV change only to negative cues. Findings suggest that the individuals with autonomic self-regulatory difficulties may be more susceptible to heavy alcohol use and use alcohol for emotional regulation. PMID:18331138
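The BIC comparison described above can be sketched numerically. The log-likelihood values below are hypothetical placeholders, two input dimensions are assumed for simplicity, and the "lower is better" sign convention is used; some software reports BIC with the opposite sign.

```python
from math import log

def gaussian_mixture_param_count(n_components, n_dims):
    """Free parameters of a mixture with unrestricted covariance matrices."""
    weights = n_components - 1                       # mixing proportions sum to 1
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances

def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 ln L + k ln n (lower values indicate a preferred model)."""
    return -2.0 * log_likelihood + n_params * log(n_obs)

n_obs = 36  # sample size reported in the abstract
# Hypothetical maximized log-likelihoods for 1- and 2-component models
candidates = {1: -150.0, 2: -130.0}
scores = {g: bic(ll, gaussian_mixture_param_count(g, 2), n_obs)
          for g, ll in candidates.items()}
best_g = min(scores, key=scores.get)
print(scores, best_g)
```

The penalty term grows with the number of parameters, so an extra component is accepted only if it improves the likelihood enough to offset it.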
Lee, Yii-Ching; Huang, Shian-Chang; Huang, Chih-Hsuan; Wu, Hsin-Hung
2016-01-01
This study uses kernel k-means cluster analysis to identify medical staff with high burnout. The data, collected from October to November 2014, are from the emotional exhaustion dimension of the Chinese version of the Safety Attitudes Questionnaire in a regional teaching hospital in Taiwan. A total of 680 valid questionnaires were obtained from staff including physicians, nurses, technicians, pharmacists, medical administrators, and respiratory therapists. The results show that 8 clusters are generated by the kernel k-means method. Employees in clusters 1, 4, and 5 are in relatively good condition, whereas employees in clusters 2, 3, 6, 7, and 8 need to be closely monitored because they have a relatively higher degree of burnout. When employees with a higher degree of burnout are identified, the hospital management can take actions to improve resilience, reduce potential medical errors, and, eventually, enhance patient safety. This study also suggests that the hospital management needs to keep track of medical staff's fatigue conditions and provide timely assistance for burnout recovery through employee assistance programs, mindfulness-based stress reduction programs, positivity currency buildup, and forming appreciative inquiry groups. © The Author(s) 2016.
Molecular Predictors of 3D Morphogenesis by Breast Cancer Cell Lines in 3D Culture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, Ju; Chang, Hang; Giricz, Orsi
Correlative analysis of molecular markers with phenotypic signatures is the simplest model for hypothesis generation. In this paper, a panel of 24 breast cell lines was grown in 3D culture, their morphology was imaged through phase contrast microscopy, and computational methods were developed to segment and represent each colony at multiple dimensions. Subsequently, subpopulations from these morphological responses were identified through consensus clustering to reveal three clusters of round, grape-like, and stellate phenotypes. In some cases, cell lines with particular pathobiological phenotypes clustered together (e.g., ERBB2 amplified cell lines sharing the same morphometric properties as the grape-like phenotype). Next, associations with molecular features were realized through (i) differential analysis within each morphological cluster, and (ii) regression analysis across the entire panel of cell lines. In both cases, the dominant genes that are predictive of the morphological signatures were identified. Specifically, PPARγ has been associated with the invasive stellate morphological phenotype, which corresponds to triple-negative pathobiology. PPARγ has been validated through two supporting biological assays.
Using sperm morphometry and multivariate analysis to differentiate species of gray Mazama
Duarte, José Maurício Barbanti
2016-01-01
There is genetic evidence that the two species of Brazilian gray Mazama, Mazama gouazoubira and Mazama nemorivaga, belong to different genera. This study identified significant differences that separated them into distinct groups, based on characteristics of the spermatozoa and ejaculate of both species. The characteristics that most clearly differentiated between the species were ejaculate colour, white for M. gouazoubira and reddish for M. nemorivaga, and sperm head dimensions. Multivariate analysis of sperm head dimension and format data accurately discriminated three groups for species with total percentage of misclassified of 0.71. The individual analysis, by animal, and the multivariate analysis have also discriminated correctly all five animals (total percentage of misclassified of 13.95%), and the canonical plot has shown three different clusters: Cluster 1, including individuals of M. nemorivaga; Cluster 2, including two individuals of M. gouazoubira; and Cluster 3, including a single individual of M. gouazoubira. The results obtained in this work corroborate the hypothesis of the formation of new genera and species for gray Mazama. Moreover, the easily applied method described herein can be used as an auxiliary tool to identify sibling species of other taxonomic groups. PMID:28018612
Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; ...
2017-06-06
There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases requires the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend its time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.
Aikawa, Ken; Kataoka, Masao; Ogawa, Soichiro; Akaihata, Hidenori; Sato, Yuichi; Yabe, Michihiro; Hata, Junya; Koguchi, Tomoyuki; Kojima, Yoshiyuki; Shiragasawa, Chihaya; Kobayashi, Toshimitsu; Yamaguchi, Osamu
2015-08-01
To present a new grouping of male patients with lower urinary tract symptoms (LUTS) based on symptom patterns and clarify whether the therapeutic effect of α1-blocker differs among the groups. We performed secondary analysis of anonymous data from 4815 patients enrolled in a postmarketing surveillance study of tamsulosin in Japan. Data on 7 International Prostate Symptom Score (IPSS) items at the initial visit were used in the cluster analysis. IPSS and quality of life (QOL) scores before and after tamsulosin treatment for 12 weeks were assessed in each cluster. Partial correlation coefficients were also obtained for IPSS and QOL scores based on changes before and after treatment. Five symptom groups were identified by cluster analysis of IPSS. Based on their symptom profiles, the clusters were labeled as minimal type (cluster 1), multiple severe type (cluster 2), weak stream type (cluster 3), storage type (cluster 4), and voiding type (cluster 5). Prevalence and the mean symptom score were significantly improved in almost all symptoms in all clusters by tamsulosin treatment. Nocturia and weak stream had the strongest effect on QOL in clusters 1, 2, and 4 and clusters 3 and 5, respectively. The study clarified that 5 characteristic symptom patterns exist by cluster analysis of IPSS in male patients with LUTS. Tamsulosin improved various symptoms and QOL in each symptom group. The study reports many male patients with LUTS being satisfied with monotherapy using tamsulosin and suggests the usefulness of α1-blockers as a drug of first choice. Copyright © 2015 Elsevier Inc. All rights reserved.
Rapid identification of Enterobacter hormaechei and Enterobacter cloacae genetic cluster III.
Ohad, S; Block, C; Kravitz, V; Farber, A; Pilo, S; Breuer, R; Rorman, E
2014-05-01
Enterobacter cloacae complex bacteria are of both clinical and environmental importance. Phenotypic methods are unable to distinguish between some of the species in this complex, which often renders their identification incomplete. The goal of this study was to develop molecular assays to identify Enterobacter hormaechei and Ent. cloacae genetic cluster III which are relatively frequently encountered in clinical material. The molecular assays developed in this study are qPCR technology based and served to identify both Ent. hormaechei and Ent. cloacae genetic cluster III. qPCR results were compared to hsp60 sequence analysis. Most clinical isolates were assigned to Ent. hormaechei subsp. steigerwaltii and Ent. cloacae genetic cluster III. The latter was proportionately more frequently isolated from bloodstream infections than from other material (P < 0·05). The qPCR assays detecting Ent. hormaechei and Ent. cloacae genetic cluster III demonstrated high sensitivity and specificity. The presented qPCR assays allow accurate and rapid identification of clinical isolates of the Ent. cloacae complex. The improved identifications obtained can specifically assist analysis of Ent. hormaechei and Ent. cloacae genetic cluster III in nosocomial outbreaks and can promote rapid environmental monitoring. An association was observed between Ent. cloacae cluster III and systemic infection that deserves further attention. © 2014 The Society for Applied Microbiology.
THE JCMT GOULD BELT SURVEY: DENSE CORE CLUSTERS IN ORION A
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lane, J.; Kirk, H.; Johnstone, D.
The Orion A molecular cloud is one of the most well-studied nearby star-forming regions, and includes regions of both highly clustered and more dispersed star formation across its full extent. Here, we analyze dense, star-forming cores identified in the 850 and 450 μm SCUBA-2 maps from the JCMT Gould Belt Legacy Survey. We identify dense cores in a uniform manner across the Orion A cloud and analyze their clustering properties. Using two independent lines of analysis, we find evidence that clusters of dense cores tend to be mass segregated, suggesting that stellar clusters may have some amount of primordial mass segregation already imprinted in them at an early stage. We also demonstrate that the dense core clusters have a tendency to be elongated, perhaps indicating a formation mechanism linked to the filamentary structure within molecular clouds.
Kim, Hee-Sook; Eun, Sang Jun; Hwang, Jin Yong; Lee, Kun-Sei; Cho, Sung-Il
2018-05-01
Most patients with acute myocardial infarction (AMI) experience more than one symptom at onset. Although symptoms are an important early indicator, patients and physicians may have difficulty interpreting symptoms and detecting AMI at an early stage. This study aimed to identify symptom clusters among Korean patients with ST-elevation myocardial infarction (STEMI), to examine the relationship between symptom clusters and patient-related variables, and to investigate the influence of symptom clusters on treatment time delay (decision time [DT], onset-to-balloon time [OTB]). This was a prospective multicenter study with a descriptive design that used face-to-face interviews. A total of 342 patients with STEMI were included in this study. To identify symptom clusters, two-step cluster analysis was performed using SPSS software. Multinomial logistic regression to explore factors related to each cluster and multiple logistic regression to determine the effect of symptom clusters on treatment time delay were conducted. Three symptom clusters were identified: cluster 1 (classic MI; characterized by chest pain); cluster 2 (stress symptoms; sweating and chest pain); and cluster 3 (multiple symptoms; dizziness, sweating, chest pain, weakness, and dyspnea). Compared with patients in clusters 2 and 3, those in cluster 1 were more likely to have diabetes or prior MI. Patients in clusters 2 and 3, who predominantly showed other symptoms in addition to chest pain, had a significantly shorter DT and OTB than those in cluster 1. In conclusion, to decrease treatment time delay, it seems important that patients and clinicians recognize symptom clusters, rather than relying on chest pain alone. Further research is necessary to translate our findings into clinical practice and to improve patient education and public education campaigns.
Kim, Anna J.; Takahashi, Lois; Wiebe, Douglas J.
2015-01-01
Objective Social determinants of health may be substantially affected by spatial factors, which together may explain the persistence of health inequities. Clustering of possible sources of negative health and social outcomes points to a spatial focus for future interventions. We analyzed the spatial clustering of sex work businesses in Southern California to examine where and why they cluster. We explored economic and legal factors as possible explanations of clustering. Methods We manually coded data from a website used by paying members to post reviews of female massage parlor workers. We identified clusters of sexually oriented massage parlor businesses using spatial autocorrelation tests. We conducted spatial regression using census tract data to identify predictors of clustering. Results A total of 889 venues were identified. Clusters of tracts having higher-than-expected numbers of sexually oriented massage parlors (“hot spots”) were located outside downtowns. These hot spots were characterized by a higher proportion of adult males, a higher proportion of households below the federal poverty level, and a smaller average household size. Conclusion Sexually oriented massage parlors in Los Angeles and Orange counties cluster in particular neighborhoods. More research is needed to ascertain the causal factors of such clusters and how interventions can be designed to leverage these spatial factors. PMID:26327731
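The abstract does not say which spatial autocorrelation test was used. As an illustration only, global Moran's I, a common choice, can be sketched as follows; the tract counts and the neighbour matrix are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per area and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring areas.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Hypothetical: four census tracts in a row; adjacent tracts are neighbours
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([8, 7, 1, 0], weights)    # high counts sit next to each other
alternating = morans_i([8, 0, 8, 0], weights)  # high and low counts alternate
print(clustered, alternating)
```

Positive values indicate that similar counts sit in neighbouring tracts; a significance test (for example by permutation) would still be needed before calling any area a hot spot.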
Identifying technical aliases in SELDI mass spectra of complex mixtures of proteins
2013-01-01
Background Biomarker discovery datasets created using mass spectrum protein profiling of complex mixtures of proteins contain many peaks that represent the same protein with different charge states. Correlated variables such as these can confound the statistical analyses of proteomic data. Previously we developed an algorithm that clustered mass spectrum peaks that were biologically or technically correlated. Here we demonstrate an algorithm that clusters correlated technical aliases only. Results In this paper, we propose a preprocessing algorithm that can be used for grouping technical aliases in mass spectrometry protein profiling data. The stringency of the variance allowed for clustering is customizable, thereby affecting the number of peaks that are clustered. Subsequent analysis of the clusters, instead of individual peaks, helps reduce difficulties associated with technically-correlated data, and can aid more efficient biomarker identification. Conclusions This software can be used to pre-process and thereby decrease the complexity of protein profiling proteomics data, thus simplifying the subsequent analysis of biomarkers by decreasing the number of tests. The software is also a practical tool for identifying which features to investigate further by purification, identification and confirmation. PMID:24010718
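The relation that makes charge-state aliases detectable can be sketched as follows. This is not the published algorithm, which also uses correlation and a customizable variance threshold; the peak list, tolerance, and function names here are hypothetical, and only protonated ions are considered.

```python
PROTON = 1.007276  # mass of a proton in daltons

def neutral_mass(mz, charge):
    """Neutral mass implied by an m/z value, assuming an [M + zH]z+ ion."""
    return charge * (mz - PROTON)

def find_aliases(peaks, max_charge=3, rel_tol=1e-3):
    """Return (singly_charged_mz, alias_mz, charge) for peaks implying the same mass."""
    aliases = []
    for parent in peaks:
        parent_mass = neutral_mass(parent, 1)
        for other in peaks:
            if other >= parent:
                continue  # a multiply charged alias appears at lower m/z
            for z in range(2, max_charge + 1):
                diff = abs(neutral_mass(other, z) - parent_mass)
                if diff <= rel_tol * parent_mass:
                    aliases.append((parent, other, z))
    return aliases

# Hypothetical peak list (m/z values)
peaks = [11187.3, 5594.2, 3729.8, 6184.4]
aliases = find_aliases(peaks)
print(aliases)
```

Peaks grouped this way can then be analysed as one variable, which is the reduction in the number of tests that the abstract describes.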
Using data mining to segment healthcare markets from patients' preference perspectives.
Liu, Sandra S; Chen, Jie
2009-01-01
This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.
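Hierarchical clustering with average linkage on a Pearson-correlation distance, the conventional procedure named above, can be sketched in a few lines. The ratings are hypothetical and the distance 1 - r is one common convention, assumed here rather than taken from the paper.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering; returns clusters as lists of row indices."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(i, j):
        return 1 - pearson(profiles[i], profiles[j])

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                d = mean(dist(i, j) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical importance ratings of four care attributes by five patients
profiles = [
    [5, 4, 2, 1],
    [4, 5, 1, 2],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
    [5, 5, 1, 1],
]
segments = average_linkage(profiles, n_clusters=2)
print(segments)
```

Correlation distance groups patients by the shape of their preference profile rather than by how high their ratings are overall.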
Duque, Ricardo E
2012-04-01
Flow cytometric analysis of cell suspensions involves the sequential 'registration' of intrinsic and extrinsic parameters of thousands of cells in list mode files. Thus, it is almost irresistible to describe phenomena in numerical terms or by 'ratios' that have the appearance of 'accuracy' due to the presence of numbers obtained from thousands of cells. The concepts involved in the detection and characterization of B cell lymphoproliferative processes are revisited in this paper by identifying parameters that, when analyzed appropriately, are both necessary and sufficient. The neoplastic process (cluster) can be visualized easily because the parameters that distinguish it form a cluster in multidimensional space that is unique and distinguishable from neighboring clusters that are not of diagnostic interest but serve to provide a background. For B cell neoplasia it is operationally necessary to identify the multidimensional space occupied by a cluster whose kappa:lambda ratio is 100:0 or 0:100. Thus, the concept of kappa:lambda ratio is without meaning and would not detect B cell neoplasia in an unacceptably high number of cases.
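The argument against a pooled kappa:lambda ratio can be made concrete with a small worked example. The event counts and cluster names below are hypothetical, and the strict "exactly zero" test is a simplification of the 100:0 or 0:100 criterion stated in the abstract.

```python
def pooled_ratio(clusters):
    """Kappa:lambda ratio computed over all events, ignoring clusters."""
    kappa = sum(k for k, _ in clusters.values())
    lam = sum(l for _, l in clusters.values())
    return kappa / lam


def monotypic_clusters(clusters):
    """Names of clusters whose events are all kappa or all lambda."""
    return [
        name
        for name, (kappa, lam) in clusters.items()
        if kappa + lam > 0 and (kappa == 0 or lam == 0)
    ]


# Hypothetical (kappa, lambda) event counts per cluster.
example = {"background B cells": (600, 400), "candidate cluster": (0, 100)}
```

With these counts the pooled ratio is 600:500, or 1.2, which gives no hint that one cluster is entirely lambda; inspecting each cluster separately finds it at once.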
An Empirical Taxonomy of Hospital Governing Board Roles
Lee, Shoou-Yih D; Alexander, Jeffrey A; Wang, Virginia; Margolin, Frances S; Combes, John R
2008-01-01
Objective To develop a taxonomy of governing board roles in U.S. hospitals. Data Sources 2005 AHA Hospital Governance Survey, 2004 AHA Annual Survey of Hospitals, and Area Resource File. Study Design A governing board taxonomy was developed using cluster analysis. Results were validated and reviewed by industry experts. Differences in hospital and environmental characteristics across clusters were examined. Data Extraction Methods One-thousand three-hundred thirty-four hospitals with complete information on the study variables were included in the analysis. Principal Findings Five distinct clusters of hospital governing boards were identified. Statistical tests showed that the five clusters had high internal reliability and high internal validity. Statistically significant differences in hospital and environmental conditions were found among clusters. Conclusions The developed taxonomy provides policy makers, health care executives, and researchers a useful way to describe and understand hospital governing board roles. The taxonomy may also facilitate valid and systematic assessment of governance performance. Further, the taxonomy could be used as a framework for governing boards themselves to identify areas for improvement and direction for change. PMID:18355260
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P.; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
Background The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Material/Methods Participants received twelve extended-spectrum beta-lactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Results Technical and biological reproducibility ranged between 96.8–99.4% and 47.6–94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to find specific peaks that identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Conclusions Based on the sufficient technical and biological reproducibility of MALDI-TOF MS-based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable. PMID:27798637
Horsch, Salome; Kopczynski, Dominik; Kuthe, Elias; Baumbach, Jörg Ingo; Rahmann, Sven
2017-01-01
Motivation Disease classification from molecular measurements typically requires an analysis pipeline from raw noisy measurements to final classification results. Multi capillary column—ion mobility spectrometry (MCC-IMS) is a promising technology for the detection of volatile organic compounds in the air of exhaled breath. From raw measurements, the peak regions representing the compounds have to be identified, quantified, and clustered across different experiments. Currently, several steps of this analysis process require manual intervention of human experts. Our goal is to identify a fully automatic pipeline that yields competitive disease classification results compared to an established but subjective and tedious semi-manual process. Method We combine a large number of modern methods for peak detection, peak clustering, and multivariate classification into analysis pipelines for raw MCC-IMS data. We evaluate all combinations on three different real datasets in an unbiased cross-validation setting. We determine which specific algorithmic combinations lead to high AUC values in disease classifications across the different medical application scenarios. Results The best fully automated analysis process achieves even better classification results than the established manual process. The best algorithms for the three analysis steps are (i) SGLTR (Savitzky-Golay Laplace-operator filter thresholding regions) and LM (Local Maxima) for automated peak identification, (ii) EM clustering (Expectation Maximization) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for the clustering step and (iii) RF (Random Forest) for multivariate classification. Thus, automated methods can replace the manual steps in the analysis process to enable an unbiased high throughput use of the technology. PMID:28910313
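The pipelines in this abstract are compared by AUC. As a reminder of what that number measures, here is a minimal rank-based sketch: the AUC is the fraction of (diseased, healthy) pairs in which the diseased sample receives the higher classifier score, with ties counted as one half. The function name and the example scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels.

    Equivalent to the normalized Mann-Whitney U statistic: count the
    positive/negative pairs that are ranked correctly, ties as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

In a cross-validation setting such as the one described, this would be evaluated on held-out samples only, so that the comparison between pipelines stays unbiased.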
Cluster analysis of multiple planetary flow regimes
NASA Technical Reports Server (NTRS)
Mo, Kingtse; Ghil, Michael
1987-01-01
A modified cluster analysis method was developed to identify spatial patterns of planetary flow regimes, and to study transitions between them. This method was applied first to a simple deterministic model and second to Northern Hemisphere (NH) 500 mb data. The dynamical model is governed by the fully-nonlinear, equivalent-barotropic vorticity equation on the sphere. Clusters of points in the model's phase space are associated either with a few persistent events or with many transient events. Two stationary clusters have patterns similar to unstable stationary model solutions, zonal, or blocked. Transient clusters of wave trains serve as way stations between the stationary ones. For the NH data, cluster analysis was performed in the subspace of the first seven empirical orthogonal functions (EOFs). Stationary clusters are found in the low-frequency band of more than 10 days, and transient clusters in the bandpass frequency window between 2.5 and 6 days. In the low-frequency band three pairs of clusters determine, respectively, EOFs 1, 2, and 3. They exhibit well-known regional features, such as blocking, the Pacific/North American (PNA) pattern and wave trains. Both model and low-pass data show strong bimodality. Clusters in the bandpass window show wave-train patterns in the two jet exit regions. They are related, as in the model, to transitions between stationary clusters.
Chubachi, Shotaro; Sato, Minako; Kameyama, Naofumi; Tsutsumi, Akihiro; Sasaki, Mamoru; Tateno, Hiroki; Nakamura, Hidetoshi; Asano, Koichiro; Betsuyaku, Tomoko
2016-08-01
Patients with chronic obstructive pulmonary disease (COPD) frequently suffer from various comorbidities. Recently, cluster analysis has been proposed to examine the phenotypic heterogeneity in COPD. In order to comprehensively understand the comorbidities of COPD in Japan, we conducted a multicenter, longitudinal cohort study called the Keio COPD Comorbidity Research (K-CCR). In this cohort, comorbid diagnoses were established by both objective examination and review of clinical records, in addition to self-report. We aimed to investigate the clustering of nineteen clinically relevant comorbidities and the meaningful outcomes of the clusters over a two-year follow-up period. The present study analyzed data from COPD patients with complete comorbidity data (n = 311). Cluster analysis was performed using Ward's minimum-variance method. Five comorbidity clusters were identified: less comorbidity; malignancy; metabolic and cardiovascular; gastroesophageal reflux disease (GERD) and psychological; and underweight and anemic. FEV1 did not differ among the clusters. The GERD and psychological cluster had worse COPD assessment test (CAT) and Saint George's respiratory questionnaire (SGRQ) scores at baseline compared to the other clusters (CAT: p = 0.0003 and SGRQ: p = 0.00046). The rate of change in these scores did not differ within 2 years. The underweight and anemic cluster included subjects with lower baseline ratio of predicted diffusing capacity (DLco/VA) compared to the malignancy cluster (p = 0.036). Five clusters of comorbidities were identified in Japanese COPD patients. The clinical characteristics and health-related quality of life were different among these clusters during a follow-up of two years. Copyright © 2016 Elsevier Ltd. All rights reserved.
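Ward's minimum-variance method, named in this abstract, merges at each step the two clusters whose union raises the total within-cluster sum of squares the least. A minimal sketch of that merge cost follows; it assumes Euclidean geometry on numeric feature vectors, and the function names and sample points are illustrative only.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum((p[d] - c[d]) ** 2 for p in points for d in range(len(c)))


def ward_cost(a, b):
    """Increase in total within-cluster SSE if clusters a and b are merged.

    Closed form: |a| * |b| / (|a| + |b|) times the squared distance
    between the two centroids.
    """
    ca, cb = centroid(a), centroid(b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * gap
```

For `a = [[0, 0], [0, 2]]` and `b = [[4, 1]]`, the closed form and the direct difference `sse(a + b) - sse(a) - sse(b)` agree, which is a convenient sanity check on any implementation.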
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.
Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C
2015-10-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
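The counts reported in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the 14.4% frequency uses all 229 screened genomes (224 bacterial plus 5 archaeal) as its denominator; the abstract does not state the denominator, so that part is an assumption.

```python
# Gene clusters per bacteriocin class, as reported in the abstract.
clusters_by_class = {
    "lanthipeptide": 20,
    "sactipeptide": 11,
    "class II": 7,
    "class III": 8,
}

genomes_screened = 224 + 5   # assumed denominator: bacteria plus archaea
strains_with_clusters = 33

total_clusters = sum(clusters_by_class.values())
frequency_percent = round(100 * strains_with_clusters / genomes_screened, 1)
```

Under that assumption the class counts add up to the reported 46 gene clusters and the frequency rounds to the reported 14.4%.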
Dimensional assessment of personality pathology in patients with eating disorders.
Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L
1999-02-22
This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-01-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588
Yang, Yung-Hun; Kim, Ji-Nu; Song, Eunjung; Kim, Eunjung; Oh, Min-Kyu; Kim, Byung-Gee
2008-09-01
In order to identify the regulators involved in antibiotic production or time-specific cellular events, the messenger ribonucleic acid (mRNA) expression data of the two gene clusters, actinorhodin (ACT) and undecylprodigiosin (RED) biosynthetic genes, were clustered with known mRNA expression data of regulators from S. coelicolor using a filtering method based on standard deviation and clustering analysis. The result identified five regulators including two well-known regulators namely, SCO3579 (WlbA) and SCO6722 (SsgD). Using overexpression and deletion of the regulator genes, we were able to identify two regulators, i.e., SCO0608 and SCO6808, playing roles as repressors in antibiotics production and sporulation. This approach can be easily applied to mapping out new regulators related to any interesting target gene clusters showing characteristic expression patterns. The result can also be used to provide insightful information on the selection rules among a large number of regulators.
Tsui, Sharon; Denison, Julie A; Kennedy, Caitlin E; Chang, Larry W; Koole, Olivier; Torpey, Kwasi; Van Praag, Eric; Farley, Jason; Ford, Nathan; Stuart, Leine; Wabwire-Mangen, Fred
2017-12-06
Organization of HIV care and treatment services, including clinic staffing and services, may shape clinical and financial outcomes, yet there has been little attempt to describe different models of HIV care in sub-Saharan Africa (SSA). Information about the relative benefits and drawbacks of different models could inform the scale-up of antiretroviral therapy (ART) and associated services in resource-limited settings (RLS), especially in light of expanded client populations with country adoption of WHO's test and treat recommendation. We characterized task-shifting/task-sharing practices in 19 diverse ART clinics in Tanzania, Uganda, and Zambia and used cluster analysis to identify unique models of service provision. We ran descriptive statistics to explore how the clusters varied by environmental factors and programmatic characteristics. Finally, we employed the Delphi Method to make systematic use of expert opinions to ensure that the cluster variables were meaningful in the context of actual task-shifting of ART services in SSA. The cluster analysis identified three task-shifting/task-sharing models. The main differences across models were the availability of medical doctors, the scope of clinical responsibility assigned to nurses, and the use of lay health care workers. Patterns of healthcare staffing in HIV service delivery were associated with different environmental factors (e.g., health facility levels, urban vs. rural settings) and programme characteristics (e.g., community ART distribution or integrated tuberculosis treatment on-site). Understanding the relative advantages and disadvantages of different models of care can help national programmes adapt to increased client load, select optimal adherence strategies within decentralized models of care, and identify differentiated models of care for clients to meet the growing needs of long-term ART patients who require more complicated treatment management.
Tobacco, Marijuana, and Alcohol Use in University Students: A Cluster Analysis
Primack, Brian A.; Kim, Kevin H.; Shensa, Ariel; Sidani, Jaime E.; Barnett, Tracey E.; Switzer, Galen E.
2012-01-01
Objective Segmentation of populations may facilitate development of targeted substance abuse prevention programs. We aimed to partition a national sample of university students according to profiles based on substance use. Participants We used 2008–2009 data from the National College Health Assessment from the American College Health Association. Our sample consisted of 111,245 individuals from 158 institutions. Method We partitioned the sample using cluster analysis according to current substance use behaviors. We examined the association of cluster membership with individual and institutional characteristics. Results Cluster analysis yielded six distinct clusters. Three individual factors—gender, year in school, and fraternity/sorority membership—were the most strongly associated with cluster membership. Conclusions In a large sample of university students, we were able to identify six distinct patterns of substance abuse. It may be valuable to target specific populations of college-aged substance users based on individual factors. However, comprehensive intervention will require a multifaceted approach. PMID:22686360
Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F; Perez, Danny
2017-10-21
Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.
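The coarse representation described in this abstract can be illustrated once a state-to-cluster assignment exists. The sketch below does not implement the Perron Cluster Cluster Analysis itself; it assumes the assignment is already given (for example, produced by such an algorithm) and only shows how a long trajectory collapses into a few inter-cluster transitions. All names are hypothetical.

```python
from collections import Counter


def coarse_transitions(trajectory, cluster_of):
    """Count transitions between clusters along a state trajectory.

    trajectory: sequence of visited state labels, in time order.
    cluster_of: mapping from state label to cluster label.
    Moves that stay inside one cluster are ignored.
    """
    counts = Counter()
    for src, dst in zip(trajectory, trajectory[1:]):
        a, b = cluster_of[src], cluster_of[dst]
        if a != b:
            counts[(a, b)] += 1
    return counts
```

For a trajectory that visits thousands of unique states, the resulting counter has at most one entry per ordered pair of clusters, which is what makes the reaction pathways easy to read.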
Maternal styles of talking about child feeding across sociodemographic groups.
Pesch, Megan H; Harrell, Kristina J; Kaciroti, Niko; Rosenblum, Katherine L; Lumeng, Julie C
2011-12-01
This study sought to identify maternal styles of talking about child feeding from a semistructured interview and to evaluate associated maternal and child characteristics. Mothers of preschool-aged children (n=133) of diverse race/ethnicity and socioeconomic status (SES) (45 lower SES black, 29 lower SES white, 32 lower SES Hispanic, 15 middle to upper SES white, and 12 middle to upper SES Asian) participated in a semistructured interview about feeding. Interviews were audiotaped and transcribed. Themes were identified, and individual interviews were coded within these themes: authority (high/low), confidence (confident/conflicted/unopinionated), and investment (deep/mild/removed). Demographic characteristics were collected and a subset of children had measured weights and heights. Cluster analysis was used to identify narrative styles. Participant characteristics were compared across clusters using Fisher's exact test and analysis of variance. Six narrative styles were identified: Easy-Going, Practical No-Nonsense, Disengaged, Effortful No-Nonsense, Indulgent Worry, and Conflicted Control. Cluster membership differed significantly based on maternal demographic group (P<0.001) and child weight status (P<0.05). More than half (60%) of children of mothers in the Conflicted Control cluster were obese. Maternal styles of talking about feeding are associated with maternal and child characteristics. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
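Fisher's exact test, used in this abstract to compare characteristics across clusters, can be sketched for the simplest case of a 2x2 table. The study's own tables have more rows and columns, so this is a simplification, and the function name is illustrative.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    With the margins fixed, the top-left cell follows a hypergeometric
    distribution; the p-value sums the probabilities of all tables that
    are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    return sum(
        prob(x)
        for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
```

For the table [[3, 1], [1, 3]] the possible tables have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the two-sided p-value is 34/70.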
Li, Qiuchun; Wang, Xin; Yin, Kequan; Hu, Yachen; Xu, Haiyan; Xie, Xiaolei; Xu, Lijuan; Fei, Xiao; Chen, Xiang; Jiao, Xinan
2018-02-02
Salmonella enterica serovar Enteritidis (S. Enteritidis) is one of the most prevalent serotypes in Salmonella isolated from poultry and the most commonly reported cause of human salmonellosis. In this study, we aimed to assess the genetic diversity of 329 S. Enteritidis strains isolated from different sources from 2009 to 2016 in China. Clustered regularly interspaced short palindromic repeat (CRISPR) typing was used to characterize these 262 chicken clinical isolates, 38 human isolates, 18 pig isolates, six duck isolates, three goose isolates and two isolates of unknown source. A total of 18 Enteritidis CRISPR types (ECTs) were identified, with ECT2, ECT8 and ECT4 as the top three ECTs. CRISPR typing identified ECT2 as the most prevalent ECT, which accounted for 41% of S. Enteritidis strains from all the sources except duck. ECT9 and ECT13 were identified in both pig and human isolates and revealed potential transmission from pig to human. A cluster analysis distributed 18 ECTs, including the top three ECTs, into four lineages with LI as the predominant lineage. Forty-eight out of 329 isolates were subjected to whole genome sequence typing, which divided them into four clusters, with Cluster I as the predominant cluster. Cluster I included 92% (34/37) of strains located in LI identified from the CRISPR typing, confirming the good correspondence between both typing methods. In addition, the CRISPR typing also revealed the close relationship between ECTs and isolated areas, confirming that CRISPR spacers might be obtained by bacteria from the unique phage or plasmid pools in the environment. However, further analysis is needed to determine the function of CRISPR-Cas systems in Salmonella and the relationship between spacers and the environment. Copyright © 2017 Elsevier B.V. All rights reserved.
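The isolate counts and the agreement figure in this abstract can be verified with a few lines of arithmetic. The variable names are illustrative; the numbers are those reported above.

```python
# Isolates by source, as listed in the abstract.
isolates_by_source = {
    "chicken": 262,
    "human": 38,
    "pig": 18,
    "duck": 6,
    "goose": 3,
    "unknown": 2,
}
total_isolates = sum(isolates_by_source.values())

# Share of sequenced lineage LI strains that fall in WGS Cluster I.
li_in_cluster_one_percent = round(100 * 34 / 37)
```

The source counts add up to the 329 strains studied, and 34 of 37 rounds to the reported 92%.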
Dietary BMAA Exposure in an Amyotrophic Lateral Sclerosis Cluster from Southern France
Masseret, Estelle; Banack, Sandra; Boumédiène, Farid; Abadie, Eric; Brient, Luc; Pernet, Fabrice; Juntas-Morales, Raoul; Pageot, Nicolas; Metcalf, James; Cox, Paul; Camu, William
2013-01-01
Background Dietary exposure to the cyanotoxin BMAA is suspected to be the cause of amyotrophic lateral sclerosis in the Western Pacific Islands. In Europe and North America, this toxin has been identified in the marine environment of amyotrophic lateral sclerosis clusters but, to date, only a few dietary exposures have been described. Objectives We aimed to identify cluster(s) of amyotrophic lateral sclerosis in the Hérault district, a coastal district of Southern France, and to search, in the identified area(s), for a potential dietary source of BMAA. Methods A spatio-temporal cluster analysis was performed in the district, considering all incident amyotrophic lateral sclerosis cases identified from 1994 to 2009 by our expert center. We investigated the cluster area with serial collections of oysters and mussels that were subsequently analyzed blind for BMAA concentrations. Results We found one significant amyotrophic lateral sclerosis cluster (p = 0.0024), surrounding the Thau lagoon, the most important area of shellfish production and consumption along the French Mediterranean coast. BMAA was identified in mussels (1.8 µg/g to 6.0 µg/g) and oysters (0.6 µg/g to 1.6 µg/g). The highest concentrations of BMAA were measured during summer when the highest picocyanobacteria abundances were recorded. Conclusions While it is not possible to ascertain a direct link between shellfish consumption and the existence of this ALS cluster, these results add new data to the potential association of BMAA with sporadic amyotrophic lateral sclerosis, one of the most severe neurodegenerative disorders. PMID:24349504
Haarmann, Thomas; Machado, Caroline; Lübbe, Yvonne; Correia, Telmo; Schardl, Christopher L; Panaccione, Daniel G; Tudzynski, Paul
2005-06-01
The genomic region of Claviceps purpurea strain P1 containing the ergot alkaloid gene cluster [Tudzynski, P., Hölter, K., Correia, T., Arntz, C., Grammel, N., Keller, U., 1999. Evidence for an ergot alkaloid gene cluster in Claviceps purpurea. Mol. Gen. Genet. 261, 133-141] was explored by chromosome walking, and additional genes probably involved in the ergot alkaloid biosynthesis have been identified. The putative cluster sequence (extending over 68.5 kb) contains 4 different nonribosomal peptide synthetase (NRPS) genes and several putative oxidases. Northern analysis showed that most of the genes were co-regulated (repressed by high phosphate), and identified probable flanking genes by lack of co-regulation. Comparison of the cluster sequences of strain P1, an ergotamine producer, with that of strain ECC93, an ergocristine producer, showed high conservation of most of the cluster genes, but significant variation in the NRPS modules, strongly suggesting that evolution of these chemical races of C. purpurea is determined by evolution of NRPS module specificity.
Symptom clusters in patients with nasopharyngeal carcinoma during radiotherapy.
Xiao, Wenli; Chan, Carmen W H; Fan, Yuying; Leung, Doris Y P; Xia, Weixiong; He, Yan; Tang, Linquan
2017-06-01
Despite the improvement in radiotherapy (RT) technology, patients with nasopharyngeal carcinoma (NPC) still suffer from numerous distressing symptoms simultaneously during RT. The purpose of the study was to investigate the symptom clusters experienced by NPC patients during RT. First-treated Chinese NPC patients (n = 130) undergoing late-period RT (from week 4 till the end) were recruited for this cross-sectional study. They completed a sociodemographic and clinical data questionnaire, the Chinese version of the M. D. Anderson Symptom Inventory - Head and Neck Module (MDASI-HN-C) and the Chinese version of the Functional Assessment of Cancer Therapy - Head and Neck Scale (FACT-H&N-C). Principal axis factor analysis with oblimin rotation, independent t-test, one-way analysis of variance (ANOVA) and Pearson product-moment correlation were used to analyze the data. Four symptom clusters were identified, and labelled general, gastrointestinal, nutrition impact and social interaction impact. Of these 4 types, the nutrition impact symptom cluster was the most severe. Statistically significant positive correlations were found between severity of all 4 symptom clusters and symptom interference, as well as weight loss. Statistically significant negative correlations were detected between the cluster severity and the QOL total score and 3 out of 5 subscale scores. The four clusters identified reveal the symptom patterns experienced by NPC patients during RT. Future intervention studies on managing these symptom clusters are warranted, especially for the nutrition impact symptom cluster. Copyright © 2017 Elsevier Ltd. All rights reserved.
Vékony, Hedy; Röser, Kerstin; Löning, Thomas; Ylstra, Bauke; Meijer, Gerrit A; van Wieringen, Wessel N; van de Wiel, Mark A; Carvalho, Beatriz; Kok, Klaas; Leemans, C René; van der Waal, Isaäc; Bloemena, Elisabeth
2009-02-01
Salivary gland myoepithelial tumors are relatively uncommon tumors with an unpredictable clinical course. More knowledge about their genetic profiles is necessary to identify novel predictors of disease. In this study, we subjected 27 primary tumors (15 myoepitheliomas and 12 myoepithelial carcinomas) to genome-wide microarray-based comparative genomic hybridization (array CGH). We set out to delineate known chromosomal aberrations in more detail and to unravel chromosomal differences between benign myoepitheliomas and myoepithelial carcinomas. Patterns of DNA copy number aberrations were analyzed by unsupervised hierarchical cluster analysis. Both benign and malignant tumors revealed a limited amount of chromosomal alterations (median of 5 and 7.5, respectively). In both tumor groups, high frequency gains (≥20%) were found mainly at loci of growth factors and growth factor receptors (e.g., PDGF, FGF(R)s, and EGFR). In myoepitheliomas, high frequency losses (≥20%) were detected at regions of proto-cadherins. Cluster analysis of the array CGH data identified three clusters. Differential copy numbers on chromosome arm 8q and chromosome 17 set the clusters apart. Cluster 1 contained a mixture of the two phenotypes (n = 10), cluster 2 included mostly benign tumors (n = 10), and cluster 3 only contained carcinomas (n = 7). Supervised analysis between malignant and benign tumors revealed a 36 Mbp-region at 8q being more frequently gained in malignant tumors (P = 0.007, FDR = 0.05). This is the first study investigating genomic differences between benign and malignant myoepithelial tumors of the salivary glands at a genomic level. Both unsupervised and supervised analysis of the genomic profiles revealed chromosome arm 8q to be involved in the malignant phenotype of salivary gland myoepitheliomas.
Ji, N Y; Capone, G T; Kaufmann, W E
2011-11-01
The diagnostic validity of autism spectrum disorder (ASD) based on Diagnostic and Statistical Manual of Mental Disorders (DSM) has been challenged in Down syndrome (DS), because of the high prevalence of cognitive impairments in this population. Therefore, we attempted to validate DSM-based diagnoses via an unbiased categorisation of participants with a DSM-independent behavioural instrument. Based on scores on the Aberrant Behaviour Checklist - Community, we performed sequential factor (four DS-relevant factors: Autism-Like Behaviour, Disruptive Behaviour, Hyperactivity, Self-Injury) and cluster analyses on a 293-participant paediatric DS clinic cohort. The four resulting clusters were compared with DSM-delineated groups: DS + ASD, DS + None (no DSM diagnosis), DS + DBD (disruptive behaviour disorder) and DS + SMD (stereotypic movement disorder), the latter two as comparison groups. Two clusters were identified with DS + ASD: Cluster 1 (35.1%) with higher disruptive behaviour and Cluster 4 (48.2%) with more severe autistic behaviour and higher percentage of late onset ASD. The majority of participants in DS + None (71.9%) and DS + DBD (87.5%) were classified into Cluster 2 and 3, respectively, while participants in DS + SMD were relatively evenly distributed throughout the four clusters. Our unbiased, DSM-independent analyses, using a rating scale specifically designed for individuals with severe intellectual disability, demonstrated that DSM-based criteria of ASD are applicable to DS individuals despite their cognitive impairments. Two DS + ASD clusters were identified and supported the existence of at least two subtypes of ASD in DS, which deserve further characterisation. Despite the prominence of stereotypic behaviour in DS, the SMD diagnosis was not identified by cluster analysis, suggesting that high-level stereotypy is distributed throughout DS. 
Further supporting DSM diagnoses, typically behaving DS participants were easily distinguished as a group from those with maladaptive behaviours. © 2011 The Authors. Journal of Intellectual Disability Research © 2011 Blackwell Publishing Ltd.
Cluster Analysis of Vulnerable Groups in Acute Traumatic Brain Injury Rehabilitation.
Kucukboyaci, N Erkut; Long, Coralynn; Smith, Michelle; Rath, Joseph F; Bushnik, Tamara
2018-01-06
To analyze the complex relation between various social indicators that contribute to socioeconomic status and health care barriers. Cluster analysis of historical patient data obtained from inpatient visits. Inpatient rehabilitation unit in a large urban university hospital. Adult patients (N=148) receiving acute inpatient care, predominantly for closed head injury. Not applicable. We examined the membership of patients with traumatic brain injury in various "vulnerable group" clusters (eg, homeless, unemployed, racial/ethnic minority) and characterized the rehabilitation outcomes of patients (eg, duration of stay, changes in FIM scores between admission to inpatient stay and discharge). The cluster analysis revealed 4 major clusters (ie, clusters A-D) separated by vulnerable group memberships, with distinct durations of stay and FIM gains during their stay. Cluster B, the largest cluster and also consisting of mostly racial/ethnic minorities, had the shortest duration of hospital stay and one of the lowest FIM improvements among the 4 clusters despite higher FIM scores at admission. In cluster C, also consisting of mostly ethnic minorities with multiple socioeconomic status vulnerabilities, patients were characterized by low cognitive FIM scores at admission and the longest duration of stay, and they showed good improvement in FIM scores. Application of clustering techniques to inpatient data identified distinct clusters of patients who may experience differences in their rehabilitation outcome due to their membership in various "at-risk" groups. The results identified patients (ie, cluster B, with minority patients; and cluster D, with elderly patients) who attain below-average gains in brain injury rehabilitation. 
The results also suggested that systemic (eg, duration of stay) or clinical service improvements (eg, staff's language skills, ability to offer substance abuse therapy, provide appropriate referrals, liaise with intensive social work services, or plan subacute rehabilitation phase) could be beneficial for acute settings. Stronger recruitment, training, and retention initiatives for bilingual and multiethnic professionals may also be considered to optimize gains from acute inpatient rehabilitation after traumatic brain injury. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A
2018-05-01
Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. 
The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.
Applied anatomic site study of palatal anchorage implants using cone beam computed tomography.
Lai, Ren-fa; Zou, Hui; Kong, Wei-dong; Lin, Wei
2010-06-01
The purpose of this study was to conduct quantitative research on bone height and bone mineral density of palatal implant sites for implantation, and to provide reference sites for safe and stable palatal implants. Three-dimensional reformatting images were reconstructed by cone beam computed tomography (CBCT) in 34 patients, aged 18 to 35 years, using EZ Implant software. Bone height was measured at 20 sites of interest on the palate. Bone mineral density was measured at the 10 sites with the highest implantation rate, classified using K-mean cluster analysis based on bone height and bone mineral density. According to the cluster analysis, 10 sites were classified into three clusters. Significant differences in bone height and bone mineral density were detected between these three clusters (P<0.05). The greatest bone height was obtained in cluster 2, followed by cluster 1 and cluster 3. The highest bone mineral density was found in cluster 3, followed by cluster 1 and cluster 2. CBCT plays an important role in pre-surgical treatment planning. CBCT is helpful in identifying safe and stable implantation sites for palatal anchorage.
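The K-means step described in the abstract above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the six (bone height, bone mineral density) pairs are invented, the features are z-scored before clustering, and the first k sites seed the centroids so that the run is deterministic. It is plain Lloyd's algorithm, not the procedure or software the authors used.

```python
import math
import statistics

def standardize(rows):
    """Z-score each column so height and density contribute on a comparable scale."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; the first k points seed the centroids (illustrative choice)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids

# Hypothetical palatal sites: (bone height, bone mineral density); values are invented.
sites = [
    (10.0, 500.0), (4.0, 900.0), (7.0, 700.0),
    (10.5, 520.0), (4.2, 880.0), (6.8, 690.0),
]
labels, centroids = kmeans(standardize(sites), k=3)
print(labels)
```

Standardizing first matters here because the two measurements are on very different numeric scales; without it, the Euclidean distance would be dominated by whichever feature has the larger range.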
Blecha, Kevin A.; Alldredge, Mat W.
2015-01-01
Animal space use studies using GPS collar technology are increasingly incorporating behavior based analysis of spatio-temporal data in order to expand inferences of resource use. GPS location cluster analysis is one such technique applied to large carnivores to identify the timing and location of feeding events. For logistical and financial reasons, researchers often implement predictive models for identifying these events. We present two separate improvements for predictive models that future practitioners can implement. Thus far, feeding prediction models have incorporated a small range of covariates, usually limited to spatio-temporal characteristics of the GPS data. Using GPS collared cougar (Puma concolor) we include activity sensor data as an additional covariate to increase prediction performance of feeding presence/absence. Integral to the predictive modeling of feeding events is a ground-truthing component, in which GPS location clusters are visited by human observers to confirm the presence or absence of feeding remains. Failing to account for sources of ground-truthing false-absences can bias the number of predicted feeding events to be low. Thus we account for some ground-truthing error sources directly in the model with covariates and when applying model predictions. Accounting for these errors resulted in a 10% increase in the number of clusters predicted to be feeding events. Using a double-observer design, we show that the ground-truthing false-absence rate is relatively low (4%) using a search delay of 2–60 days. Overall, we provide two separate improvements to the GPS cluster analysis techniques that can be expanded upon and implemented in future studies interested in identifying feeding behaviors of large carnivores. PMID:26398546
Identifying a typology of men who use anabolic androgenic steroids (AAS).
Zahnow, Renee; McVeigh, Jim; Bates, Geoff; Hope, Vivian; Kean, Joseph; Campbell, John; Smith, Josie
2018-05-01
Despite recognition that the Anabolic Androgenic Steroid (AAS) using population is diverse, empirical studies to develop theories to conceptualise this variance in use have been limited. In this study, using cluster analysis and multinomial logistic regression, we identify typologies of people who use AAS and examine variations in motivations for AAS use across types in a sample of 611 men who use AAS. The cluster analysis identified four groups in the data with different risk profiles. These groups largely reflect the ideal types of people who use AAS proposed by Christiansen et al. (2016): Cluster 1 (You Only Live Once (YOLO) type, n = 68, 11.1%) were younger and motivated by fat loss; Cluster 2 (Well-being type, n = 236, 38.6%) were concerned with getting fit; Cluster 3 (Athlete type, n = 155, 25.4%) were motivated by muscle and strength gains; Cluster 4 (Expert type, n = 152, 24.9%) were focused on specific goals (i.e. not 'getting fit'). The results of this study demonstrate the need to make information about AAS accessible to the general population and to inform health service providers about variations in motivations and associated risk behaviours. Attention should also be given to ensuring existing harm minimisation services are equipped to disseminate information about safe intra-muscular injecting and ensuring needle disposal sites are accessible to the different types. Copyright © 2018 Elsevier B.V. All rights reserved.
Morin, C; Gandy, J; Brazeilles, R; Moreno, L A; Kavouras, S A; Martinez, H; Salas-Salvadó, J; Bottin, J; Guelinckx, Isabelle
2018-06-01
This study aimed to identify and characterize patterns of fluid intake in children and adolescents from six countries: Argentina, Brazil, China, Indonesia, Mexico and Uruguay. Data on fluid intake volume and type amongst children (4-9 years; N = 1400) and adolescents (10-17 years; N = 1781) were collected using the validated 7-day fluid-specific record (Liq.In 7 record). To identify relatively distinct clusters of subjects based on eight fluid types (water, milk and its derivatives, hot beverages, sugar-sweetened beverages (SSB), 100% fruit juices, artificial/non-nutritive sweetened beverages, alcoholic beverages, other beverages), a cluster analysis (partitioning around k-medoids algorithm) was used. Clusters were then characterized according to their socio-demographics and lifestyle indicators. The six interpretable clusters identified were: low drinkers-SSB (n 523), low drinkers-water and milk (n 615), medium mixed drinkers (n 914), high drinkers-SSB (n 513), high drinkers-water (n 352) and very high drinkers-water (n 264). Country of residence was the dominant characteristic, followed by socioeconomic level, in all six patterns. This analysis showed that consumption of water and SSB were the primary drivers of the clusters. In addition to country, socio-demographic and lifestyle factors played a role in determining the characteristics of each cluster. This information highlights the need to target interventions in particular populations aimed at changing fluid intake behavior and improving health in children and adolescents.
NASA Astrophysics Data System (ADS)
Farsadnia, Farhad; Ghahreman, Bijan
2016-04-01
Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among the conventional methods for delineating hydrologically homogeneous regions. Recently, the Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem with this method is the interpretation of its output map. Therefore, SOM output is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature Map and Ward hierarchical clustering method to determine the hydrologic homogeneous regions in North and Razavi Khorasan provinces. First, we reduced the dimension of the SOM input matrix by principal component analysis; then the SOM was used to form a two-dimensional feature map. To determine homogeneous regions for flood frequency analysis, SOM output nodes were used as input into the Ward method. Generally, the regions identified by the clustering algorithms are not statistically homogeneous. Consequently, they have to be adjusted to improve their homogeneity. After adjustment of the regions by L-moment tests, five hydrologic homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM and then the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering with principal components as input is more effective than the hierarchical method with principal components or standardized inputs in achieving hydrologic homogeneous regions.
Editing ERTS-1 data to exclude land aids cluster analysis of water targets
NASA Technical Reports Server (NTRS)
Erb, R. B. (Principal Investigator)
1973-01-01
The author has identified the following significant results. It has been determined that an increase in the number of spectrally distinct coastal water types is achieved when data values over the adjacent land areas are excluded from the processing routine. This finding resulted from an automatic clustering analysis of ERTS-1 system corrected MSS scene 1002-18134 of 25 July 1972 over Monterey Bay, California. When the entire study area data set was submitted to the clustering only two distinct water classes were extracted. However, when the land area data points were removed from the data set and resubmitted to the clustering routine, four distinct groupings of water features were identified. Additionally, unlike the previous separation, the four types could be correlated to features observable in the associated ERTS-1 imagery. This exercise demonstrates that by proper selection of data submitted to the processing routine, based upon the specific application of study, additional information may be extracted from the ERTS-1 MSS data.
Characteristics of voxel prediction power in full-brain Granger causality analysis of fMRI data
NASA Astrophysics Data System (ADS)
Garg, Rahul; Cecchi, Guillermo A.; Rao, A. Ravishankar
2011-03-01
Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.
Fast gene ontology based clustering for microarray experiments.
Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa
2008-11-21
Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Data depth based clustering analysis
Jeong, Myeong -Hun; Cai, Yaping; Sullivan, Clair J.; ...
2016-01-01
This paper proposes a new algorithm for identifying patterns within data, based on data depth. Such a clustering analysis has an enormous potential to discover previously unknown insights from existing data sets. Many clustering algorithms already exist for this purpose. However, most algorithms are not affine invariant. Therefore, they must operate with different parameters after the data sets are rotated, scaled, or translated. Further, most clustering algorithms based on Euclidean distance can be sensitive to noise because they have no global perspective. Parameter selection also significantly affects the clustering results of each algorithm. Unlike many existing clustering algorithms, the proposed algorithm, called data depth based clustering analysis (DBCA), is able to detect coherent clusters after the data sets are affine transformed without changing a parameter. It is also robust to noise because data depth can measure the centrality and outlyingness of the underlying data. Further, it can generate relatively stable clusters as the parameter varies. The experimental comparison with the leading state-of-the-art alternatives demonstrates that the proposed algorithm outperforms DBSCAN and HDBSCAN in terms of affine invariance, and exceeds or matches the robustness to noise of DBSCAN or HDBSCAN. The robustness to parameter selection is also demonstrated through the case study of clustering Twitter data.
Pandolfi, Fanny; Edwards, Sandra A; Maes, Dominiek; Kyriazakis, Ilias
2018-01-01
This study aimed to provide an overview of the interconnections between biosecurity, health, welfare, and performance in commercial pig farms in Great Britain. We collected on-farm data about the level of biosecurity and animal performance in 40 fattening pig farms and 28 breeding pig farms between 2015 and 2016. We identified interconnections between these data, slaughterhouse health indicators, and welfare indicator records in fattening pig farms. After achieving the connections between databases, a secondary data analysis was performed to assess the interconnections between biosecurity, health, welfare, and performance using correlation analysis, principal component analysis, and hierarchical clustering. Although we could connect the different data sources the final sample size was limited, suggesting room for improvement in database connection to conduct secondary data analyses. The farm biosecurity scores ranged from 40 to 90 out of 100, with internal biosecurity scores being lower than external biosecurity scores. Our analysis suggested several interconnections between health, welfare, and performance. The initial correlation analysis showed that the prevalence of lameness and severe tail lesions was associated with the prevalence of enzootic pneumonia-like lesions and pyaemia, and the prevalence of severe body marks was associated with several disease indicators, including peritonitis and milk spots (r > 0.3; P < 0.05). Higher average daily weight gain (ADG) was associated with lower prevalence of pleurisy (r > 0.3; P < 0.05), but no connection was identified between mortality and health indicators. A subsequent cluster analysis enabled identification of patterns which considered concurrently indicators of health, welfare, and performance. Farms from cluster 1 had lower biosecurity scores, lower ADG, and higher prevalence of several disease and welfare indicators. 
Farms from cluster 2 had higher biosecurity scores than cluster 1, but a higher prevalence of pigs requiring hospitalization and lameness, which confirmed the correlation between biosecurity and the prevalence of pigs requiring hospitalization (r > 0.3; P < 0.05). Farms from cluster 3 had higher biosecurity, higher ADG, and lower prevalence for some disease and welfare indicators. The study suggests a smaller impact of biosecurity on issues such as mortality, prevalence of lameness, and pigs requiring hospitalization. The correlations and the identified clusters suggested the importance of animal welfare for the pig industry.
Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis
Sinyor, Mark; Schaffer, Ayal; Streiner, David L
2014-01-01
Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321
Luciano, Juan V; Forero, Carlos G; Cerdà-Lafont, Marta; Peñarrubia-María, María Teresa; Fernández-Vergel, Rita; Cuesta-Vargas, Antonio I; Ruíz, José M; Rozadilla-Sacanell, Antoni; Sirvent-Alierta, Elena; Santo-Panero, Pilar; García-Campayo, Javier; Serrano-Blanco, Antoni; Pérez-Aranda, Adrián; Rubio-Valera, María
2016-10-01
Although fibromyalgia syndrome (FM) is considered a heterogeneous condition, there is no generally accepted subgroup typology. We used hierarchical cluster analysis and latent profile analysis to replicate Giesecke's classification in Spanish FM patients. The second aim was to examine whether the subgroups differed in sociodemographic characteristics, functional status, quality of life, and in direct and indirect costs. A total of 160 FM patients completed the following measures for cluster derivation: the Center for Epidemiological Studies-Depression Scale, the Trait Anxiety Inventory, the Pain Catastrophizing Scale, and the Control over Pain subscale. Pain threshold was measured with a sphygmomanometer. In addition, the Fibromyalgia Impact Questionnaire-Revised, the EuroQoL-5D-3L, and the Client Service Receipt Inventory were administered for cluster validation. Two distinct clusters were identified using hierarchical cluster analysis ("hypersensitive" group, 69.8% and "functional" group, 30.2%). In contrast, the latent profile analysis goodness-of-fit indices supported the existence of 3 FM patient profiles: (1) a "functional" profile (28.1%) defined as moderate tenderness, distress, and pain catastrophizing; (2) a "dysfunctional" profile (45.6%) defined by elevated tenderness, distress, and pain catastrophizing; and (3) a "highly dysfunctional and distressed" profile (26.3%) characterized by elevated tenderness and extremely high distress and catastrophizing. We did not find significant differences in sociodemographic characteristics between the 2 clusters or among the 3 profiles. The functional profile was associated with less impairment, greater quality of life, and lower health care costs. We identified 3 distinct profiles which accounted for the heterogeneity of FM patients. Our findings might help to design tailored interventions for FM patients.
Bosomprah, Samuel; Dotse-Gborgbortsi, Winfred; Aboagye, Patrick; Matthews, Zoe
2016-11-01
To identify and evaluate clusters of births that occurred outside health facilities in Ghana for targeted intervention. A retrospective study was conducted using a convenience sample of live births registered in Ghanaian health facilities from January 1 to December 31, 2014. Data were extracted from the district health information system. A spatial scan statistic was used to investigate clusters of home births through a discrete Poisson probability model. Scanning with a circular spatial window was conducted only for clusters with high rates of such deliveries. The district was used as the geographic unit of analysis. The likelihood P value was estimated using Monte Carlo simulations. Ten statistically significant clusters with a high rate of home birth were identified. The relative risks ranged from 1.43 ("least likely" cluster; P=0.001) to 1.95 ("most likely" cluster; P=0.001). The relative risks of the top five "most likely" clusters ranged from 1.68 to 1.95; these clusters were located in Ashanti, Brong Ahafo, and the Western, Eastern, and Greater regions of Accra. Health facility records, geospatial techniques, and geographic information systems provided locally relevant information to assist policy makers in delivering targeted interventions to small geographic areas. Copyright © 2016 International Federation of Gynecology and Obstetrics. Published by Elsevier Ireland Ltd. All rights reserved.
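For readers unfamiliar with the spatial scan statistic mentioned above, the quantities it evaluates for one candidate circular window under a discrete Poisson model can be sketched as follows. The formulas are the commonly used Kulldorff log-likelihood ratio (restricted to high-rate windows), the matching relative risk, and the rank-based Monte Carlo P value. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math

def poisson_llr(observed, expected, total):
    """Log-likelihood ratio for one window when scanning for high rates only.

    observed: cases inside the window
    expected: cases expected inside the window under the null hypothesis
    total:    cases in the whole study area
    """
    if observed <= expected:
        return 0.0  # not a high-rate window
    inside = observed * math.log(observed / expected)
    rest = total - observed
    outside = rest * math.log(rest / (total - expected)) if rest > 0 else 0.0
    return inside + outside

def relative_risk(observed, expected, total):
    """Risk inside the window divided by the risk outside it."""
    return (observed / expected) / ((total - observed) / (total - expected))

def monte_carlo_p(observed_llr, simulated_llrs):
    """Rank-based P value: share of null replicates at least as extreme as the data."""
    at_least = sum(1 for s in simulated_llrs if s >= observed_llr)
    return (1 + at_least) / (1 + len(simulated_llrs))

# Hypothetical window: 30 home births observed where 20 were expected, 100 in total.
llr = poisson_llr(30, 20, 100)
rr = relative_risk(30, 20, 100)
print(round(llr, 3), round(rr, 3))
```

In a full scan, the window with the largest log-likelihood ratio is the "most likely" cluster, and its P value comes from recomputing that maximum on data sets simulated under the null hypothesis.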
NASA Astrophysics Data System (ADS)
Bekti, Rokhana Dwi; Rachmawati, Ro'fah
2014-03-01
The numbers of child births and deaths are benchmarks for determining and monitoring health and welfare in Indonesia. They can be used to identify groups of people who have a high mortality risk. Identifying groups is important for comparing the characteristics of people at high and low risk. These characteristics can be seen from the influencing factors. There are several factors that influence child births and deaths, such as economic conditions, health facilities, education, and others. The influencing factors differ between individuals, but individuals who live close together or in nearby locations share similarities, which indicates a spatial effect. To identify groups in this research, clustering was done with a spatial cluster method, which takes into account the influence of location or the relationship between locations. One such spatial cluster method is Spatial 'K'luster Analysis by Tree Edge Removal (SKATER). The research was conducted in Bogor Regency, West Java. The goal was to obtain clusters of districts based on the factors that influence child births and deaths. SKATER built four clusters consisting of 26, 7, 2, and 5 districts, respectively. SKATER performs well for clustering that includes a spatial effect. In a comparison with another clustering method, K-means also showed good performance according to a MANOVA test.
RAPTOR-scan: Identifying and Tracking Objects Through Thousands of Sky Images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidoff, Sherri; Wozniak, Przemyslaw
2004-09-28
The RAPTOR-scan system mines data for optical transients associated with gamma-ray bursts and is used to create a catalog for the RAPTOR telescope system. RAPTOR-scan can detect and track individual astronomical objects across data sets containing millions of observed points. Accurately identifying a real object over many optical images (clustering the individual appearances) is necessary in order to analyze object light curves. To achieve this, RAPTOR telescope observations are sent in real time to a database. Each morning, a program based on the DBSCAN algorithm clusters the observations and labels each one with an object identifier. Once clustering is complete, the analysis program may be used to query the database and produce light curves, maps of the sky field, or other informative displays. Although RAPTOR-scan was designed for the RAPTOR optical telescope system, it is a general tool designed to identify objects in a collection of astronomical data and facilitate quick data analysis. RAPTOR-scan will be released as free software under the GNU General Public License.
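As an illustration of the clustering step mentioned in this abstract, the following is a minimal pure-Python sketch of the generic DBSCAN algorithm applied to repeated position measurements. It is not the RAPTOR-scan code: the coordinates, the `eps` and `min_pts` values, and the use of plain Euclidean distance on (RA, Dec) pairs are assumptions made for the example.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster (object) id, or NOISE for isolated detections."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical detections: two objects seen three times each, plus one spurious point.
observations = [
    (10.000, 20.000), (10.001, 20.000), (10.000, 20.001),
    (50.000, -5.000), (50.001, -5.000), (50.000, -5.001),
    (80.000, 0.000),
]
object_ids = dbscan(observations, eps=0.01, min_pts=3)
print(object_ids)
```

The brute-force neighbour search is quadratic in the number of points; a catalog with millions of observations would need a spatial index, which is omitted here for brevity.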
Pulley, Simon; Foster, Ian; Collins, Adrian L
2017-06-01
The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) were primarily attributed to the increased inter-group variability producing a far larger sediment source signal than the non-conservatism noise (1). Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kandadai, Venk; Yang, Haodong; Jiang, Ling; Yang, Christopher C; Fleisher, Linda; Winston, Flaura Koplin
2016-05-05
Little is known about the ability of individual stakeholder groups to achieve health information dissemination goals through Twitter. This study aimed to develop and apply methods for the systematic evaluation and optimization of health information dissemination by stakeholders through Twitter. Tweet content from 1790 followers of @SafetyMD (July-November 2012) was examined. User emphasis, a new indicator of Twitter information dissemination, was defined and applied to retweets across two levels of retweeters originating from @SafetyMD. User interest clusters were identified based on principal component analysis (PCA) and hierarchical cluster analysis (HCA) of a random sample of 170 followers. User emphasis of keywords remained across levels but decreased by 9.5 percentage points. PCA and HCA identified 12 statistically unique clusters of followers within the @SafetyMD Twitter network. This study is one of the first to develop methods for use by stakeholders to evaluate and optimize their use of Twitter to disseminate health information. Our new methods provide preliminary evidence that individual stakeholders can evaluate the effectiveness of health information dissemination and create content-specific clusters for more specific targeted messaging.
Analysis of ligand-protein exchange by Clustering of Ligand Diffusion Coefficient Pairs (CoLD-CoP).
Snyder, David A; Chantova, Mihaela; Chaudhry, Saadia
2015-06-01
NMR spectroscopy is a powerful tool in describing protein structures and protein activity for pharmaceutical and biochemical development. This study describes a method to determine weak binding ligands in biological systems by using hierarchic diffusion coefficient clustering of multidimensional data obtained with a 400 MHz Bruker NMR. Comparison of DOSY spectrums of ligands of the chemical library in the presence and absence of target proteins show translational diffusion rates for small molecules upon interaction with macromolecules. For weak binders such as compounds found in fragment libraries, changes in diffusion rates upon macromolecular binding are on the order of the precision of DOSY diffusion measurements, and identifying such subtle shifts in diffusion requires careful statistical analysis. The "CoLD-CoP" (Clustering of Ligand Diffusion Coefficient Pairs) method presented here uses SAHN clustering to identify protein-binders in a chemical library or even a not fully characterized metabolite mixture. We will show how DOSY NMR and the "CoLD-CoP" method complement each other in identifying the most suitable candidates for lysozyme and wheat germ acid phosphatase. Copyright © 2015 Elsevier Inc. All rights reserved.
Changing the paradigm: messages for hand hygiene education and audit from cluster analysis.
Gould, D J; Navaie, D; Purssell, E; Drey, N S; Creedon, S
2018-04-01
Hand hygiene is considered to be the foremost infection prevention measure. How healthcare workers accept and make sense of the hand hygiene message is likely to contribute to the success and sustainability of initiatives to improve performance, which is often poor. A survey of nurses in critical care units in three National Health Service trusts in England was undertaken to explore opinions about hand hygiene, use of alcohol hand rubs, audit with performance feedback, and other key hand-hygiene-related issues. Data were analysed descriptively and subjected to cluster analysis. Three main clusters of opinion were visualized, each forming a significant group: positive attitudes, pragmatism and scepticism. A smaller cluster suggested possible guilt about ability to perform hand hygiene. Cluster analysis identified previously unsuspected constellations of beliefs about hand hygiene that offer a plausible explanation for behaviour. Healthcare workers might respond to education and audit differently according to these beliefs. Those holding predominantly positive opinions might comply with hand hygiene policy and perform well as infection prevention link nurses and champions. Those holding pragmatic attitudes are likely to respond favourably to the need for professional behaviour and need to protect themselves from infection. Greater persuasion may be needed to encourage those who are sceptical about the importance of hand hygiene to comply with guidelines. Interventions to increase compliance should be sufficiently broad in scope to tackle different beliefs. Alternatively, cluster analysis of hand hygiene beliefs could be used to identify the most effective educational and monitoring strategies for a particular clinical setting. Copyright © 2017 The Healthcare Infection Society. Published by Elsevier Ltd. All rights reserved.
Multiscale Embedded Gene Co-expression Network Analysis
Song, Won-Min; Zhang, Bin
2015-01-01
Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778
Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali
2016-01-01
Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Results Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Conclusion Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster. PMID:26938641
Identifying the ideal profile of French yogurts for different clusters of consumers.
Masson, M; Saint-Eve, A; Delarue, J; Blumenthal, D
2016-05-01
Identifying the sensory properties that affect consumer preferences for food products is an important feature of product development. Different methods, such as external preference mapping or partial least squares regression, are used to establish relationships between sensory data and consumer preferences and to identify sensory attributes that drive consumer preferences, by highlighting optimum products. Plain French yogurts were evaluated by a sensory profiling method performed by 12 trained judges. In parallel, 180 consumers were asked to score their overall liking and complete a cognitive restraint questionnaire. After hierarchical cluster analysis on the liking scores, preference mapping using a quadratic regression model was performed. Five clusters of consumers were identified as a function of different preference patterns. Contrary to our expectations, fat levels were not discriminating. For each cluster, the results of preference mapping enabled the identification of optimum products. A comparison of the 5 sensory profiles revealed numerous differences between key sensory attributes. For example, one consumer cluster had a strong preference for products perceived as very thick, grainy, but with a less flowing texture, less sticky, whey presence and color, in contrast to other clusters. In addition, each segment of consumers was characterized according to the results of the cognitive restraint questionnaire. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Characteristics of airflow and particle deposition in COPD current smokers
NASA Astrophysics Data System (ADS)
Zou, Chunrui; Choi, Jiwoong; Haghighi, Babak; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long
2017-11-01
A recent imaging-based cluster analysis of computed tomography (CT) lung images in a chronic obstructive pulmonary disease (COPD) cohort identified four clusters, viz. disease sub-populations. Cluster 1 had relatively normal airway structures; Cluster 2 had wall thickening; Cluster 3 exhibited decreased wall thickness and luminal narrowing; Cluster 4 had a significant decrease of luminal diameter and a significant reduction of lung deformation, thus having relatively low pulmonary functions. To better understand the characteristics of airflow and particle deposition in these clusters, we performed computational fluid and particle dynamics analyses on representative cluster patients and healthy controls using CT-based airway models and subject-specific 3D-1D coupled boundary conditions. The results show that particle deposition in central airways of cluster 4 patients was noticeably increased especially with increasing particle size despite reduced vital capacity as compared to other clusters and healthy controls. This may be attributable in part to significant airway constriction in cluster 4. This study demonstrates the potential application of cluster-guided CFD analysis in disease populations. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837.
Amro, Amin; Waldum, Bård; von der Lippe, Nanna; Brekke, Fredrik Barth; Dammen, Toril; Miaskowski, Christine; Os, Ingrid
2015-01-01
Patients with end-stage renal disease on dialysis have reduced survival rates compared with the general population. Symptoms are frequent in dialysis patients, and a symptom cluster is defined as two or more related co-occurring symptoms. The aim of this study was to explore the associations between symptom clusters and mortality in dialysis patients. In a prospective observational cohort study of dialysis patients (n = 301), Kidney Disease and Quality of Life Short Form and Beck Depression Inventory questionnaires were administered. To generate symptom clusters, principal component analysis with varimax rotation was used on 11 kidney-specific self-reported physical symptoms. A Beck Depression Inventory score of 16 or greater was defined as clinically significant depressive symptoms. Physical and mental component summary scores were generated from Short Form-36. Multivariate Cox regression analysis was used for the survival analysis, Kaplan-Meier curves and log-rank statistics were applied to compare survival rates between the groups. Three different symptom clusters were identified; one included loading of several uremic symptoms. In multivariate analyses and after adjustment for health-related quality of life and depressive symptoms, the worst perceived quartile of the "uremic" symptom cluster independently predicted all-cause mortality (hazard ratio 2.47, 95% CI 1.44-4.22, P = 0.001) compared with the other quartiles during a follow-up period that ranged from four to 52 months. The two other symptom clusters ("neuromuscular" and "skin") or the individual symptoms did not predict mortality. Clustering of uremic symptoms predicted mortality. Assessing co-occurring symptoms rather than single symptoms may help to identify dialysis patients at high risk for mortality. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Adnane, Choaib; Adouly, Taoufik; Khallouk, Amine; Rouadi, Sami; Abada, Redallah; Roubal, Mohamed; Mahtar, Mohamed
2017-02-01
The purpose of this study is to use unsupervised cluster methodology to identify phenotype and mucosal eosinophilia endotype subgroups of patients with medical refractory chronic rhinosinusitis (CRS), and evaluate the difference in quality of life (QOL) outcomes after endoscopic sinus surgery (ESS) between these clusters for better surgical case selection. A prospective cohort study included 131 patients with medical refractory CRS who elected ESS. The Sino-Nasal Outcome Test (SNOT-22) was used to evaluate QOL before and 12 months after surgery. Unsupervised two-step clustering method was performed. One hundred and thirteen subjects were retained in this study: 46 patients with CRS without nasal polyps and 67 patients with nasal polyps. Nasal polyps, gender, mucosal eosinophilia profile, and prior sinus surgery were the most discriminating factors in the generated clusters. Three clusters were identified. A significant clinical improvement was observed in all clusters 12 months after surgery with a reduction of SNOT-22 scores. There was a significant difference in QOL outcomes between clusters; cluster 1 had the worst QOL improvement after FESS in comparison with the other clusters 2 and 3. All patients in cluster 1 presented CRSwNP with the highest mucosal eosinophilia endotype. Clustering method is able to classify CRS phenotypes and endotypes with different associated surgical outcomes.
Seismic clusters analysis in Northeastern Italy by the nearest-neighbor approach
NASA Astrophysics Data System (ADS)
Peresan, Antonella; Gentili, Stefania
2018-01-01
The main features of earthquake clusters in Northeastern Italy are explored, with the aim of gaining new insights into local-scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters, which are identified by a statistical method based on nearest-neighbor distances of events in the space-time-energy domain. The method permits us to highlight and investigate the internal structure of earthquake sequences, and to differentiate the spatial properties of seismicity according to the different topological features of the cluster structure. To analyze seismicity of Northeastern Italy, we use information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. A preliminary reappraisal of the earthquake bulletins is carried out and the area of sufficient completeness is outlined. Various techniques are considered to estimate the scaling parameters that characterize earthquake occurrence in the region, namely the b-value and the fractal dimension of the epicenter distribution, required for the application of the nearest-neighbor technique. Specifically, average robust estimates of the parameters of the Unified Scaling Law for Earthquakes, USLE, are assessed for the whole outlined region and are used to compute the nearest-neighbor distances. Cluster identification by the nearest-neighbor method turns out to be quite reliable and robust with respect to the minimum magnitude cutoff of the input catalog; the identified clusters are consistent with those obtained from manual aftershock identification of selected sequences. We demonstrate that the earthquake clusters have distinct preferred geographic locations, and we identify two areas that differ substantially in the examined clustering properties. Specifically, burst-like sequences are associated with the north-western part and swarm-like sequences with the south-eastern part of the study region. The territorial heterogeneity of earthquake clustering is in good agreement with the spatial variability of scaling parameters identified by the USLE. In particular, the fractal dimension is higher to the west (about 1.2-1.4), suggesting a spatially more distributed seismicity, compared to the eastern part of the investigated territory, where the fractal dimension is very low (about 0.8-1.0).
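The role of the b-value and the fractal dimension in this abstract can be made concrete with a short sketch. It assumes the commonly used form of the nearest-neighbor proximity in the space-time-energy domain, eta_ij = t_ij * r_ij^d_f * 10^(-b * m_i), where i is the earlier event; the abstract does not spell out the formula, so this form is an assumption. The parameter values, the threshold `eta0`, the flat x/y coordinates, and the toy catalog are hypothetical.

```python
import math

def nn_distance(parent, child, b, d_f):
    """Space-time-energy proximity of a later event (child) to an earlier one (parent)."""
    t = child["t"] - parent["t"]
    if t <= 0:
        return math.inf  # only earlier events can act as parents
    r = math.hypot(child["x"] - parent["x"], child["y"] - parent["y"])
    return t * r ** d_f * 10 ** (-b * parent["m"])

def nearest_parents(events, b, d_f):
    """For each event (sorted by time), find the earlier event with the smallest proximity."""
    events = sorted(events, key=lambda e: e["t"])
    links = []
    for j, child in enumerate(events):
        best_i, best_eta = None, math.inf
        for i in range(j):
            eta = nn_distance(events[i], child, b, d_f)
            if eta < best_eta:
                best_i, best_eta = i, eta
        links.append((best_i, best_eta))
    return events, links

def find_clusters(events, b=1.0, d_f=1.2, eta0=1e-4):
    """Keep only links with eta < eta0; events sharing a root form one cluster."""
    events, links = nearest_parents(events, b, d_f)
    root = list(range(len(events)))
    for j, (i, eta) in enumerate(links):
        if i is not None and eta < eta0:
            root[j] = root[i]  # i < j, so root[i] is already final
    return events, root

# Hypothetical catalog: time in days, position in km, magnitude m.
catalog = [
    {"t": 0.00, "x": 0.0, "y": 0.0, "m": 5.0},      # mainshock
    {"t": 0.01, "x": 1.0, "y": 0.0, "m": 2.0},      # nearby event shortly afterwards
    {"t": 5.00, "x": 200.0, "y": 200.0, "m": 3.0},  # distant, unrelated event
]
sorted_events, roots = find_clusters(catalog)
print(roots)
```

In practice the threshold separating clustered from background events is usually chosen from the bimodal distribution of the nearest-neighbor proximities rather than fixed by hand as it is here.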
Employment Opportunities and Job Analysis for Selected Environmental Occupations.
ERIC Educational Resources Information Center
Stitt, Thomas R.
Clusters of environmental occupations have been surveyed to identify and describe those occupations at the professional, technical, skilled, semi-skilled, and unskilled levels. The duties and responsibilities, special knowledge, and special skills required are listed for each. Occupational clusters covered are (1) applied biological and…
Accident patterns for construction-related workers: a cluster analysis
NASA Astrophysics Data System (ADS)
Liao, Chia-Wen; Tyan, Yaw-Yauan
2012-01-01
The construction industry has been identified as one of the most hazardous industries. The risk to construction-related workers is far greater than that in manufacturing-based industries. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.
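To show what a k-means grouping of accident records might look like, here is a minimal sketch of Lloyd's algorithm in pure Python. The one-hot feature columns (fall, electric shock, roofer, electrician) and the six records are hypothetical and do not come from the study's data; the encoding used by the authors is not described in the abstract.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign rows to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical one-hot accident records: [fall, electric shock, roofer, electrician]
records = [
    [1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0],
    [0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1],
]
labels, centroids = kmeans(records, k=2)
print(labels)
```

Because k-means depends on the initial centroids and on Euclidean distance, analyses of categorical accident data often repeat the run with several seeds or use a categorical variant; the sketch uses a single seeded run for simplicity.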
Hsueh, Po-Ren; Lee, Tai-Fen; Du, Shin-Hei; Teng, Shih-Hua; Liao, Chun-Hsing; Sheng, Wang-Hui; Teng, Lee-Jene
2014-07-01
We evaluated whether the Bruker Biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) system provides accurate species-level identifications of 147 isolates of aerobically growing Gram-positive rods (GPRs). The bacterial isolates included Nocardia (n = 74), Listeria (n = 39), Kocuria (n = 15), Rhodococcus (n = 10), Gordonia (n = 7), and Tsukamurella (n = 2) species, which had all been identified by conventional methods, molecular methods, or both. In total, 89.7% of Listeria monocytogenes, 80% of Rhodococcus species, 26.7% of Kocuria species, and 14.9% of Nocardia species (n = 11, all N. nova and N. otitidiscaviarum) were correctly identified to the species level (score values, ≥ 2.0). A clustering analysis of spectra generated by the Bruker Biotyper identified six clusters of Nocardia species, i.e., cluster 1 (N. cyriacigeorgica), cluster 2 (N. brasiliensis), cluster 3 (N. farcinica), cluster 4 (N. puris), cluster 5 (N. asiatica), and cluster 6 (N. beijingensis), based on the six peaks generated by ClinProTools with the genetic algorithm, i.e., m/z 2,774.477 (cluster 1), m/z 5,389.792 (cluster 2), m/z 6,505.720 (cluster 3), m/z 5,428.795 (cluster 4), m/z 6,525.326 (cluster 5), and m/z 16,085.216 (cluster 6). Two clusters of L. monocytogenes spectra were also found according to the five peaks, i.e., m/z 5,594.85, m/z 6,184.39, and m/z 11,187.31, for cluster 1 (serotype 1/2a) and m/z 5,601.21 and m/z 11,199.33 for cluster 2 (serotypes 1/2b and 4b). The Bruker Biotyper system was unable to accurately identify Nocardia (except for N. nova and N. otitidiscaviarum), Tsukamurella, or Gordonia species. Continuous expansion of the MALDI-TOF MS databases to include more GPRs is necessary. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Hinds, David R; DiSantostefano, Rachael L; Le, Hoa V; Pascoe, Steven
2016-06-01
To identify clusters of patients who may benefit from treatment with an inhaled corticosteroid (ICS)/long-acting β2 agonist (LABA) versus LABA alone, in terms of exacerbation reduction, and to validate previously identified clusters of patients with chronic obstructive pulmonary disease (COPD) (based on diuretic use and reversibility). Post hoc supervised cluster analysis using a modified recursive partitioning algorithm of two 1-year randomised, controlled trials of fluticasone furoate (FF)/vilanterol (VI) versus VI alone, with the primary end points of the annual rate of moderate-to-severe exacerbations. Global. 3255 patients with COPD (intent-to-treat populations) with a history of exacerbations in the past year. FF/VI 50/25 µg, 100/25 µg or 200/25 µg, or VI 25 µg; all one time per day. Mean annual COPD exacerbation rate to identify clusters of patients who benefit from adding an ICS (FF) to VI bronchodilator therapy. Three clusters were identified, including two groups that benefit from FF/VI versus VI: patients with blood eosinophils >2.4% (RR=0.68, 95% CI 0.58 to 0.79), or blood eosinophils ≤2.4% and smoking history ≤46 pack-years, experienced a reduced rate of exacerbations with FF/VI versus VI (RR=0.78, 95% CI 0.63 to 0.96), whereas those with blood eosinophils ≤2.4% and smoking history >46 pack-years were identified as non-responders (RR=1.22, 95% CI 0.94 to 1.58). Clusters of patients previously identified in the fluticasone propionate/salmeterol (SAL) versus SAL trials of similar design were not validated; all clusters of patients tended to benefit from FF/VI versus VI alone irrespective of diuretic use and reversibility. In patients with COPD with a history of exacerbations, those with greater blood eosinophils or a lower smoking history may benefit more from ICS/LABA versus LABA alone as measured by a reduced rate of exacerbations. 
In terms of eosinophils, this finding is consistent with findings from other studies; however, the validity of the 2.4% cut-off and the impact of smoking history require further investigation. NCT01009463; NCT01017952; Post-results. Published by the BMJ Publishing Group Limited.
Geographic Clusters of Basal Cell Carcinoma in a Northern California Health Plan Population.
Ray, G Thomas; Kulldorff, Martin; Asgari, Maryam M
2016-11-01
Rates of skin cancer, including basal cell carcinoma (BCC), the most common cancer, have been increasing over the past 3 decades. A better understanding of geographic clustering of BCCs can help target screening and prevention efforts. Present a methodology to identify spatial clusters of BCC and identify such clusters in a northern California population. This retrospective study used a BCC registry to determine rates of BCC by census block group, and used spatial scan statistics to identify statistically significant geographic clusters of BCCs, adjusting for age, sex, and socioeconomic status. The study population consisted of white, non-Hispanic members of Kaiser Permanente Northern California during years 2011 and 2012. Statistically significant geographic clusters of BCC as determined by spatial scan statistics. Spatial analysis of 28 408 individuals who received a diagnosis of at least 1 BCC in 2011 or 2012 revealed distinct geographic areas with elevated BCC rates. Among the 14 counties studied, BCC incidence ranged from 661 to 1598 per 100 000 person-years. After adjustment for age, sex, and neighborhood socioeconomic status, a pattern of 5 discrete geographic clusters emerged, with a relative risk ranging from 1.12 (95% CI, 1.03-1.21; P = .006) for a cluster in eastern Sonoma and northern Napa Counties to 1.40 (95% CI, 1.15-1.71; P < .001) for a cluster in east Contra Costa and west San Joaquin Counties, compared with persons residing outside that cluster. In this study of a northern California population, we identified several geographic clusters with modestly elevated incidence of BCC. Knowledge of geographic clusters can help inform future research on the underlying etiology of the clustering including factors related to the environment, health care access, or other characteristics of the resident population, and can help target screening efforts to areas of highest yield.
Molsberry, Samantha A; Cheng, Yu; Kingsley, Lawrence; Jacobson, Lisa; Levine, Andrew J; Martin, Eileen; Miller, Eric N; Munro, Cynthia A; Ragin, Ann; Sacktor, Ned; Becker, James T
2018-05-11
Mild forms of HIV-associated neurocognitive disorder (HAND) remain prevalent in the combination anti-retroviral therapy (cART) era. This study's objective was to identify neuropsychological subgroups within the Multicenter AIDS Cohort Study (MACS) based on the participant-based latent structure of cognitive function and to identify factors associated with subgroups. The MACS is a four-site longitudinal study of the natural and treated history of HIV disease among gay and bisexual men. Using neuropsychological domain scores, we applied a cluster variable selection algorithm to identify the optimal subset of domains with cluster information. Latent profile analysis was applied using scores from identified domains. Exploratory and post-hoc analyses were conducted to identify factors associated with cluster membership and the drivers of the observed associations. Cluster variable selection identified all domains as containing cluster information except for Working Memory. A three-profile solution produced the best fit for the data. Profile 1 performed below average on all domains; Profile 2 performed average on executive functioning, motor, and speed and below average on learning and memory; Profile 3 performed at or above average across all domains. Several demographic, cognitive, and social factors were associated with profile membership; these associations were driven by differences between Profile 1 and the other profiles. There is an identifiable pattern of neuropsychological performance among MACS members determined by all domains except Working Memory. Neither HIV nor HIV-related biomarkers were related to cluster membership, consistent with other findings that cognitive performance patterns do not map directly onto HIV serostatus.
Akar, Servet; Solmaz, Dilek; Kasifoglu, Timucin; Bilge, Sule Yasar; Sari, Ismail; Gumus, Zeynep Zehra; Tunca, Mehmet
2016-02-01
The aim of this study was to evaluate whether there are clinical subgroups that may have different prognoses among FMF patients. The cumulative clinical features of a large group of FMF patients [1168 patients, 593 (50.8%) male, mean age 35.3 years (s.d. 12.4)] were studied. To analyse our data and identify groups of FMF patients with similar clinical characteristics, a two-step cluster analysis using log-likelihood distance measures was performed. For clustering the FMF patients, we evaluated the following variables: gender, current age, age at symptom onset, age at diagnosis, presence of major clinical features, variables related with therapy and family history for FMF, renal failure and carriage of M694V. Three distinct groups of FMF patients were identified. Cluster 1 was characterized by a high prevalence of arthritis, pleuritis, erysipelas-like erythema (ELE) and febrile myalgia. The dosage of colchicine and the frequency of amyloidosis were lower in cluster 1. Patients in cluster 2 had an earlier age of disease onset and diagnosis. M694V carriage and amyloidosis prevalence were the highest in cluster 2. This group of patients was using the highest dose of colchicine. Patients in cluster 3 had the lowest prevalence of arthritis, ELE and febrile myalgia. The frequencies of M694V carriage and amyloidosis were lower in cluster 3 than the overall FMF patients. Non-response to colchicine was also slightly lower in cluster 3. Patients with FMF can be clustered into distinct patterns of clinical and genetic manifestations and these patterns may have different prognostic significance. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Marques, Elisa A; Pizarro, Andreia N; Figueiredo, Pedro; Mota, Jorge; Santos, Maria P
2013-06-01
To analyze how modifiable health-related variables are clustered and associated with children's participation in play, active travel and structured exercise and sport among boys and girls. Data were collected from 9 middle-schools in Porto (Portugal) area. A total of 636 children in the 6th grade (340 girls and 296 boys) with a mean age of 11.64 years old participated in the study. Cluster analyses were used to identify patterns of lifestyle and healthy/unhealthy behaviors. Multinomial logistic regression analysis was used to estimate associations between cluster allocation, sedentary time and participation in three different physical activity (PA) contexts: play, active travel, and structured exercise/sport. Four distinct clusters were identified based on four lifestyle risk factors. The most disadvantaged cluster was characterized by high body mass index, low high-density lipoprotein cholesterol and cardiorespiratory fitness and a moderate level of moderate to vigorous PA. Everyday outdoor play (OR=1.85, 95%CI 0.318-0.915) and structured exercise/sport (OR=1.85, 95%CI 0.291-0.990) were associated with healthier lifestyle patterns. There were no significant associations between health patterns and sedentary time or travel mode. Outdoor play and sport/exercise participation seem more important than active travel from school in influencing children's healthy cluster profiles. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila
2011-02-01
Introduction: Methylprednisolone (MP) is frequently administered preoperatively in children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgery with and without MP administration. Here, in a retrospective study, we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two age-matched groups with surgical correction of septum defects. The group without MP treatment consisted of 10 patients; MP was administered to 21 patients (median dose: 11 mg/kg) before cardiopulmonary bypass (CPB). EDTA-anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at CPB begin and end (CPB2), 4 h, 24 h and 48 h after surgery, at discharge and at out-patient follow-up (8.2; 3.3-12.2 months after surgery; median and IQR). Flow cytometry showed the biggest MP-relevant changes at CPB2 and 4 h postoperatively. These time points were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from the analysis. Cross-validation revealed several parameters able to discriminate between MP groups and to identify immune modulation. MP administration resulted in a delayed activation of monocytes, an increased ratio of neutrophils and reduced T-lymphocyte counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.
Identifying and Assessing Interesting Subgroups in a Heterogeneous Population
Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi
2015-01-01
Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability—the basis of cluster generation—is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided. PMID:26339613
Busch, Vincent; Van Stel, Henk F; Schrijvers, Augustinus J P; de Leeuw, Johannes R J
2013-12-04
Recent studies show that several health-related behaviors cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were derived via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Four distinct behavioral patterns describe the analyzed individual behaviors: 1- risk-prone behavior, 2- bully behavior, 3- problematic screen time use, and 4- sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health.
These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning.
2013-01-01
Background Recent studies show that several health-related behaviors cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Methods Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were derived via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Results Four distinct behavioral patterns describe the analyzed individual behaviors: 1- risk-prone behavior, 2- bully behavior, 3- problematic screen time use, and 4- sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. Conclusions The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors.
In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning. PMID:24305509
Hot spot analysis applied to identify ecosystem services potential in Lithuania
NASA Astrophysics Data System (ADS)
Pereira, Paulo; Depellegrin, Daniel; Misiune, Ieva
2016-04-01
Hot spot analysis is very useful for identifying areas with similar characteristics. This is important for sustainable use of the territory, since it allows areas that need to be protected or restored to be identified. This is a great advantage for land use planning and management, since resources can be allocated, economic costs reduced and interventions in the landscape improved. Ecosystem services (ES) differ according to land use. Since the landscape is very heterogeneous, it is of major importance to understand the spatial pattern of ES and to locate the areas that provide more ES and those that provide fewer. The objective of this work is to use hot spot analysis to identify areas with the most valuable ES in Lithuania. The CORINE land cover (CLC) of 2006 was used as the main spatial information. This classification uses a grid of 100 m resolution, from which a total of 31 land use types were extracted. ES ranking was carried out based on expert knowledge. Experts were asked to evaluate the ES potential of each CLC class from 0 (no potential) to 5 (very high potential). Hot spots were evaluated using the Getis-Ord test, a cluster analysis tool available in the ArcGIS toolbox. This tool identifies areas with significantly low values and significantly high values at a p level of 0.05. In this work we used hot spot analysis to assess the distribution of providing, regulating, cultural and total (sum of the previous 3) ES. The Z value calculated from the Getis-Ord test was used in the statistical analysis to assess the clusters of providing, regulating, cultural and total ES. ES with a high Z value have a high number of cluster areas with high ES potential. The results showed that the Z score was significantly different among services (Kruskal-Wallis ANOVA = 834.607, p<0.001). The Z score of providing services (0.096±2.239) was significantly higher than that of total (0.093±2.045), cultural (0.080±1.979) and regulating (0.076±1.961) services. These results suggest that providing services are more clustered than the others. The Z scores of the ecosystem services were significantly correlated: regulating vs total (0.98, p<0.0001), regulating vs cultural (0.97, p<0.0001), cultural vs total (0.96, p<0.0001), providing vs total (0.69, p<0.0001), regulating vs providing (0.56, p<0.0001) and providing vs cultural (0.56, p<0.0001). According to these results, the distribution of ES potential showed a similar pattern, especially for regulating, cultural and total services. This is evidence that the areas with significantly high and low regulating and cultural ES clusters are similar. The spatial distribution of these clusters is very high, which may be attributed to landscape diversity and fragmentation.
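To make the Z value discussed above concrete, the sketch below computes the Getis-Ord Gi* statistic for each cell of a small grid. It is an illustration only: the study used the ArcGIS tool, whereas this version assumes binary distance-band weights, and the 5 x 5 grid, the scores and the function name are hypothetical.

```python
import math


def getis_ord_gi_star(values, coords, radius):
    """Return the Gi* z-score for every location.

    Binary weights are assumed: w_ij = 1 when location j lies within
    `radius` of location i (i itself included), otherwise 0.
    Assumes the values are not all identical and that no neighbourhood
    covers the whole study area (either case makes the denominator zero).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for xi, yi in coords:
        neighbours = [
            j for j, (xj, yj) in enumerate(coords)
            if math.hypot(xi - xj, yi - yj) <= radius
        ]
        w_sum = len(neighbours)      # sum of w_ij
        w_sq_sum = len(neighbours)   # sum of w_ij squared (same for 0/1 weights)
        local_sum = sum(values[j] for j in neighbours)
        denom = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denom)
    return scores


# Hypothetical 5 x 5 grid of ES potential scores (0 to 5):
# a block of high-potential cells in one corner, zero elsewhere.
coords = [(x, y) for x in range(5) for y in range(5)]
values = [5 if x <= 1 and y <= 1 else 0 for x, y in coords]
z = getis_ord_gi_star(values, coords, radius=1.0)
hot_spots = [c for c, score in zip(coords, z) if score > 1.96]
```

Cells whose z-score exceeds 1.96 would be flagged as hot spots at the 0.05 level used in the abstract, and cells below -1.96 as cold spots.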
Comparison of organs' shapes with geometric and Zernike 3D moments.
Broggio, D; Moignier, A; Ben Brahim, K; Gardumi, A; Grandgirard, N; Pierrat, N; Chea, M; Derreumaux, S; Desbrée, A; Boisserie, G; Aubert, B; Mazeron, J-J; Franck, D
2013-09-01
The morphological similarity of organs is studied with feature vectors based on geometric and Zernike 3D moments. It is particularly investigated if outliers and average models can be identified. For this purpose, the relative proximity to the mean feature vector is defined, principal coordinate and clustering analyses are also performed. To study the consistency and usefulness of this approach, 17 livers and 76 hearts voxel models from several sources are considered. In the liver case, models with similar morphological feature are identified. For the limited amount of studied cases, the liver of the ICRP male voxel model is identified as a better surrogate than the female one. For hearts, the clustering analysis shows that three heart shapes represent about 80% of the morphological variations. The relative proximity and clustering analysis rather consistently identify outliers and average models. For the two cases, identification of outliers and surrogate of average models is rather robust. However, deeper classification of morphological feature is subject to caution and can only be performed after cross analysis of at least two kinds of feature vectors. Finally, the Zernike moments contain all the information needed to re-construct the studied objects and thus appear as a promising tool to derive statistical organ shapes. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
2011-01-01
Background Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. Methods We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity) we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. Results More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7). The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified. 
Conclusions The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients should be investigated by future studies. Our findings have particular relevance for falls prevention strategies, clinical practice and planning of follow-up services for these patients. PMID:21851627
Cluster Analysis on Longitudinal Data of Patients with Adult-Onset Asthma.
Ilmarinen, Pinja; Tuomisto, Leena E; Niemelä, Onni; Tommola, Minna; Haanpää, Jussi; Kankaanranta, Hannu
Previous cluster analyses on asthma are based on cross-sectional data. To identify phenotypes of adult-onset asthma by using data from baseline (diagnostic) and 12-year follow-up visits. The Seinäjoki Adult Asthma Study is a 12-year follow-up study of patients with new-onset adult asthma. K-means cluster analysis was performed by using variables from baseline and follow-up visits on 171 patients to identify phenotypes. Five clusters were identified. Patients in cluster 1 (n = 38) were predominantly nonatopic males with moderate smoking history at baseline. At follow-up, 40% of these patients had developed persistent obstruction but the number of patients with uncontrolled asthma (5%) and rhinitis (10%) was the lowest. Cluster 2 (n = 19) was characterized by older men with heavy smoking history, poor lung function, and persistent obstruction at baseline. At follow-up, these patients were mostly uncontrolled (84%) despite daily use of inhaled corticosteroid (ICS) with add-on therapy. Cluster 3 (n = 50) consisted mostly of nonsmoking females with good lung function at diagnosis/follow-up and well-controlled/partially controlled asthma at follow-up. Cluster 4 (n = 25) had obese and symptomatic patients at baseline/follow-up. At follow-up, these patients had several comorbidities (40% psychiatric disease) and were treated daily with ICS and add-on therapy. Patients in cluster 5 (n = 39) were mostly atopic and had the earliest onset of asthma, the highest blood eosinophils, and FEV1 reversibility at diagnosis. At follow-up, these patients used the lowest ICS dose but 56% were well controlled. Results can be used to predict outcomes of patients with adult-onset asthma and to aid in development of personalized therapy (NCT02733016 at ClinicalTrials.gov). Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Vu, Trang; Finch, Caroline F; Day, Lesley
2011-08-18
Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity) we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7). The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified. The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients should be investigated by future studies. 
Our findings have particular relevance for falls prevention strategies, clinical practice and planning of follow-up services for these patients.
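The agglomerative clustering of comorbidity variables described in this record can be sketched as follows. This is a hedged illustration: the abstract does not state the distance measure or linkage rule, so Jaccard distance and average linkage are assumptions, and the condition names and patient identifiers are hypothetical.

```python
def jaccard_distance(a, b):
    """1 minus the share of patients with both conditions among those with either."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def agglomerate(items, n_clusters):
    """Merge variables bottom-up until n_clusters groups remain.

    items -- dict mapping a condition name to the set of patient ids with it
    """
    clusters = [[name] for name in items]

    def linkage(c1, c2):
        # Average linkage: mean pairwise distance between the two groups.
        total = sum(jaccard_distance(items[a], items[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Hypothetical data: which patients carry which comorbid condition.
conditions = {
    "hypertension": {1, 2, 3, 4},
    "diabetes": {1, 2, 3, 5},
    "copd": {2, 3, 4},
    "dementia": {6, 7, 8},
    "parkinson": {6, 7, 9},
}
clusters = agglomerate(conditions, n_clusters=2)
```

Clustering the variables rather than the patients, as done here, yields constellations of conditions that tend to occur together, which matches the way the abstract reports its five comorbidity clusters.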
Medema, Marnix H; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A; Weber, Tilmann; Takano, Eriko; Breitling, Rainer
2011-07-01
Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org.
Zhang, Xiaohua Douglas; Yang, Xiting Cindy; Chung, Namjin; Gates, Adam; Stec, Erica; Kunapuli, Priya; Holder, Dan J; Ferrer, Marc; Espeseth, Amy S
2006-04-01
RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. In order to use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method which is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean +/- k standard deviation (SD) and median +/- 3 median of absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean +/- k SD under the same preset error rate. The number of hits selected by median +/- k MAD was close to that by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median +/- k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits are observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. 
If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.
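The three hit-selection rules compared in this record can be placed side by side in a short sketch. The abstract does not give the formula for the quartile-based method, so Tukey-style fences (Q1 - c·IQR, Q3 + c·IQR) are assumed here; the constants `k` and `c`, the function names and the data are all hypothetical.

```python
import statistics


def hits_mean_sd(data, k=3.0):
    """Select values outside mean +/- k * SD."""
    centre, spread = statistics.mean(data), statistics.stdev(data)
    return [x for x in data if abs(x - centre) > k * spread]


def hits_median_mad(data, k=3.0):
    """Select values outside median +/- k * MAD (unscaled MAD)."""
    centre = statistics.median(data)
    mad = statistics.median(abs(x - centre) for x in data)
    return [x for x in data if abs(x - centre) > k * mad]


def hits_quartile(data, c=1.5):
    """Select values outside [Q1 - c * IQR, Q3 + c * IQR] (assumed form)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return [x for x in data if x < q1 - c * iqr or x > q3 + c * iqr]


# Hypothetical normalized readouts for one plate: mostly noise around zero,
# plus one moderate and one strong effect.
data = [-2, -1, -1, 0, 0, 0, 1, 1, 2, 12, 40]
sd_hits = hits_mean_sd(data)
mad_hits = hits_median_mad(data)
quartile_hits = hits_quartile(data)
```

In this example the strong value inflates the mean and SD enough to mask both effects, while the two robust rules are unaffected by it. This mirrors the abstract's observation that the quartile-based method is robust to outliers and true hits.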
Pellegrini, Michael; Zoghi, Maryam; Jaberzadeh, Shapour
2018-01-12
Cluster analysis and other subgrouping techniques have risen in popularity in recent years in non-invasive brain stimulation research in the attempt to investigate the issue of inter-individual variability - the issue of why some individuals respond, as traditionally expected, to non-invasive brain stimulation protocols and others do not. Cluster analysis and subgrouping techniques have been used to categorise individuals, based on their response patterns, as responder or non-responders. There is, however, a lack of consensus and consistency on the most appropriate technique to use. This systematic review aimed to provide a systematic summary of the cluster analysis and subgrouping techniques used to date and suggest recommendations moving forward. Twenty studies were included that utilised subgrouping techniques, while seven of these additionally utilised cluster analysis techniques. The results of this systematic review appear to indicate that statistical cluster analysis techniques are effective in identifying subgroups of individuals based on response patterns to non-invasive brain stimulation. This systematic review also reports a lack of consensus amongst researchers on the most effective subgrouping technique and the criteria used to determine whether an individual is categorised as a responder or a non-responder. This systematic review provides a step-by-step guide to carrying out statistical cluster analyses and subgrouping techniques to provide a framework for analysis when developing further insights into the contributing factors of inter-individual variability in response to non-invasive brain stimulation.
Liao, Minlei; Li, Yunfeng; Kianifard, Farid; Obi, Engels; Arcona, Stephen
2016-03-02
Cluster analysis (CA) is a frequently used applied statistical technique that helps to reveal hidden structures and "clusters" found in large data sets. However, this method has not been widely used in large healthcare claims databases where the distribution of expenditure data is commonly severely skewed. The purpose of this study was to identify cost change patterns of patients with end-stage renal disease (ESRD) who initiated hemodialysis (HD) by applying different clustering methods. A retrospective, cross-sectional, observational study was conducted using the Truven Health MarketScan® Research Databases. Patients aged ≥18 years with ≥2 ESRD diagnoses who initiated HD between 2008 and 2010 were included. The K-means CA method and hierarchical CA with various linkage methods were applied to all-cause costs within baseline (12-months pre-HD) and follow-up periods (12-months post-HD) to identify clusters. Demographic, clinical, and cost information was extracted from both periods, and then examined by cluster. A total of 18,380 patients were identified. Meaningful all-cause cost clusters were generated using K-means CA and hierarchical CA with either flexible beta or Ward's methods. Based on cluster sample sizes and change of cost patterns, the K-means CA method and 4 clusters were selected: Cluster 1: Average to High (n = 113); Cluster 2: Very High to High (n = 89); Cluster 3: Average to Average (n = 16,624); or Cluster 4: Increasing Costs, High at Both Points (n = 1554). Median cost changes in the 12-month pre-HD and post-HD periods increased from $185,070 to $884,605 for Cluster 1 (Average to High), decreased from $910,930 to $157,997 for Cluster 2 (Very High to High), were relatively stable and remained low from $15,168 to $13,026 for Cluster 3 (Average to Average), and increased from $57,909 to $193,140 for Cluster 4 (Increasing Costs, High at Both Points). 
Relatively stable costs after starting HD were associated with more stable comorbidity index scores across the pre- and post-HD periods, while increasing costs were associated with more sharply increasing comorbidity scores. The K-means CA method appeared to be the most appropriate in healthcare claims data with highly skewed cost information when taking into account both change of cost patterns and sample size in the smallest cluster.
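As a rough illustration of K-means applied to paired pre- and post-period costs, the sketch below runs a plain Lloyd's algorithm on a handful of made-up cost pairs. The `kmeans` and `sq_dist` helpers, the toy figures, the deterministic seeding, and the log10 transform are all assumptions introduced here for illustration; the abstract does not say how the study scaled or seeded its data.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic seeding (hypothetical helper)."""
    ordered = sorted(points)
    n = len(ordered)
    # Seed centroids at evenly spaced positions of the sorted data so that
    # repeated runs give the same answer.
    centroids = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Made-up (pre-HD, post-HD) annual costs in dollars, for illustration only.
costs = [
    (15000, 13000), (12000, 14000), (18000, 16000), (14000, 12000),
    (180000, 900000), (200000, 850000),
    (900000, 160000), (950000, 150000),
]
# Assumption: log-transform so the few very large costs do not dominate
# the distance calculation.
log_costs = [(math.log10(pre), math.log10(post)) for pre, post in costs]
labels, centroids = kmeans(log_costs, k=3)
print(labels)
```

With skewed expenditure data, the choice of scale largely decides which patients end up together, so the transform is worth treating as a modelling decision rather than a preprocessing detail.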
Parental Influences on Adolescent Adjustment: Parenting Styles Versus Parenting Practices
ERIC Educational Resources Information Center
Lee, Sang Min; Daniels, M. Harry; Kissinger, Daniel B.
2006-01-01
The study identified distinct patterns of parental practices that differentially influence adolescent behavior using the National Educational Longitudinal Survey (NELS:88) database. Following Brenner and Fox's research model (1999), the cluster analysis was used to classify the four types of parental practices. The clusters of parenting practices…
A Cluster Analysis of Support Networks of Older People with Severe Intellectual Impairment.
ERIC Educational Resources Information Center
Moss, Steve; Hogg, James
1989-01-01
The report describes a demographic survey of 122 people with severe intellectual impairment over 50 years of age in a British community. Seven relatively independent clusters were identified: contact with relatives, community activities, community independence, friendship, service pathways, geriatrics/mobility, and physical treatment. (Author/DB)
USDA-ARS's Scientific Manuscript database
Risk factors for obesity and weight gain are typically evaluated individually while "adjusting for" the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their cluster...
Davison, K Krahnstoever; Birch, L Lipps
2008-01-01
OBJECTIVE To determine whether obesigenic families can be identified based on mothers’ and fathers’ dietary and activity patterns. METHODS A total of 197 girls and their parents were assessed when girls were 5 y old; 192 families were reassessed when girls were 7 y old. Measures of parents’ physical activity and dietary intake were obtained and entered into a cluster analysis to assess whether distinct family clusters could be identified. Girls’ skinfold thickness and body mass index (BMI) were also assessed and were used to examine the predictive validity of the clusters. RESULTS An obesigenic and a non-obesigenic family cluster were identified. Mothers and fathers in the obesigenic cluster reported high levels of dietary intake and low levels of physical activity, while mothers and fathers in the non-obesigenic cluster reported low levels of dietary intake and high levels of activity. Girls from families in the obesigenic cluster had significantly higher BMI and skinfold thickness values at age 7 and showed significantly greater increases in BMI and skinfold thickness from ages 5 to 7 y than girls from non-obesigenic families; differences were reduced but not eliminated after controlling for parents’ BMI. CONCLUSIONS Obesigenic families, defined in terms of parents’ activity and dietary patterns, can be used to predict children’s risk of obesity. PMID:12187395
Microstructure and tuber properties of potato varieties with different genetic profiles.
Romano, Annalisa; Masi, Paolo; Aversano, Riccardo; Carucci, Francesca; Palomba, Sara; Carputo, Domenico
2018-01-15
The objectives of this research were to study tuber starch characteristics and chemical - thermal properties of 21 potato varieties, and to determine their genetic diversity through SSR markers. Starch granular size varied among samples, with a wide diameter distribution (5-85μm), while granule shapes were similar. Differential Scanning Calorimeter analysis showed that the transition temperatures (69°C-74°C) and enthalpies of gelatinization (0.9J/g-3.8J/g) of tubers were also variety dependent. SSR analysis allowed the detection of 157 alleles across all varieties, with an average value of 6.8 alleles per locus. Variety-specific alleles were also identified. SSR-based cluster analysis revealed that varieties with interesting quality attributes were distributed among all clusters and sub-clusters, suggesting that the genetic basis of traits analyzed may differ among our varieties. The information obtained in this study may be useful to identify and develop varieties with slowly digestible starch. Copyright © 2017 Elsevier Ltd. All rights reserved.
Wang, Z; Wang, W H; Wang, S L; Jin, J; Song, Y W; Liu, Y P; Ren, H; Fang, H; Tang, Y; Chen, B; Qi, S N; Lu, N N; Li, N; Tang, Y; Liu, X F; Yu, Z H; Li, Y X
2016-06-23
To find phenotypic subgroups of patients with pT1-2N0 invasive breast cancer by means of cluster analysis and estimate the prognosis and clinicopathological features of these subgroups. From 1999 to 2013, 4979 patients with pT1-2N0 invasive breast cancer were recruited for hierarchical clustering analysis. Age (≤40, 41-70, 70+ years), size of primary tumor, pathological type, grade of differentiation, microvascular invasion, estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER-2) were used to define the distance metric between patients. Hierarchical cluster analysis was performed using Ward's method. Cophenetic correlation coefficient (CPCC) and Spearman correlation coefficient were used to validate clustering structures. The CPCC was 0.603. The Spearman correlation coefficient was 0.617 (P<0.001), which indicated a good fit of hierarchy to the data. A twelve-cluster model seemed to best illustrate our patient cohort. Patients in clusters 5, 9 and 12 had the best prognosis and were characterized by age >40 years, smaller primary tumor, lower histologic grade, positive ER and PR status, and mainly negative HER-2. Patients in clusters 1 and 11 had the worst prognosis. Cluster 1 was characterized by a larger tumor, higher grade and negative ER and PR status, while cluster 11 was characterized by positive microvascular invasion. Patients in the other 7 clusters had a moderate prognosis, and patients in each cluster had distinctive clinicopathological features and recurrence patterns. This study identified distinctive clinicopathologic phenotypes in a large cohort of patients with pT1-2N0 breast cancer through hierarchical clustering and revealed differing prognoses. This integrative model may help physicians to make more personalized decisions regarding adjuvant therapy.
Dong, Wen; Yang, Kun; Xu, Quanli; Liu, Lin; Chen, Juan
2017-10-24
A large number (n = 460) of A(H7N9) human infections were reported in China from March 2013 through December 2014, and H7N9 outbreaks in humans became an emerging public health issue for China; the virus has caused numerous disease outbreaks in domestic poultry and wild bird populations and has severely threatened human health. The aims of this study were to investigate the directional trend of the epidemic and to identify the significant presence of spatial-temporal clustering of influenza A(H7N9) human cases between March 2013 and December 2014. Three distinct epidemic phases of A(H7N9) human infections were identified in this study. In each phase, standard deviational ellipse analysis was conducted to examine the directional trend of disease spreading, and a retrospective space-time permutation scan statistic was then used to identify the spatio-temporal cluster patterns of H7N9 outbreaks in humans. The changing location and increasing size of the three identified standard deviational ellipses showed that the epidemic moved from the east to the southeast coast, and thence to some central regions, with a likely future trend of continued dispersal to more central regions of China; a few new human cases might also appear in parts of western China. Furthermore, A(H7N9) human infections clustered in space and time in the first two phases, with five significant spatio-temporal clusters (p < 0.05), but no significant cluster was identified in phase III. A new epidemiologic pattern emerged in which the decrease in significant spatio-temporal clusters of A(H7N9) human infections was accompanied by an obvious spatial expansion of the outbreaks during the study period. Identification of the spatio-temporal patterns of the epidemic can provide valuable insights for better understanding the spreading dynamics of the disease in China.
Identification of Clinical Phenotypes in Idiopathic Interstitial Pneumonia with Pulmonary Emphysema.
Sato, Suguru; Tanino, Yoshinori; Misa, Kenichi; Fukuhara, Naoko; Nikaido, Takefumi; Uematsu, Manabu; Fukuhara, Atsuro; Wang, Xintao; Ishida, Takashi; Munakata, Mitsuru
2016-01-01
Objective Since the term "combined pulmonary fibrosis and emphysema" (CPFE) was first proposed, the co-existence of pulmonary fibrosis and pulmonary emphysema (PE) has drawn considerable attention. However, conflicting results on the clinical characteristics of patients with both pulmonary fibrosis and PE have been published because of the lack of an exact definition of CPFE. The goal of this study was thus to clarify the clinical characteristics and phenotypes of idiopathic interstitial pneumonia (IIP) with PE. Methods We retrospectively analyzed IIP patients who had been admitted to our hospital. Their chest high-resolution computed tomography images were classified into two groups according to the presence of PE. We then performed a cluster analysis to identify the phenotypes of IIP patients with PE. Results Forty-four (53.7%) out of 82 patients had at least mild emphysema in their bilateral lungs. The cluster analysis separated the IIP patients with PE into three clusters. The overall survival rate of one cluster that consisted of mainly idiopathic pulmonary fibrosis (IPF) patients was significantly worse than those of the other clusters. Conclusion Three different phenotypes can be identified in IIP patients with PE, and IPF with PE is a distinct clinical phenotype with a poor prognosis.
Multidimensional analysis of peak pain symptoms and experiences.
Kinsman, R; Dirks, J F; Wunder, J; Carbaugh, R; Stieg, R
1989-01-01
Peak pain symptoms and experiences were explored within a group of 243 intractable pain patients seen consecutively at a pain clinic. Using a 5-point scale, patients rated the frequency with which 99 symptom adjectives occurred when their pain was at its worst. Key cluster analysis identified 11 reliable, conceptually clear symptom clusters: Four affective symptom categories, Angry Depression, Diminished Drive, Intropunitive Depression and Anxiety, describing emotional states concomitant with peak pain; two somatic symptom categories, Ecto-Pain and Endo-Pain, describing surface and deep bodily pain, respectively; and five additional symptom categories including Cognitive Dysfunction, Sleep Disturbance, Fatigue, Withdrawal and Disequilibrium. Among the affective symptom clusters, symptoms of Angry Depression were reported to occur frequently by 32% of the patients while only 11% reported the frequent occurrence of Intropunitive Depression. For the somatic symptom clusters, 25 and 52% reported the frequent occurrence of Ecto-Pain and Endo-Pain, respectively. Pain reports measured by Ecto-Pain and Endo-Pain were nearly independent of all other symptom categories. The results suggest that the experiential context of pain differs widely among intractable pain patients. The study derived a Pain Symptom Checklist to measure each symptom cluster as one way to identify coping styles among chronic pain patients.
Peleg, Mor; Asbeh, Nuaman; Kuflik, Tsvi; Schertz, Mitchell
2009-02-01
Children with developmental disorders usually exhibit multiple developmental problems (comorbidities). Hence, such diagnosis needs to revolve around developmental disorder groups. Our objective is to systematically identify developmental disorder groups and represent them in an ontology. We developed a methodology that combines two methods: (1) a literature-based ontology that we created, which represents developmental disorders and potential developmental disorder groups, and (2) clustering for detecting comorbid developmental disorders in patient data. The ontology is used to interpret and improve clustering results and the clustering results are used to validate the ontology and suggest directions for its development. We evaluated our methodology by applying it to data of 1175 patients from a child development clinic. We demonstrated that the ontology improves clustering results, bringing them closer to an expert-generated gold standard. We have shown that our methodology successfully combines an ontology with a clustering method to support systematic identification and representation of developmental disorder groups.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward
There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases requires the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend its time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.
Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of the complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
Selemetas, Nikolaos; de Waal, Theo
2015-04-30
Fasciolosis caused by Fasciola hepatica (liver fluke) can cause significant economic and production losses in dairy cow farms. The aim of the current study was to identify important weather and environmental predictors of the exposure risk to liver fluke by detecting clusters of fasciolosis in Ireland. During autumn 2012, bulk-tank milk samples from 4365 dairy farms were collected throughout Ireland. Using an in-house antibody-detection ELISA, the analysis of BTM samples showed that 83% (n=3602) of dairy farms had been exposed to liver fluke. The Getis-Ord Gi* statistic identified 74 high-risk and 130 low-risk significant (P<0.01) clusters of fasciolosis. The low-risk clusters were mostly located in the southern regions of Ireland, whereas the high-risk clusters were mainly situated in the western part. Several climatic variables (monthly and seasonal mean rainfall and temperatures, total wet days and rain days) and environmental datasets (soil types, enhanced vegetation index and normalised difference vegetation index) were used to investigate dissimilarities in the exposure to liver fluke between clusters. Rainfall, total wet days and rain days, and soil type were the significant classes of climatic and environmental variables explaining the differences between significant clusters. A discriminant function analysis was used to predict the exposure risk to liver fluke using 80% of data for modelling and the remaining subset of 20% for post hoc model validation. The most significant predictors of the model risk function were total rainfall in August and September and total wet days. The risk model presented 100% sensitivity and 91% specificity and an accuracy of 95% correctly classified cases. A risk map of exposure to liver fluke was constructed with higher probability of exposure in western and north-western regions. 
The results of this study identified differences between clusters of fasciolosis in Ireland regarding climatic and environmental variables and detected significant predictors of the exposure risk to liver fluke. Copyright © 2015 Elsevier B.V. All rights reserved.
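The sensitivity, specificity and accuracy figures quoted above all derive from a confusion matrix on the hold-out subset. A minimal sketch of that arithmetic follows; the counts are hypothetical, chosen only so the percentages land near those reported, and are not taken from the study. The function name `classification_metrics` is likewise an assumption made here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of truly high-risk cases flagged high-risk
    specificity = tn / (tn + fp)   # share of truly low-risk cases flagged low-risk
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical hold-out counts, chosen only to illustrate the arithmetic.
sens, spec, acc = classification_metrics(tp=20, fn=0, tn=20, fp=2)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

A model can reach perfect sensitivity while still misclassifying some low-risk cases, which is why the three figures are usually reported together.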
Discovery of a large-scale clumpy structure of the Lynx supercluster at z∼1.27
NASA Astrophysics Data System (ADS)
Nakata, Fumiaki; Kodama, Tadayuki; Shimasaku, Kazuhiro; Doi, Mamoru; Furusawa, Hisanori; Hamabe, Masaru; Kimura, Masahiko; Komiyama, Yutaka; Miyazaki, Satoshi; Okamura, Sadanori; Ouchi, Masami; Sekiguchi, Maki; Yagi, Masafumi; Yasuda, Naoki
2004-07-01
We report the discovery of a probable large-scale structure composed of many galaxy clumps around the known twin clusters at z=1.26 and z=1.27 in the Lynx region. Our analysis is based on deep, panoramic, and multi-colour imaging with the Suprime-Cam on the 8.2 m Subaru telescope. We apply a photometric redshift technique to extract plausible cluster members at z∼1.27 down to ∼M*+2.5. From the 2-D distribution of these photometrically selected galaxies, we newly identify seven candidates of galaxy groups or clusters where the surface density of red galaxies is significantly high (>5σ), in addition to the two known clusters, comprising the largest most distant supercluster ever identified.
Koshioka, Masaji; Umegaki, Naoko; Boontiang, Kriangsuk; Pornchuti, Witayaporn; Thammasiri, Kanchit; Yamaguchi, Satoshi; Tatsuzawa, Fumi; Nakayama, Masayoshi; Tateishi, Akira; Kubota, Satoshi
2015-03-01
Five anthocyanins, delphinidin 3-O-rutinoside, cyanidin 3-O-rutinoside, petunidin 3-O-rutinoside, malvidin 3-O-glucoside and malvidin 3-O-rutinoside, were identified. Three anthocyanins, delphinidin 3-O-glucoside, cyanidin 3-O-glucoside and pelargonidin 3-O-rutinoside, were putatively identified based on C18 HPLC retention time, absorption spectrum, including λmax, and comparisons with those of corresponding standard anthocyanins, as the compounds responsible for the pink to purple-red pigmentation of the bracts of Curcuma alismatifolia and five related species. Cluster analysis based on four major anthocyanins formed two clusters. One consisted of only one species, C. alismatifolia, and the other consisted of five. Each cluster further formed sub-clusters depending on either species or habitats.
Dietary patterns in middle-aged Irish men and women defined by cluster analysis.
Villegas, R; Salim, A; Collins, M M; Flynn, A; Perry, I J
2004-12-01
To identify and characterise dietary patterns in a middle-aged Irish population sample and study associations between these patterns, sociodemographic and anthropometric variables and major risk factors for cardiovascular disease. A cross-sectional study. A group of 1473 men and women were sampled from 17 general practice lists in the South of Ireland. A total of 1018 attended for screening, with a response rate of 69%. Participants completed a detailed health and lifestyle questionnaire and provided a fasting blood sample for glucose, lipids and homocysteine. Dietary intake was assessed using a standard food-frequency questionnaire adapted for use in the Irish population. The food-frequency questionnaire was a modification of that used in the UK arm of the European Prospective Investigation into Cancer study, which was based on that used in the US Nurses' Health Study. Dietary patterns were assessed primarily by K-means cluster analysis, following initial principal components analysis to identify the seeds. Three dietary patterns were identified. These clusters corresponded to a traditional Irish diet, a prudent diet and a diet characterised by high consumption of alcoholic drinks and convenience foods. Cluster 1 (Traditional Diet) had the highest intakes of saturated fat (SFA), monounsaturated fat (MUFA) and percentage of total energy from fat, and the lowest polyunsaturated fat (PUFA) intake and ratio of polyunsaturated to saturated fat (P:S). Cluster 2 (Prudent Diet) was characterised by significantly higher intakes of fibre, PUFA, P:S ratio and antioxidant vitamins (vitamins C and E), and lower intakes of total fat, MUFA, SFA and cholesterol. Cluster 3 (Alcohol & Convenience Foods) had the highest intakes of alcohol, protein, cholesterol, vitamin B(12), vitamin B(6), folate, iron, phosphorus, selenium and zinc, and the lowest intakes of PUFA, vitamin A and antioxidant vitamins (vitamins C and E). 
There were significant differences between clusters in gender distribution, smoking status, physical activity, body mass index, waist circumference and serum homocysteine concentrations. In this general population sample, cluster analysis methods yielded two major dietary patterns: prudent and traditional. The prudent dietary pattern is associated with other health-seeking behaviours. Study of dietary patterns will help elucidate links between diet and disease and contribute to the development of healthy eating guidelines for health promotion.
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.
Three subgroups of pain profiles identified in 227 women with arthritis: a latent class analysis.
de Luca, Katie; Parkinson, Lynne; Downie, Aron; Blyth, Fiona; Byles, Julie
2017-03-01
The objectives were to identify subgroups of women with arthritis based upon the multi-dimensional nature of their pain experience and to compare health and socio-demographic variables between subgroups. A latent class analysis of 227 women with self-reported arthritis was used to identify clusters of women based upon the sensory, affective, and cognitive dimensions of the pain experience. Multivariate multinomial logistic regression analysis was used to determine the relationship between cluster membership and health and sociodemographic characteristics. A three-class cluster model was most parsimonious. 39.5 % of women had a unidimensional pain profile; 38.6 % of women had moderate multidimensional pain profile that included additional pain symptomatology such as sensory qualities and pain catastrophizing; and 21.9 % of women had severe multidimensional pain profile that included prominent pain symptomatology such as sensory and affective qualities of pain, pain catastrophizing, and neuropathic pain. Women with severe multidimensional pain profile have a 30.5 % higher risk of poorer quality of life and a 7.3 % higher risk of suffering depression, and women with moderate multidimensional pain profile have a 6.4 % higher risk of poorer quality of life when compared to women with unidimensional pain. This study identified three distinct subgroups of pain profiles in older women with arthritis. Women had very different experiences of pain, and cluster membership impacted significantly on health-related quality of life. These preliminary findings provide a stronger understanding of profiles of pain and may contribute to the development of tailored treatment options in arthritis.
Romay-Tallon, Raquel; Rivera-Baltanas, Tania; Allen, Josh; Olivares, Jose M; Kalynchuk, Lisa E; Caruncho, Hector J
2017-01-01
The pattern of serotonin transporter clustering on the plasma membrane of lymphocytes extracted from human whole blood samples has been identified as a putative biomarker of therapeutic efficacy in major depression. Here we evaluated the possibility of performing a similar analysis using blood smears obtained from rats, and from control human subjects and depression patients. We hypothesized that we could optimize a protocol to make the analysis of serotonin protein clustering in blood smears comparable to the analysis of serotonin protein clustering using isolated lymphocytes. Our data indicate that blood smears require a longer fixation time and longer times of incubation with primary and secondary antibodies. In addition, one needs to optimize the image analysis settings for the analysis of smears. When these steps are followed, the quantitative analysis of both the number and size of serotonin transporter clusters on the plasma membrane of lymphocytes is similar using both blood smears and isolated lymphocytes. The development of this novel protocol will greatly facilitate the collection of appropriate samples by eliminating the necessity and cost of specialized personnel for drawing blood samples, and by being a less invasive procedure. Therefore, this protocol will help us advance the validation of membrane protein clustering in lymphocytes as a biomarker of therapeutic efficacy in major depression, and bring it closer to its clinical application.
Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang
2016-09-01
With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular multiple meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. Bioconductor was used for preliminary data preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes remained for further analysis. The classification efficiency analysis showed that DE genes identified by the sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluation of DE genes identified by different meta-analysis methods in classification efficiency provided a new perspective on the choice of a suitable method in a given application. Different meta-analysis methods perform differently, so careful overall consideration is needed when choosing meta-analysis methods for particular research. © 2015 John Wiley & Sons Ltd.
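The idea behind a sum-of-ranks meta-analysis can be sketched in a few lines: rank genes by significance within each study, then add the ranks across studies so that genes that are consistently near the top receive the smallest totals. The version below is a simplified illustration and not the exact procedure used in the study; the gene names, p-values, and the `rank_sum` helper are hypothetical, and ties are broken by gene name rather than averaged.

```python
def rank_sum(pvalues_by_study):
    """Sum within-study ranks per gene; a smaller total means more consistent evidence."""
    genes = sorted(pvalues_by_study[0])
    totals = {g: 0 for g in genes}
    for study in pvalues_by_study:
        # Rank 1 goes to the smallest p-value in this study.
        ordered = sorted(genes, key=lambda g: study[g])
        for rank, g in enumerate(ordered, start=1):
            totals[g] += rank
    # Return genes ordered from smallest to largest rank sum.
    return sorted(totals.items(), key=lambda kv: kv[1])

# Hypothetical p-values for three genes in three studies.
studies = [
    {"geneA": 0.001, "geneB": 0.20, "geneC": 0.03},
    {"geneA": 0.004, "geneB": 0.50, "geneC": 0.01},
    {"geneA": 0.002, "geneB": 0.04, "geneC": 0.30},
]
ranking = rank_sum(studies)
print(ranking)
```

Because only ranks are combined, the approach is insensitive to differences in scale between platforms, at the cost of discarding effect-size information.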
Lindsey, Cary R.; Neupane, Ghanashym; Spycher, Nicolas; ...
2018-01-03
Although many Known Geothermal Resource Areas in Oregon and Idaho were identified during the 1970s and 1980s, few were subsequently developed commercially. Because of advances in power plant design and energy conversion efficiency since the 1980s, some previously identified KGRAs may now be economically viable prospects. Unfortunately, available characterization data vary widely in accuracy, precision, and granularity, making assessments problematic. In this paper, we suggest a procedure for comparing test areas against proven resources using Principal Component Analysis and cluster identification. The result is a low-cost tool for evaluating potential exploration targets using uncertain or incomplete data.
Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033
NASA Astrophysics Data System (ADS)
Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.
2018-04-01
Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Their occurrence also provides a test of the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aims. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods. In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works which will analyze this type of system in order to characterize it. Results. Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions. It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.
Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott
2014-07-08
In healthcare facilities, conventional surveillance techniques using rule-based guidelines may result in under- or over-reporting of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks, as these guidelines are generally unvalidated. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting MRSA clusters, validate clusters using molecular techniques and hospital records, and determine significant differences in the rate of MRSA cases using regression models. Patients admitted to a community hospital between August 2006 and February 2011, and identified with MRSA>48 hours following hospital admission, were included in this study. Between March 2010 and February 2011, MRSA specimens were obtained for spa typing. MRSA clusters were investigated using a retrospective temporal scan statistic. Tests were conducted on a monthly scale and significant clusters were compared to MRSA outbreaks identified by hospital personnel. Associations between the rate of MRSA cases and the variables year, month, and season were investigated using a negative binomial regression model. During the study period, 735 MRSA cases were identified and 167 MRSA isolates were spa typed. Nine different spa types were identified with spa type 2/t002 (88.6%) the most prevalent. The temporal scan statistic identified significant MRSA clusters at the hospital (n=2), service (n=16), and ward (n=10) levels (P ≤ 0.05). Seven clusters were concordant with nine MRSA outbreaks identified by hospital staff. For the remaining clusters, seven events may have been equivalent to true outbreaks and six clusters demonstrated possible transmission events. The regression analysis indicated years 2009-2011, compared to 2006, and months March and April, compared to January, were associated with an increase in the rate of MRSA cases (P ≤ 0.05). 
The application of the temporal scan statistic identified several MRSA clusters that were not detected by hospital personnel. The identification of specific years and months with increased MRSA rates may be attributable to several hospital level factors including the presence of other pathogens. Within hospitals, the incorporation of the temporal scan statistic to standard surveillance techniques is a valuable tool for healthcare workers to evaluate surveillance strategies and aid in the identification of MRSA clusters.
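The retrospective temporal scan described in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a Poisson model with a constant expected rate per month, scans every window of consecutive months up to a hypothetical maximum length (`max_len`), and reports the window with the largest log-likelihood ratio. The monthly counts are invented, and the Monte Carlo replication normally used to attach a p-value to the best window is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        # Only windows with an excess of cases count as candidate clusters.
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        # Contribution of the periods outside the window.
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def temporal_scan(counts, max_len):
    """Return (best_llr, start, end) over all windows of 1..max_len periods."""
    total = sum(counts)
    n = len(counts)
    best = (0.0, None, None)
    for start in range(n):
        for end in range(start, min(start + max_len, n)):
            c = sum(counts[start:end + 1])
            # Assumption: cases are expected to be spread evenly over time.
            e = total * (end - start + 1) / n
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, start, end)
    return best

# Invented monthly case counts with an excess in months 4 and 5 (0-based).
monthly_cases = [3, 2, 4, 3, 12, 14, 3, 2, 3, 4, 2, 3]
best_llr, first, last = temporal_scan(monthly_cases, max_len=6)
print(first, last, round(best_llr, 2))
```

In practice the best window would be compared against the same statistic computed on many randomly permuted series before being called a significant cluster.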
González, Antonio; Paoloni, Verónica; Donolo, Danilo; Rinaudo, Cristina
2012-11-01
Previous research has focused on specific forms of self-determined motivation or discrete class-related emotions, but few studies have simultaneously examined both constructs. The aim of this study on 472 undergraduates was twofold: to perform cluster analysis to identify homogeneous groups of motivation in the sample; and to determine the profile of each cluster for emotions and academic achievement. Cluster analysis configured four groups in terms of motivation: controlled, autonomous, both high, and both low. Each cluster revealed a distinct emotional profile, autonomous motivation being the most adaptable with high scores for academic achievement and pleasant emotions and low values for unpleasant emotions. The results are discussed in the light of their implications for academic adjustment.
Marshman, Z; Broomhead, T; Rodd, H D; Jones, K; Burke, D; Baker, S R
2016-09-28
Emergency departments (EDs) have been identified as key providers of dental care although few studies have examined patterns of attendance or clusters of characteristics. The aim was to identify the reasons for visits to an ED, whether these remained stable over time, and to characterize clusters of patients by socio-demographic and attendance variables. Pseudonymized data were obtained for children who attended the ED in 2003-2004, 2004-2005 and 2012-2013. Presenting complaint was categorized as attending for dental or nondental reasons. Other variables analysed included patient (age, sex, ethnicity and deprivation) and attendance characteristics (distance travelled, season, nature of complaint, time elapsed since onset of symptoms, day of week and hours of attendance), together with treatment outcome (advice, antibiotics and referral). To assess trends over time, analyses were conducted on patient, attendance and treatment outcome variables. To examine whether patients could be characterized by socio-demographic and attendance variables, a two-step cluster analysis was undertaken on the 2003-2004 data set and validated on the 2004-2005 and 2012-2013 data sets. In 2003-2004, 550 children attended the ED for dental reasons, rising to 687 in 2012-2013. The most important predictors of dental attendance were as follows: nature of complaint, ethnicity, time elapsed, sex and deprivation of the area in which children lived. The analysis showed two clusters. Cluster 1 comprised children who attended the ED for dental injury, were of White ethnicity and attended within 24 h of onset of symptoms. Children in this cluster were likely to be from the least or less deprived areas (compared to Cluster 2) and were more likely to be males. Cluster 2 comprised children who attended the ED for caries, oral mucosal lesions or other complaints, were likely to be of other (non-White) ethnicities and were likely to attend more than 24 h after symptoms began.
Children in this cluster were more likely to come from the most deprived areas and were both males and females. The clusters varied according to treatment outcome; those patients in Cluster 2 were more likely to be prescribed medication, whilst those children in Cluster 1 were more likely to be referred to another specialty. A significant number of visits to the ED were for dental reasons with two clusters of children. The results have identified groups of patients for whom appropriate dental provision is lacking and where targeted services are needed to improve outcomes for children and reduce the burden on EDs. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Zhang, Y J; Zhou, D H; Bai, Z P; Xue, F X
2018-02-10
Objective: To quantitatively analyze the current status and development trends of land use regression (LUR) models in ambient air pollution studies. Methods: Relevant literature from the PubMed database before June 30, 2017 was analyzed, using the Bibliographic Items Co-occurrence Matrix Builder (BICOMB 2.0). Keyword co-occurrence networks, cluster mapping and timeline mapping were generated, using the CiteSpace 5.1.R5 software. Relevant literature identified in three Chinese databases was also reviewed. Results: Four hundred sixty-four relevant papers were retrieved from the PubMed database. The number of papers published showed an annual increase, in line with the growing trend of the index. Most papers were published in the journal Environmental Health Perspectives. Results from the co-word cluster analysis identified five clusters: cluster #0 consisted of birth cohort studies related to the health effects of prenatal exposure to air pollution; cluster #1 referred to land use regression modeling and exposure assessment; cluster #2 was related to the epidemiology of traffic exposure; cluster #3 dealt with the exposure to ultrafine particles and related health effects; cluster #4 described the exposure to black carbon and related health effects. Data from timeline mapping indicated that clusters #0 and #1 were the main research areas while clusters #3 and #4 were the up-and-coming hot areas of research. Ninety-four relevant papers were retrieved from the Chinese databases, with most of them related to studies on modeling. Conclusion: In order to better assess the health-related risks of ambient air pollution, and to best inform preventative public health intervention policies, application of LUR models to environmental epidemiology studies in China should be encouraged.
Social Media Use and Depression and Anxiety Symptoms: A Cluster Analysis.
Shensa, Ariel; Sidani, Jaime E; Dew, Mary Amanda; Escobar-Viera, César G; Primack, Brian A
2018-03-01
Individuals use social media with varying quantity, emotional, and behavioral attachment that may have differential associations with mental health outcomes. In this study, we sought to identify distinct patterns of social media use (SMU) and to assess associations between those patterns and depression and anxiety symptoms. In October 2014, a nationally-representative sample of 1730 US adults ages 19 to 32 completed an online survey. Cluster analysis was used to identify patterns of SMU. Depression and anxiety were measured using respective 4-item Patient-Reported Outcome Measurement Information System (PROMIS) scales. Multivariable logistic regression models were used to assess associations between cluster membership and depression and anxiety. Cluster analysis yielded a 5-cluster solution. Participants were characterized as "Wired," "Connected," "Diffuse Dabblers," "Concentrated Dabblers," and "Unplugged." Membership in 2 clusters - "Wired" and "Connected" - increased the odds of elevated depression and anxiety symptoms (AOR = 2.7, 95% CI = 1.5-4.7; AOR = 3.7, 95% CI = 2.1-6.5, respectively, and AOR = 2.0, 95% CI = 1.3-3.2; AOR = 2.0, 95% CI = 1.3-3.1, respectively). SMU pattern characterization of a large population suggests 2 patterns are associated with risk for depression and anxiety. Developing educational interventions that address use patterns rather than single aspects of SMU (e.g., quantity) would likely be useful.
Identification of Clusters of Foot Pain Location in a Community Sample.
Gill, Tiffany K; Menz, Hylton B; Landorf, Karl B; Arnold, John B; Taylor, Anne W; Hill, Catherine L
2017-12-01
To identify foot pain clusters according to pain location in a community-based sample of the general population. This study analyzed data from the North West Adelaide Health Study. Data were obtained between 2004 and 2006, using computer-assisted telephone interviewing, clinical assessment, and self-completed questionnaire. The location of foot pain was assessed using a diagram during the clinical assessment. Hierarchical cluster analysis was undertaken to identify foot pain location clusters, which were then compared in relation to demographics, comorbidities, and podiatry services utilization. There were 558 participants with foot pain (mean age 54.4 years, 57.5% female). Five clusters were identified: 1 with predominantly arch and ball pain (26.8%), 1 with rearfoot pain (20.9%), 1 with heel pain (13.3%), and 2 with predominantly forefoot, toe, and nail pain (28.3% and 10.7%). Each cluster was distinct in age, sex, and comorbidity profile. Of the two clusters with predominantly forefoot, toe, and nail pain, one of them had a higher proportion of men and those classified as obese, had diabetes mellitus, and used podiatry services (30%), while the other was comprised of a higher proportion of women who were overweight and reported less use of podiatry services (17.5%). Five clusters of foot pain according to pain location were identified, all with distinct age, sex, and comorbidity profiles. These findings may assist in the identification of individuals at risk for developing foot pain and in the development of targeted preventive strategies and treatments. © 2017, American College of Rheumatology.
Jagodzinski, Linda L.; Liu, Ying; Pham, Peter T.; Kijak, Gustavo H.; Tovanabutra, Sodsai; McCutchan, Francine E.; Scoville, Stephanie L.; Cersovsky, Steven B.; Michael, Nelson L.; Scott, Paul T.; Peel, Sheila A.
2017-01-01
Objective Recent surveillance data suggest the United States (U.S.) Army HIV epidemic is concentrated among men who have sex with men. To identify potential targets for HIV prevention strategies, the relationship between demographic and clinical factors and membership within transmission clusters based on baseline pol sequences of HIV-infected Soldiers from 2001 through 2012 was analyzed. Methods We conducted a retrospective analysis of baseline partial pol sequences, demographic and clinical characteristics available for all Soldiers in active service and newly-diagnosed with HIV-1 infection from January 1, 2001 through December 31, 2012. HIV-1 subtype designations and transmission clusters were identified from phylogenetic analysis of sequences. Univariate and multivariate logistic regression models were used to evaluate and adjust for the association between characteristics and cluster membership. Results Among 518 of 995 HIV-infected Soldiers with available partial pol sequences, 29% were members of a transmission cluster. Assignment to a southern U.S. region at diagnosis and year of diagnosis were independently associated with cluster membership after adjustment for other significant characteristics (p<0.10) of age, race, year of diagnosis, region of duty assignment, sexually transmitted infections, last negative HIV test, antiretroviral therapy, and transmitted drug resistance. Subtyping of the pol fragment indicated HIV-1 subtype B infection predominated (94%) among HIV-infected Soldiers. Conclusion These findings identify areas to explore as HIV prevention targets in the U.S. Army. An increased frequency of current force testing may be justified, especially among Soldiers assigned to duty in installations with high local HIV prevalence such as southern U.S. states. PMID:28759645
Psychiatrist-patient verbal and nonverbal communications during split-treatment appointments.
Cruz, Mario; Roter, Debra; Cruz, Robyn Flaum; Wieland, Melissa; Cooper, Lisa A; Larson, Susan; Pincus, Harold Alan
2011-11-01
This study characterized psychiatrist and patient communication behaviors and affective voice tones during pharmacotherapy appointments with depressed patients at four community-based mental health clinics where psychiatrists provided medication management and other mental health professionals provided therapy ("split treatment"). Audiorecordings of 84 unique pairs of psychiatrists and patients with a depressive disorder were analyzed with the Roter Interaction Analysis System, which identifies 41 discrete speech categories that can be grouped into composites representing broad conceptual communication domains. Cluster analysis identified psychiatrist communication patterns. T test and chi square analyses compared the clusters for verbal dominance, affective voice tone, and characteristics of psychiatrist and patients. On average, 53% of psychiatrist talk was devoted to partnering and relationship building, and 67% of patient talk was about biomedical subjects, such as depression symptoms, and psychosocial information giving. Psychiatrist communication patterns were characterized by two clusters, a biomedical-centered cluster that emphasized biomedical questions (η²=.22, df=82, p<.001) and education or counseling (η²=.20, df=82, p<.001) and a patient-centered cluster focused on psychosocial and lifestyle questions (η²=.24, df=82, p<.001) and information giving (η²=.17, df=82, p<.001). The patient-centered cluster was associated with patients' expression of distress, anger, or other negative affects (t=3.22, df= 82, p=.002). Psychiatrists devoted much of their talk to partnering and relationship building while maintaining a focus on symptoms or psychosocial issues. However, patient behaviors did not reflect a similar level of partnering. Future studies should identify psychiatrist communication behaviors that activate collaborative patient communications or improve treatment outcomes.
Berenguer, Roberto; Pastor-Juan, María Del Rosario; Canales-Vázquez, Jesús; Castro-García, Miguel; Villas, María Victoria; Legorburo, Francisco Mansilla; Sabater, Sebastià
2018-04-24
Purpose To identify the reproducible and nonredundant radiomics features (RFs) for computed tomography (CT). Materials and Methods Two phantoms were used to test RF reproducibility by using test-retest analysis, by changing the CT acquisition parameters (hereafter, intra-CT analysis), and by comparing five different scanners with the same CT parameters (hereafter, inter-CT analysis). Reproducible RFs were selected by using the concordance correlation coefficient (as a measure of the agreement between variables) and the coefficient of variation (defined as the ratio of the standard deviation to the mean). Redundant features were grouped by using hierarchical cluster analysis. Results A total of 177 RFs including intensity, shape, and texture features were evaluated. The test-retest analysis showed that 91% (161 of 177) of the RFs were reproducible according to concordance correlation coefficient. Reproducibility of intra-CT RFs, based on coefficient of variation, ranged from 89.3% (151 of 177) to 43.1% (76 of 177) where the pitch factor and the reconstruction kernel were modified, respectively. Reproducibility of inter-CT RFs, based on coefficient of variation, also showed large material differences, from 85.3% (151 of 177; wood) to only 15.8% (28 of 177; polyurethane). Ten clusters were identified after the hierarchical cluster analysis and one RF per cluster was chosen as representative. Conclusion Many RFs were redundant and nonreproducible. If all the CT parameters are fixed except field of view, tube voltage, and milliamperage, then the information provided by the analyzed RFs can be summarized in only 10 RFs (each representing a cluster) because of redundancy. © RSNA, 2018 Online supplemental material is available for this article.
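The two reproducibility measures named in this abstract can be written out in a few lines. The sketch below is illustrative only: the coefficient of variation follows the definition given above (standard deviation divided by mean), the concordance correlation coefficient uses Lin's standard formula, and the feature values are invented. The cutoffs the study used to call a feature reproducible are not stated in the abstract, so none are applied here.

```python
import math

def mean(values):
    return sum(values) / len(values)

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return sd / m

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between paired measurements."""
    mx, my = mean(x), mean(y)
    n = len(x)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalizes both poor correlation and a systematic shift between x and y.
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)

# Invented test and retest values of one radiomics feature.
test_scan = [10.0, 12.0, 15.0, 11.0, 14.0]
retest_scan = [10.2, 11.8, 15.1, 11.3, 13.9]
shifted_scan = [v + 5.0 for v in test_scan]

ccc_retest = concordance_correlation(test_scan, retest_scan)
ccc_shifted = concordance_correlation(test_scan, shifted_scan)
cv_test = coefficient_of_variation(test_scan)
```

The shifted series is perfectly correlated with the original yet has a lower concordance, which is why concordance rather than plain correlation is used to measure agreement.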
Geographic clustering of elevated blood heavy metal levels in pregnant women.
King, Katherine E; Darrah, Thomas H; Money, Eric; Meentemeyer, Ross; Maguire, Rachel L; Nye, Monica D; Michener, Lloyd; Murtha, Amy P; Jirtle, Randy; Murphy, Susan K; Mendez, Michelle A; Robarge, Wayne; Vengosh, Avner; Hoyo, Cathrine
2015-10-09
Cadmium (Cd), lead (Pb), mercury (Hg), and arsenic (As) exposure is ubiquitous and has been associated with higher risk of growth restriction and cardiometabolic and neurodevelopmental disorders. However, cost-efficient strategies to identify at-risk populations and potential sources of exposure to inform mitigation efforts are limited. The objective of this study was to describe the spatial distribution and identify factors associated with Cd, Pb, Hg, and As concentrations in peripheral blood of pregnant women. Heavy metals were measured in whole peripheral blood of 310 pregnant women obtained at gestational age ~12 weeks. Prenatal residential addresses were geocoded and geospatial analysis (Getis-Ord Gi* statistics) was used to determine if elevated blood concentrations were geographically clustered. Logistic regression models were used to identify factors associated with elevated blood metal levels and cluster membership. Geospatial clusters for Cd and Pb were identified with high confidence (p-value for Gi* statistic <0.01). The Cd and Pb clusters comprised 10.5 and 9.2% of Durham County residents, respectively. Medians and interquartile ranges of blood concentrations (μg/dL) for all participants were Cd 0.02 (0.01-0.04), Hg 0.03 (0.01-0.07), Pb 0.34 (0.16-0.83), and As 0.04 (0.04-0.05). In the Cd cluster, medians and interquartile ranges of blood concentrations (μg/dL) were Cd 0.06 (0.02-0.16), Hg 0.02 (0.00-0.05), Pb 0.54 (0.23-1.23), and As 0.05 (0.04-0.05). In the Pb cluster, medians and interquartile ranges of blood concentrations (μg/dL) were Cd 0.03 (0.02-0.15), Hg 0.01 (0.01-0.05), Pb 0.39 (0.24-0.74), and As 0.04 (0.04-0.05). Co-exposure to Pb and Cd was also clustered; the p-value for the Gi* statistic for Pb and Cd was <0.01. Cluster membership was associated with lower education levels and higher pre-pregnancy BMI.
Our data support that elevated blood concentrations of Cd and Pb are spatially clustered in this urban environment compared to the surrounding areas. Spatial analysis of metals concentrations in peripheral blood or urine obtained routinely during prenatal care can be useful in surveillance of heavy metal exposure.
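For readers unfamiliar with the Getis-Ord Gi* statistic mentioned in this abstract, a minimal sketch follows. It uses the standard z-score form of Gi* with binary neighbour weights that include the location itself; the six locations, their values, and the neighbourhood definition are invented for illustration, and the weighting scheme used in the study is not stated in the abstract.

```python
import math

def getis_ord_gi_star(values, weights, i):
    """Gi* z-score for location i, given a full weight matrix (self included)."""
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    w = weights[i]
    sum_w = sum(w)
    sum_w2 = sum(wij * wij for wij in w)
    numerator = sum(wij * xj for wij, xj in zip(w, values)) - x_bar * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator

# Invented values at six locations along a line; high values sit at one end.
levels = [9.0, 8.0, 1.0, 1.0, 2.0, 1.0]

# Assumption: locations are neighbours if they are adjacent; each location
# is also its own neighbour, which is what distinguishes Gi* from Gi.
size = len(levels)
weights = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(size)]
           for i in range(size)]

z_scores = [getis_ord_gi_star(levels, weights, i) for i in range(size)]
```

A large positive z-score marks a location surrounded by high values (a hot spot), and a large negative one marks a cold spot; significance is then read from the normal distribution, usually with a correction for multiple testing.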
Bowd, Christopher; Weinreb, Robert N; Balasubramanian, Madhusudhanan; Lee, Intae; Jang, Giljin; Yousefi, Siamak; Zangwill, Linda M; Medeiros, Felipe A; Girkin, Christopher A; Liebmann, Jeffrey M; Goldbaum, Michael H
2014-01-01
The variational Bayesian independent component analysis-mixture model (VIM), an unsupervised machine-learning classifier, was used to automatically separate Matrix Frequency Doubling Technology (FDT) perimetry data into clusters of healthy and glaucomatous eyes, and to identify axes representing statistically independent patterns of defect in the glaucoma clusters. FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal FDT results from the UCSD-based Diagnostic Innovations in Glaucoma Study (DIGS) and African Descent and Glaucoma Evaluation Study (ADAGES). For all eyes, VIM input was 52 threshold test points from the 24-2 test pattern, plus age. FDT mean deviation was -1.00 dB (S.D. = 2.80 dB) and -5.57 dB (S.D. = 5.09 dB) in FDT-normal eyes and FDT-abnormal eyes, respectively (p<0.001). VIM identified meaningful clusters of FDT data and positioned a set of statistically independent axes through the mean of each cluster. The optimal VIM model separated the FDT fields into 3 clusters. Cluster N contained primarily normal fields (1109/1190, specificity 93.1%) and clusters G1 and G2 combined, contained primarily abnormal fields (651/786, sensitivity 82.8%). For clusters G1 and G2 the optimal number of axes were 2 and 5, respectively. Patterns automatically generated along axes within the glaucoma clusters were similar to those known to be indicative of glaucoma. Fields located farther from the normal mean on each glaucoma axis showed increasing field defect severity. VIM successfully separated FDT fields from healthy and glaucoma eyes without a priori information about class membership, and identified familiar glaucomatous patterns of loss.
2015-12-01
group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on...log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity metric. For...differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as
Influence of birth cohort on age of onset cluster analysis in bipolar I disorder.
Bauer, M; Glenn, T; Alda, M; Andreassen, O A; Angelopoulos, E; Ardau, R; Baethge, C; Bauer, R; Bellivier, F; Belmaker, R H; Berk, M; Bjella, T D; Bossini, L; Bersudsky, Y; Cheung, E Y W; Conell, J; Del Zompo, M; Dodd, S; Etain, B; Fagiolini, A; Frye, M A; Fountoulakis, K N; Garneau-Fournier, J; Gonzalez-Pinto, A; Harima, H; Hassel, S; Henry, C; Iacovides, A; Isometsä, E T; Kapczinski, F; Kliwicki, S; König, B; Krogh, R; Kunz, M; Lafer, B; Larsen, E R; Lewitzka, U; Lopez-Jaramillo, C; MacQueen, G; Manchia, M; Marsh, W; Martinez-Cengotitabengoa, M; Melle, I; Monteith, S; Morken, G; Munoz, R; Nery, F G; O'Donovan, C; Osher, Y; Pfennig, A; Quiroz, D; Ramesar, R; Rasgon, N; Reif, A; Ritter, P; Rybakowski, J K; Sagduyu, K; Scippa, A M; Severus, E; Simhandl, C; Stein, D J; Strejilevich, S; Hatim Sulaiman, A; Suominen, K; Tagata, H; Tatebayashi, Y; Torrent, C; Vieta, E; Viswanath, B; Wanchoo, M J; Zetin, M; Whybrow, P C
2015-01-01
Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset, using a large, international database. The database includes 4037 patients with a diagnosis of bipolar I disorder, previously collected at 36 collection sites in 23 countries. Generalized estimating equations (GEE) were used to adjust the data for country median age, and in some models, birth cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After adjusting for the birth cohort or when considering only those born after 1959, two subgroups were found. With results of either two or three subgroups, the youngest subgroup was more likely to have a family history of mood disorders and a first episode with depressed polarity. However, without adjusting for birth cohort (three subgroups), family history and polarity of the first episode could not be distinguished between the middle and oldest subgroups. These results using international data confirm prior findings using single country data, that there are subgroups of bipolar I disorder based on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more useful for research. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Torheim, Turid; Groendahl, Aurora R; Andersen, Erlend K F; Lyng, Heidi; Malinen, Eirik; Kvaal, Knut; Futsaether, Cecilia M
2016-11-01
Solid tumors are known to be spatially heterogeneous. Detection of treatment-resistant tumor regions can improve clinical outcome, by enabling implementation of strategies targeting such regions. In this study, K-means clustering was used to group voxels in dynamic contrast enhanced magnetic resonance images (DCE-MRI) of cervical cancers. The aim was to identify clusters reflecting treatment resistance that could be used for targeted radiotherapy with a dose-painting approach. Eighty-one patients with locally advanced cervical cancer underwent DCE-MRI prior to chemoradiotherapy. The resulting image time series were fitted to two pharmacokinetic models, the Tofts model (yielding parameters K^trans and ν_e) and the Brix model (A_Brix, k_ep and k_el). K-means clustering was used to group similar voxels based on either the pharmacokinetic parameter maps or the relative signal increase (RSI) time series. The associations between voxel clusters and treatment outcome (measured as locoregional control) were evaluated using the volume fraction or the spatial distribution of each cluster. One voxel cluster based on the RSI time series was significantly related to locoregional control (adjusted p-value 0.048). This cluster consisted of low-enhancing voxels. We found that tumors with poor prognosis had this RSI-based cluster gathered into few patches, making this cluster a potential candidate for targeted radiotherapy. None of the voxel clusters based on Tofts or Brix parameter maps were significantly related to treatment outcome. We identified one group of tumor voxels significantly associated with locoregional relapse that could potentially be used for dose painting. This tumor voxel cluster was identified using the raw MRI time series rather than the pharmacokinetic maps.
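The voxel-grouping step described in this abstract can be illustrated with a small K-means sketch. Everything concrete here is an assumption made for illustration: RSI is taken as the signal change relative to the first (pre-contrast) time point, the four voxel time series are invented, and the centers are initialized from the first k voxels so that the run is deterministic. The study's preprocessing and number of clusters are not given in the abstract.

```python
def relative_signal_increase(series):
    """Assumed definition: change relative to the first (baseline) sample."""
    baseline = series[0]
    return [(s - baseline) / baseline for s in series]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=50):
    """Plain Lloyd's algorithm; centers start at the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Invented signal-time curves: voxels 0 and 2 enhance weakly, 1 and 3 strongly.
voxel_signals = [
    [100.0, 105.0, 108.0, 110.0],
    [100.0, 160.0, 190.0, 200.0],
    [100.0, 104.0, 107.0, 109.0],
    [100.0, 150.0, 185.0, 195.0],
]
rsi_curves = [relative_signal_increase(s) for s in voxel_signals]
labels, centers = k_means(rsi_curves, k=2)

# Volume fraction of each cluster: share of voxels assigned to it.
volume_fraction = [labels.count(c) / len(labels) for c in range(2)]
```

In a real analysis the volume fraction (or a measure of how the cluster's voxels are spread in space) would be computed per tumor and then related to outcome.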
Patterns of Dysmorphic Features in Schizophrenia
Scutt, L.E.; Chow, E.W.C.; Weksberg, R.; Honer, W.G.; Bassett, Anne S.
2011-01-01
Congenital dysmorphic features are prevalent in schizophrenia and may reflect underlying neurodevelopmental abnormalities. A cluster analysis approach delineating patterns of dysmorphic features has been used in genetics to classify individuals into more etiologically homogeneous subgroups. In the present study, this approach was applied to schizophrenia, using a sample with a suspected genetic syndrome as a testable model. Subjects (n = 159) with schizophrenia or schizoaffective disorder were ascertained from chronic patient populations (random, n=123) or referred with possible 22q11 deletion syndrome (referred, n = 36). All subjects were evaluated for presence or absence of 70 reliably assessed dysmorphic features, which were used in a three-step cluster analysis. The analysis produced four major clusters with different patterns of dysmorphic features. Significant between-cluster differences were found for rates of 37 dysmorphic features (P < 0.05), median number of dysmorphic features (P = 0.0001), and validating features not used in the cluster analysis: mild mental retardation (P = 0.001) and congenital heart defects (P = 0.002). Two clusters (1 and 4) appeared to represent more developmental subgroups of schizophrenia with elevated rates of dysmorphic features and validating features. Cluster 1 (n = 27) comprised mostly referred subjects. Cluster 4 (n= 18) had a different pattern of dysmorphic features; one subject had a mosaic Turner syndrome variant. Two other clusters had lower rates and patterns of features consistent with those found in previous studies of schizophrenia. Delineating patterns of dysmorphic features may help identify subgroups that could represent neurodevelopmental forms of schizophrenia with more homogeneous origins. PMID:11803519
COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS*
Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.
2017-01-01
Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m3 difference in exposure is associated with 4.37 mmHg (95% CI, 2.38, 6.35) higher SBP. PMID:28572869
NASA Technical Reports Server (NTRS)
Li, Z. K.
1985-01-01
A specialized program was developed for flow cytometric list-mode data using a hierarchical tree method for identifying and enumerating individual subpopulations, the method of principal components for a two-dimensional display of a 6-parameter data array, and a standard sorting algorithm for characterizing subpopulations. The program was tested against a published data set subjected to cluster analysis and experimental data sets from controlled flow cytometry experiments using a Coulter Electronics EPICS V Cell Sorter. A version of the program in compiled BASIC is usable on a 16-bit microcomputer with the MS-DOS operating system. It is specialized for 6 parameters and up to 20,000 cells. Its two-dimensional display of Euclidean distances reveals clusters clearly, as does its 1-dimensional display. The identified subpopulations can, in suitable experiments, be related to functional subpopulations of cells.
Lin, Shih-Yen; Liu, Chih-Wei
2014-01-01
This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs. PMID:25045741
Wu, Hsin-Hung; Lin, Shih-Yen; Liu, Chih-Wei
2014-01-01
This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs.
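The loyal/lost rule stated in the abstract (a cluster is "loyal" when its mean L, R, and F all exceed the overall averages, and "lost" when none does) can be illustrated with a short Python sketch. The cluster names, the numbers, and the "other" label for mixed cases are hypothetical; the study's own customer value and relationship matrices are not reproduced here.

```python
def classify_clusters(cluster_means, overall):
    """Label each cluster by comparing its mean L, R, F with the overall means."""
    result = {}
    for name, means in cluster_means.items():
        above = [means[v] > overall[v] for v in ("L", "R", "F")]
        if all(above):
            result[name] = "loyal"   # every variable above average
        elif not any(above):
            result[name] = "lost"    # no variable above average
        else:
            result[name] = "other"   # mixed profile (label is an assumption)
    return result

# Hypothetical overall averages and per-cluster means.
overall = {"L": 400, "R": 120, "F": 6}
cluster_means = {
    "A": {"L": 650, "R": 200, "F": 11},
    "B": {"L": 150, "R": 40, "F": 2},
    "C": {"L": 500, "R": 60, "F": 8},
}
segments = classify_clusters(cluster_means, overall)
```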
Psychological profiling of offender characteristics from crime behaviors in serial rape offences.
Kocsis, Richard N; Cooksey, Ray W; Irwin, Harvey J
2002-04-01
Criminal psychological profiling has progressively been incorporated into police procedures despite a dearth of empirical research. Indeed, in the study of serial violent crimes for the purpose of psychological profiling, very few original, quantitative, academically reviewed studies actually exist. This article reports on the analysis of 62 incidents of serial sexual assault. The statistical procedure of multidimensional scaling was employed in the analysis of these data, which in turn produced a five-cluster model of serial rapist behavior. First, a central cluster of behaviors was identified that represents behaviors common to all patterns of serial rape. Second, four distinct outlying patterns were identified as demonstrating distinct offence styles; these were assigned the descriptive labels brutality, intercourse, chaotic, and ritual. Furthermore, analysis of these patterns also identified distinct offender characteristics that allow for the use of empirically robust offender profiles in future serial rape investigations.
Pego-Reigosa, José María; Lois-Iglesias, Ana; Rúa-Figueroa, Íñigo; Galindo, María; Calvo-Alén, Jaime; de Uña-Álvarez, Jacobo; Balboa-Barreiro, Vanessa; Ibáñez Ruan, Jesús; Olivé, Alejandro; Rodríguez-Gómez, Manuel; Fernández Nebro, Antonio; Andrés, Mariano; Erausquin, Celia; Tomero, Eva; Horcada Rubio, Loreto; Uriarte Isacelaya, Esther; Freire, Mercedes; Montilla, Carlos; Sánchez-Atrio, Ana I; Santos-Soler, Gregorio; Zea, Antonio; Díez, Elvira; Narváez, Javier; Blanco-Alonso, Ricardo; Silva-Fernández, Lucía; Ruiz-Lucea, María Esther; Fernández-Castro, Mónica; Hernández-Beriain, José Ángel; Gantes-Mora, Marian; Hernández-Cruz, Blanca; Pérez-Venegas, José; Pecondón-Español, Ángela; Marras Fernández-Cid, Carlos; Ibáñez-Barcelo, Mónica; Bonilla, Gema; Torrente-Segarra, Vicenç; Castellví, Iván; Alegre, Juan José; Calvet, Joan; Marenco de la Fuente, José Luis; Raya, Enrique; Vázquez-Rodríguez, Tomás Ramón; Quevedo-Vila, Víctor; Muñoz-Fernández, Santiago; Otón, Teresa; Rahman, Anisur; López-Longo, Francisco Javier
2016-07-01
To identify patterns (clusters) of damage manifestations within a large cohort of SLE patients and evaluate the potential association of these clusters with a higher risk of mortality. This is a multicentre, descriptive, cross-sectional study of a cohort of 3656 SLE patients from the Spanish Society of Rheumatology Lupus Registry. Organ damage was ascertained using the Systemic Lupus International Collaborating Clinics Damage Index. Using cluster analysis, groups of patients with similar patterns of damage manifestations were identified. Then, overall clusters were compared as well as the subgroup of patients within every cluster with disease duration shorter than 5 years. Three damage clusters were identified. Cluster 1 (80.6% of patients) presented a lower amount of individuals with damage (23.2 vs 100% in clusters 2 and 3, P < 0.001). Cluster 2 (11.4% of patients) was characterized by musculoskeletal damage in all patients. Cluster 3 (8.0% of patients) was the only group with cardiovascular damage, and this was present in all patients. The overall mortality rate of patients in clusters 2 and 3 was higher than that in cluster 1 (P < 0.001 for both comparisons) and in patients with disease duration shorter than 5 years as well. In a large cohort of SLE patients, cardiovascular and musculoskeletal damage manifestations were the two dominant forms of damage to sort patients into clinically meaningful clusters. Both in early and late stages of the disease, there was a significant association of these clusters with an increased risk of mortality. Physicians should pay special attention to the early prevention of damage in these two systems. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Topic modeling for cluster analysis of large biological and medical datasets
2014-01-01
Background The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. Results In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Conclusion Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets. PMID:25350106
Topic modeling for cluster analysis of large biological and medical datasets.
Zhao, Weizhong; Zou, Wen; Chen, James J
2014-01-01
The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets.
Ivors, K; Garbelotto, M; Vries, I D E; Ruyter-Spira, C; Te Hekkert, B; Rosenzweig, N; Bonants, P
2006-05-01
Analysis of 12 polymorphic simple sequence repeats identified in the genome sequence of Phytophthora ramorum, causal agent of 'sudden oak death', revealed genotypic diversity to be significantly higher in nurseries (91% of total) than in forests (18% of total). Our analysis identified only two closely related genotypes in US forests, while the genetic structure of populations from European nurseries was of intermediate complexity, including multiple, closely related genotypes. Multilocus analysis determined populations in US forests reproduce clonally and are likely descendants of a single introduced individual. The 151 isolates analysed clustered in three clades. US forest and European nursery isolates clustered into two distinct clades, while one isolate from a US nursery belonged to a third novel clade. The combined microsatellite, sequencing and morphological analyses suggest the three clades represent distinct evolutionary lineages. All three clades were identified in some US nurseries, emphasizing the role of commercial plant trade in the movement of this pathogen.
Yennurajalingam, Sriram; Williams, Janet L; Chisholm, Gary; Bruera, Eduardo
2016-03-01
Advanced cancer patients frequently experience debilitating symptoms that occur in clusters, but few pharmacological studies have targeted symptom clusters. Our objective was to examine the effects of dexamethasone on symptom clusters in patients with advanced cancer. We reviewed the data from a previous randomized clinical trial to determine the effects of dexamethasone on cancer symptoms. Symptom clusters were identified according to baseline symptoms by using principal component analysis. Correlations and change in the severity of symptom clusters were analyzed after study treatment. A total of 114 participants were included in this study. Three clusters were identified: fatigue/anorexia-cachexia/depression (FAD), sleep/anxiety/drowsiness (SAD), and pain/dyspnea (PD). Changes in severity of FAD and PD significantly correlated over time (at baseline, day 8, and day 15). The FAD cluster was associated with significant improvement in severity at day 8 and day 15, whereas no significant change was observed with the SAD cluster or PD cluster after dexamethasone treatment. The results of this preliminary study suggest significant correlation over time and improvement in the FAD cluster at day 8 and day 15 after treatment with dexamethasone. These findings suggest that fatigue, anorexia-cachexia, and depression may share a common pathophysiologic basis. Further studies are needed to investigate this cluster and target anti-inflammatory therapies. ©AlphaMed Press.
A Typology of Social Workers in Long-Term Care Facilities in Israel.
Lev, Sagit; Ayalon, Liat
2018-04-01
This article explores moral distress among long-term care facility (LTCF) social workers by examining the relationships between moral distress and environmental and personal features. Based on these features, the authors identified a typology of LTCF social workers and how they handle moral distress. Such a typology can assist in the identification of social workers who are in particular need of assistance. Overall, 216 LTCF social workers took part in the study. A two-step cluster analysis was conducted to identify a typology of LTCF social workers based on features such as ethical environment, support in workplace, mastery, and resilience. The variance of the identified clusters and their associations with moral distress were examined, and four clusters of LTCF social workers were identified. The clusters varied from each other in relation to their personal and environmental features and in relation to their experience of moral distress. The article concludes with a discussion of the importance of developing programs for LTCF social workers that provide support and enhancement of personal resources and an adequate and ethical environment for practice.
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
Gould, Madelyn S; Kleinman, Marjorie H; Lake, Alison M; Forman, Judith; Midle, Jennifer Bassett
2014-06-01
Public health and clinical efforts to prevent suicide clusters are seriously hampered by the unanswered question of why such outbreaks occur. We aimed to establish whether an environmental factor, newspaper reports of suicide, has a role in the emergence of suicide clusters. In this retrospective, population-based, case-control study, we identified suicide clusters in young people aged 13-20 years in the USA from 1988 to 1996 (preceding the advent of social media) using the time-space Scan statistic. For each cluster community, we selected two matched non-cluster control communities in which suicides of similarly aged youth occurred, from non-contiguous counties within the same state as the cluster. We examined newspapers within each cluster community for stories about suicide published in the days between the first and second suicides in the cluster. In non-cluster communities, we examined a matched length of time after the matched control suicide. We used a content-analysis procedure to code the characteristics of each story and compared newspaper stories about suicide published in case and control communities with mixed-effect regression analyses. We identified 53 suicide clusters, of which 48 were included in the media review. For one cluster we could identify only one appropriate control; therefore, 95 matched control communities were included. The mean number of news stories about suicidal individuals published after an index cluster suicide (7·42 [SD 10·02]) was significantly greater than the mean number of suicide stories published after a non-cluster suicide (5·14 [6·00]; p<0·0001). Several story characteristics, including front-page placement, headlines containing the word suicide or a description of the method used, and detailed descriptions of the suicidal individual and act, appeared more often in stories published after the index cluster suicides than after non-cluster suicides. Our identification of an association between newspaper reports about suicide (including specific story characteristics) and the initiation of teenage suicide clusters should provide an empirical basis to support efforts by mental health professionals, community officials, and the media to work together to identify and prevent the onset of suicide clusters. US National Institute of Mental Health and American Foundation for Suicide Prevention. Copyright © 2014 Elsevier Ltd. All rights reserved.
Mehta, S; Rice, D; McIntyre, A; Getty, H; Speechley, M; Sequeira, K; Shapiro, A P; Morley-Forster, P; Teasell, R W
2016-01-01
Objective. The current study attempted to identify and characterize distinct CP subgroups based on their level of dispositional personality traits. The secondary objective was to compare the difference among the subgroups in mood, coping, and disability. Methods. Individuals with chronic pain were assessed for demographic, psychosocial, and personality measures. A two-step cluster analysis was conducted in order to identify distinct subgroups of patients based on their level of personality traits. Differences in clinical outcomes were compared using the multivariate analysis of variance based on cluster membership. Results. In 229 participants, three clusters were formed. No significant difference was seen among the clusters on patient demographic factors including age, sex, relationship status, duration of pain, and pain intensity. Those with high levels of dispositional personality traits had greater levels of mood impairment compared to the other two groups (p < 0.05). Significant difference in disability was seen between the subgroups. Conclusions. The study identified a high risk group of CP individuals whose level of personality traits significantly correlated with impaired mood and coping. Use of pharmacological treatment alone may not be successful in improving clinical outcomes among these individuals. Instead, a more comprehensive treatment involving psychological treatments may be important in managing the personality traits that interfere with recovery.
Anatomical relationships between serotonin 5-HT2A and dopamine D2 receptors in living human brain.
Ishii, Tatsuya; Kimura, Yasuyuki; Ichise, Masanori; Takahata, Keisuke; Kitamura, Soichiro; Moriguchi, Sho; Kubota, Manabu; Zhang, Ming-Rong; Yamada, Makiko; Higuchi, Makoto; Okubo, Yoshinori; Suhara, Tetsuya
2017-01-01
Seven healthy volunteers underwent PET scans with [18F]altanserin and [11C]FLB 457 for 5-HT2A and D2 receptors, respectively. As a measure of receptor density, a binding potential (BP) was calculated from PET data for 76 cerebral cortical regions. A correlation matrix was calculated between the binding potentials of [18F]altanserin and [11C]FLB 457 for those regions. The regional relationships were investigated using a bicluster analysis of the correlation matrix with an iterative signature algorithm. We identified two clusters of regions. The first cluster identified a distinct profile of correlation coefficients between 5-HT2A and D2 receptors, with the former in regions related to sensorimotor integration (supplementary motor area, superior parietal gyrus, and paracentral lobule) and the latter in most cortical regions. The second cluster identified another distinct profile of correlation coefficients between 5-HT2A receptors in the bilateral hippocampi and D2 receptors in most cortical regions. The observation of two distinct clusters in the correlation matrix suggests regional interactions between 5-HT2A and D2 receptors in sensorimotor integration and hippocampal function. A bicluster analysis of the correlation matrix of these neuroreceptors may be beneficial in understanding molecular networks in the human brain.
USDA-ARS's Scientific Manuscript database
Secondary metabolite genes are often clustered together and situated in particular genomic regions such as the subtelomere, which can facilitate niche adaptation in fungi. Solanapyrones are toxic secondary metabolites produced by fungi occupying different ecological niches. Full genome sequencing of...
Behavioral Profiles in 4-5 Year-Old Children: Normal and Pathological Variants
ERIC Educational Resources Information Center
Larsson, Jan-Olov; Bergman, Lars R.; Earls, Felton; Rydelius, Per-Anders
2004-01-01
Normal and psychopathological patterns of behavior symptoms in preschool children were described by a classification approach using cluster analysis. The behavior of 406 children, average age 4 years 9 months, from the general population was evaluated at home visits. Seven clusters were identified based on empirically defined dimensions:…
Order-Constrained Solutions in K-Means Clustering: Even Better than Being Globally Optimal
ERIC Educational Resources Information Center
Steinley, Douglas; Hubert, Lawrence
2008-01-01
This paper proposes an order-constrained K-means cluster analysis strategy, and implements that strategy through an auxiliary quadratic assignment optimization heuristic that identifies an initial object order. A subsequent dynamic programming recursion is applied to optimally subdivide the object set subject to the order constraint. We show that…
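The dynamic programming step mentioned in the abstract (optimally subdividing an already ordered object set into K contiguous clusters) can be sketched as below. This is a minimal illustration under stated assumptions: the objects are one-dimensional values that are already in their constrained order, the criterion is the within-cluster sum of squares, and the quadratic assignment heuristic that produces the order in the paper is not implemented. Function names and data are hypothetical.

```python
def segment_cost(values, i, j):
    """Within-cluster sum of squares of the contiguous block values[i:j]."""
    seg = values[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)

def order_constrained_partition(values, k):
    """Best split of an ordered sequence into k contiguous clusters."""
    n = len(values)
    inf = float("inf")
    # best[c][j]: minimal cost of splitting the first j objects into c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # the last cluster is values[i:j]
                cost = best[c - 1][i] + segment_cost(values, i, j)
                if cost < best[c][j]:
                    best[c][j] = cost
                    cut[c][j] = i
    # Walk the stored cut points backwards to recover the clusters.
    bounds, j = [], n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], [values[i:j] for i, j in bounds]

# Hypothetical ordered objects.
ordered = [1.0, 1.2, 0.9, 5.0, 5.1, 9.0, 9.3]
total_cost, clusters = order_constrained_partition(ordered, 3)
```

Given the order, the recursion is exact: unlike ordinary K-means it cannot stall in a local optimum, although the result is only optimal among partitions that respect that order.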
Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart
2016-03-01
In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis was performed using SPSS version 20, and hierarchical cluster analysis utilizing Ward's method was used. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include: perceived unfairness, as it did not take into consideration district peculiarities; and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the beginning point for identifying factors behind the observed performance of districts. Although league table ranking emphasizes summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind observed performance of the different clusters. Other countries, especially low-income countries that share many similarities with Uganda, can learn from these experiences. © The Author 2015. Published by Oxford University Press in association with The London School of Hygiene and Tropical Medicine.
Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart
2016-01-01
In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis was performed using SPSS version 20, and hierarchical cluster analysis utilizing Ward's method was used. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include: perceived unfairness, as it did not take into consideration district peculiarities; and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the beginning point for identifying factors behind the observed performance of districts. Although league table ranking emphasizes summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind observed performance of the different clusters. Other countries, especially low-income countries that share many similarities with Uganda, can learn from these experiences. PMID:26024882
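Ward's method, named in the abstract above, merges at each step the two clusters whose union causes the smallest increase in the total within-cluster sum of squares. A naive pure-Python sketch follows; the study itself used SPSS, and the district indicator scores below are invented for illustration.

```python
def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering with Ward's merge criterion."""
    clusters = [[p] for p in points]

    def centroid(cluster):
        return [sum(col) / len(cluster) for col in zip(*cluster)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical district scores on two performance indicators.
districts = [(90, 85), (88, 80), (50, 55), (52, 50), (20, 25), (22, 20)]
groups = ward_clusters(districts, 3)
```

In practice a library routine would normally be preferred; the loop above recomputes every pairwise cost at each step and is only meant to make the merge criterion visible.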
Fascioliasis risk factors and space-time clusters in domestic ruminants in Bangladesh.
Rahman, A K M Anisur; Islam, S K Shaheenur; Talukder, Md Hasanuzzaman; Hassan, Md Kumrul; Dhand, Navneet K; Ward, Michael P
2017-05-08
A retrospective observational study was conducted to identify fascioliasis hotspots, clusters, potential risk factors and to map fascioliasis risk in domestic ruminants in Bangladesh. Cases of fascioliasis in cattle, buffalo, sheep and goats from all districts in Bangladesh between 2011 and 2013 were identified via secondary surveillance data from the Department of Livestock Services' Epidemiology Unit. From each case report, date of report, species affected and district data were extracted. The total number of domestic ruminants in each district was used to calculate fascioliasis cases per ten thousand animals at risk per district, and this was used for cluster and hotspot analysis. Clustering was assessed with Moran's spatial autocorrelation statistic, hotspots with the local indicator of spatial association (LISA) statistic and space-time clusters with the scan statistic (Poisson model). The association between district fascioliasis prevalence and climate (temperature, precipitation), elevation, land cover and water bodies was investigated using a spatial regression model. A total of 1,723,971 cases of fascioliasis were reported in the three-year study period in cattle (1,164,560), goats (424,314), buffalo (88,924) and sheep (46,173). A total of nine hotspots were identified; one of these persisted in each of the three years. Only two local clusters were found. Five space-time clusters located within 22 districts were also identified. Annual risk maps of fascioliasis cases correlated with the hotspots and clusters detected. Cultivated and managed (P < 0.001) and artificial surface (P = 0.04) land cover areas, and elevation (P = 0.003) were positively and negatively associated with fascioliasis in Bangladesh, respectively. Results indicate that due to land use characteristics some areas of Bangladesh are at greater risk of fascioliasis. 
The potential risk factors, hot spots and clusters identified in this study can be used to guide science-based treatment and control decisions for fascioliasis in Bangladesh and in other similar geo-climatic zones throughout the world.
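The global clustering test named in the abstract above, Moran's spatial autocorrelation statistic, is I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where w_ij are spatial weights and W is their sum. A small Python sketch, assuming a binary adjacency matrix and made-up district rates (not data from the study):

```python
def morans_i(values, weights):
    """Global Moran's I for values observed on areas with spatial weights w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_weight) * cross / sum(d * d for d in dev)

# Four hypothetical districts arranged in a line; 1 marks shared borders.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1, 2, 3, 4], chain)   # neighbours have similar rates
checkerboard = morans_i([1, 4, 1, 4], chain)      # neighbours have opposite rates
```

Positive values indicate that neighbouring districts tend to have similar rates, and negative values indicate that they tend to differ; judging significance would additionally need a permutation or normal-approximation test, which is omitted here.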
Seasonal and spatial variations of water quality and trophic status in Daya Bay, South China Sea.
Wu, Mei-Lin; Wang, You-Shao; Wang, Yu-Tu; Sun, Fu-Lin; Sun, Cui-Ci; Cheng, Hao; Dong, Jun-De
2016-11-15
Coastal water quality and trophic status are subject to intensive environmental stress induced by human activities and climate change. Quarterly cruises were conducted to identify environmental characteristics in Daya Bay in 2013. Water quality is spatially and temporally dynamic in the bay. Cluster analysis (CA) groups 12 monitoring stations into two clusters. Cluster I consists of stations (S1, S2, S4-S7, S9, and S12) located in the central, eastern, and southern parts of the bay, representing less polluted regions. Cluster II includes stations (S3, S8, S10, and S11) located in the western and northern parts of the bay, indicating the highly polluted regions receiving a high amount of wastewater and freshwater discharge. Principal component analysis (PCA) identified that water quality experiences seasonal change (summer, winter, and spring-autumn seasons) because of two monsoons in the study area. Eutrophication in the bay is graded as high by Assessment of Estuarine Trophic Status (ASSETS). Copyright © 2016 Elsevier Ltd. All rights reserved.
Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi
2016-07-01
Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway that may include a Diels-Alder-type [4+2]cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases was identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed.
Improved Ant Colony Clustering Algorithm and Its Performance Study
Gao, Wei
2016-01-01
Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
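The corpse-clustering behaviour mentioned above is commonly modelled with pick-up and drop probabilities that depend on how similar an item is to its neighbours. The sketch below shows that basic rule only, not the abstraction algorithm proposed in the paper; the constants k1, k2 and alpha, the neighbourhood size, and the one-dimensional items are all illustrative assumptions.

```python
def local_similarity(item, neighbours, alpha=1.0, cells=9):
    """Average similarity of an item to the items in its neighbourhood.

    alpha scales dissimilarity and cells is the neighbourhood size;
    both values are assumptions made for this sketch.
    """
    total = sum(1.0 - abs(item - other) / alpha for other in neighbours)
    return max(0.0, total / cells)


def pick_probability(f, k1=0.1):
    """An unladen ant is likely to pick up an item that fits poorly (low f)."""
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.3):
    """A laden ant is likely to drop an item where it fits well (high f)."""
    return (f / (k2 + f)) ** 2


# Hypothetical one-dimensional items: one surrounded by similar values,
# one surrounded by very different values.
f_similar = local_similarity(1.0, [1.0, 1.1, 0.9], cells=3)
f_dissimilar = local_similarity(1.0, [5.0, 6.0], cells=3)
print(pick_probability(f_similar), pick_probability(f_dissimilar))
```

Repeated over many random ant moves, these two rules tend to gather similar items into piles, which is the effect the clustering algorithm exploits.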
Eher, R; Windhaber, J; Rau, H; Schmitt, M; Kellner, E
2000-05-01
Conflict and conflict resolution in intimate relationships are not only among the most important factors influencing relationship satisfaction but are also associated with clinical symptoms. Styles of conflict were assessed in patients suffering from panic disorder with and without agoraphobia, in alcoholics and in patients suffering from rheumatoid arthritis. 176 patients and healthy controls filled out the Styles of Conflict Inventory and questionnaires concerning severity of clinical symptoms. A cluster analysis revealed 5 types of conflict management. Healthy controls showed predominantly assertive and constructive styles, whereas patients with panic disorder showed high levels of cognitive and/or behavioral aggression. Alcoholics showed high levels of repressed aggression, and patients with rheumatoid arthritis often did not exhibit any aggression during conflict. Thus, 5 clusters of conflict patterns were identified by cluster analysis. Each patient group showed a considerably different pattern of conflict management.
Ávila-Jiménez, María Luisa; Coulson, Stephen James
2011-01-01
We aimed to describe the main Arctic biogeographical patterns of the Collembola, and analyze historical factors and current climatic regimes determining Arctic collembolan species distribution. Furthermore, we aimed to identify possible dispersal routes, colonization sources and glacial refugia for Arctic Collembola. We implemented a Gaussian Mixture Clustering method on species distribution ranges and applied a distance-based parametric bootstrap test on presence-absence collembolan species distribution data. Additionally, multivariate analysis was performed considering species distributions, biodiversity, cluster distribution and environmental factors (temperature and precipitation). No clear relation was found between current climatic regimes and species distribution in the Arctic. Gaussian Mixture Clustering found common elements within Siberian areas, Atlantic areas, the Canadian Arctic, a mid-Siberian cluster and specific Beringian elements, following the same pattern previously described, using a variety of molecular methods, for Arctic plants. Species distributions hence indicate the influence of recent glacial history, as LGM glacial refugia (mid-Siberia and Beringia) and major dispersal routes to high Arctic island groups can be identified. Endemic species are found in the high Arctic, but no specific biogeographical pattern can be clearly identified as a sign of high Arctic glacial refugia. Ocean current patterns are suggested as being an important factor shaping the distribution of Arctic Collembola, which is consistent with Antarctic studies in collembolan biogeography.
The clear relations between cluster distribution and geographical areas considering their recent glacial history, lack of relationship of species distribution with current climatic regimes, and consistency with previously described Arctic patterns in a series of organisms inferred using a variety of methods, suggest that historical phenomena shaping contemporary collembolan distribution can be inferred through biogeographical analysis. PMID:26467728
Franklyn-Miller, A; Richter, C; King, E; Gore, S; Moran, K; Strike, S; Falvey, E C
2017-01-01
Background Athletic groin pain (AGP) is prevalent in sports involving repeated accelerations, decelerations, kicking and change-of-direction movements. Clinical and radiological examinations lack the ability to assess pathomechanics of AGP, but three-dimensional biomechanical movement analysis may be an important innovation. Aim The primary aim was to describe and analyse movements used by patients with AGP during a maximum effort change-of-direction task. The secondary aim was to determine if specific anatomical diagnoses were related to a distinct movement strategy. Methods 322 athletes with a current symptom of chronic AGP participated. Structured and standardised clinical assessments and radiological examinations were performed on all participants. Additionally, each participant performed multiple repetitions of a planned maximum effort change-of-direction task during which whole body kinematics were recorded. Kinematic and kinetic data were examined using continuous waveform analysis techniques in combination with a subgroup design that used gap statistic and hierarchical clustering. Results Three subgroups (clusters) were identified. Kinematic and kinetic measures of the clusters differed strongly in patterns observed in thorax, pelvis, hip, knee and ankle. Cluster 1 (40%) was characterised by increased ankle eversion, external rotation and knee internal rotation and greater knee work. Cluster 2 (15%) was characterised by increased hip flexion, pelvis contralateral drop, thorax tilt and increased hip work. Cluster 3 (45%) was characterised by high ankle dorsiflexion, thorax contralateral drop, ankle work and prolonged ground contact time. No correlation was observed between movement clusters and clinically palpated location of the participant's pain. 
Conclusions We identified three distinct movement strategies among athletes with long-standing groin pain during a maximum effort change-of-direction task. These movement strategies were not related to clinical assessment findings but highlighted targets for rehabilitation in response to possible propagative mechanisms. Trial registration number NCT02437942, pre results. PMID:28209597
Evolution of the degree of substructures in simulated galaxy clusters
NASA Astrophysics Data System (ADS)
De Boni, Cristiano; Böhringer, Hans; Chon, Gayoung; Dolag, Klaus
2018-05-01
We study the evolution of substructure in the mass distribution with mass, redshift and radius in a sample of simulated galaxy clusters. The sample, containing 1226 objects, spans the mass range M200 = 10^14 - 1.74 × 10^15 M⊙ h^-1 in six redshift bins from z = 0 to z = 1.179. We consider three different diagnostics: 1) subhalos identified with SUBFIND; 2) overdense regions localized by dividing the cluster into octants; 3) offset between the potential minimum and the center of mass. The octant analysis is a new method that we introduce in this work. We find that none of the diagnostics indicate a correlation between the mass of the cluster and the fraction of substructures. On the other hand, all the diagnostics suggest an evolution of substructures with redshift. For SUBFIND halos, the mass fraction is constant with redshift at Rvir, but shows a mild evolution at R200 and R500. Also, the fraction of clusters with at least one subhalo more massive than one thirtieth of the total mass is less than 20%. Our new method based on the octants returns a mass fraction in substructures which has a strong evolution with redshift at all radii. The offsets also evolve strongly with redshift. We also find a strong correlation for individual clusters between the offset and the fraction of substructures identified with the octant analysis. Our work puts strong constraints on the amount of substructures we expect to find in galaxy clusters and on their evolution with redshift.
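Two of the diagnostics above can be sketched in a few lines: the offset between a given potential minimum and the center of mass, and the mass found in each of the eight octants around a chosen center. This is only an illustration under simple assumptions (point particles, a potential minimum supplied by the caller, hypothetical coordinates); it is not the estimator used in the paper, which the abstract does not spell out.

```python
import math


def center_of_mass(positions, masses):
    """Mass-weighted mean position of 3-D particles."""
    total = sum(masses)
    return tuple(
        sum(m * p[k] for p, m in zip(positions, masses)) / total
        for k in range(3)
    )


def com_offset(potential_minimum, positions, masses):
    """Distance between the potential minimum and the center of mass."""
    return math.dist(potential_minimum, center_of_mass(positions, masses))


def octant_masses(center, positions, masses):
    """Total mass in each of the 8 octants around `center`.

    The octant index sets bit k when coordinate k is >= the center.
    """
    totals = [0.0] * 8
    for p, m in zip(positions, masses):
        index = sum(1 << k for k in range(3) if p[k] >= center[k])
        totals[index] += m
    return totals


# Hypothetical particles with unit mass.
positions = [(1, 1, 1), (-1, -1, -1), (1, -1, 1), (3, 3, 3)]
masses = [1.0, 1.0, 1.0, 1.0]
center = (0.0, 0.0, 0.0)  # assumed position of the potential minimum
print(com_offset(center, positions, masses))
print(octant_masses(center, positions, masses))
```

An octant holding much more mass than the others at the same radius would be a candidate overdense region; how the excess is turned into a substructure fraction is left open here.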
Lochner, Christine; Hemmings, Sian M J; Kinnear, Craig J; Niehaus, Dana J H; Nel, Daniel G; Corfield, Valerie A; Moolman-Smook, Johanna C; Seedat, Soraya; Stein, Dan J
2005-01-01
Comorbidity of certain obsessive-compulsive spectrum disorders (OCSDs; such as Tourette's disorder) in obsessive-compulsive disorder (OCD) may serve to define important OCD subtypes characterized by differing phenomenology and neurobiological mechanisms. Comorbidity of the putative OCSDs in OCD has, however, not often been systematically investigated. The Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Axis I Disorders-Patient Version as well as a Structured Clinical Interview for Putative OCSDs (SCID-OCSD) were administered to 210 adult patients with OCD (N = 210, 102 men and 108 women; mean age, 35.7 +/- 13.3). A subset of Caucasian subjects (with OCD, n = 171; control subjects, n = 168), including subjects from the genetically homogeneous Afrikaner population (with OCD, n = 77; control subjects, n = 144), was genotyped for polymorphisms in genes involved in monoamine function. Because the items of the SCID-OCSD are binary (present/absent), a cluster analysis (Ward's method) using the items of SCID-OCSD was conducted. The association of identified clusters with demographic variables (age, gender), clinical variables (age of onset, obsessive-compulsive symptom severity and dimensions, level of insight, temperament/character, treatment response), and monoaminergic genotypes was examined. Cluster analysis of the OCSDs in our sample of patients with OCD identified 3 separate clusters at a 1.1 linkage distance level. The 3 clusters were named as follows: (1) "reward deficiency" (including trichotillomania, Tourette's disorder, pathological gambling, and hypersexual disorder), (2) "impulsivity" (including compulsive shopping, kleptomania, eating disorders, self-injury, and intermittent explosive disorder), and (3) "somatic" (including body dysmorphic disorder and hypochondriasis).
Several significant associations were found between cluster scores and other variables; for example, cluster I scores were associated with earlier age of onset of OCD and the presence of tics, cluster II scores were associated with female gender and childhood emotional abuse, and cluster III scores were associated with less insight and with somatic obsessions and compulsions. However, none of these clusters were associated with any particular genetic variant. Analysis of comorbid OCSDs in OCD suggested that these lie on a number of different dimensions. These dimensions are partially consistent with previous theoretical approaches taken toward classifying OCD spectrum disorders. The lack of genetic validation of these clusters in the present study may indicate the involvement of other, as yet untested, genes. Further genetic and cluster analyses of comorbid OCSDs in OCD may ultimately contribute to a better delineation of OCD endophenotypes.
Assessment of stem cell differentiation based on genome-wide expression profiles.
Godoy, Patricio; Schmidt-Heck, Wolfgang; Hellwig, Birte; Nell, Patrick; Feuerborn, David; Rahnenführer, Jörg; Kattler, Kathrin; Walter, Jörn; Blüthgen, Nils; Hengstler, Jan G
2018-07-05
In recent years, protocols have been established to differentiate stem and precursor cells into more mature cell types. However, progress in this field has been hampered by difficulties in assessing the differentiation status of stem cell-derived cells in an unbiased manner. Here, we present an analysis pipeline based on published data and methods to quantify the degree of differentiation and to identify transcriptional control factors explaining differences from the intended target cells or tissues. The pipeline requires RNA-Seq or gene array data of the stem cell starting population, derived 'mature' cells and primary target cells or tissue. It consists of a principal component analysis to represent global expression changes and to identify possible problems of the dataset that require special attention, such as batch effects; clustering techniques to identify gene groups with similar features; over-representation analysis to characterize biological motifs and transcriptional control factors of the identified gene clusters; and metagenes as well as gene regulatory networks for quantitative cell-type assessment and identification of influential transcription factors. Possibilities and limitations of the analysis pipeline are illustrated using the example of human embryonic stem cells and human induced pluripotent cells used to generate 'hepatocyte-like cells'. The pipeline quantifies the degree of incomplete differentiation as well as remaining stemness and identifies unwanted features, such as colon- and fibroblast-associated gene clusters that are absent in real hepatocytes but typically induced by currently available differentiation protocols. Finally, transcription factors responsible for incomplete and unwanted differentiation are identified. The proposed method is widely applicable and allows an unbiased and quantitative assessment of stem cell-derived cells. This article is part of the theme issue 'Designer human tissue: coming to a lab near you'.
© 2018 The Author(s).
Qin, Qianqian; Guo, Wei; Tang, Weiming; Mahapatra, Tanmay; Wang, Liyan; Zhang, Nanci; Ding, Zhengwei; Cai, Chang; Cui, Yan; Sun, Jiangping
2017-04-01
Studies have shown a recent upsurge in human immunodeficiency virus (HIV) burden among men who have sex with men (MSM) in China, especially in urban areas. For intervention planning and resource allocation, spatial analyses of HIV/AIDS case-clusters were required to identify epidemic foci and trends among MSM in China. Information regarding MSM recorded as HIV/AIDS cases during 2006-2015 was extracted from the National Case Reporting System. Demographic trends were determined through Cochran-Armitage trend tests. Distribution of case-clusters was examined using spatial autocorrelation. Spatial-temporal scan was used to detect disease clustering. Spatial correlations between cases and socioenvironmental factors were determined by spatial regression. Between 2006 and 2015, in China, 120 371 HIV/AIDS cases were identified among MSM. Newly identified HIV/AIDS cases among self-reported MSM increased from 487 cases in 2006 to >30 000 cases in 2015. Among those HIV/AIDS cases recorded during 2006-2015, 47.0% were 20-29 years old and 24.9% were aged 30-39 years. Based on clusters of HIV/AIDS cases identified through spatial analysis, the epidemic was concentrated among MSM in large cities. Spatial-temporal clusters contained municipalities, provincial capitals, and main cities such as Beijing, Shanghai, Chongqing, Chengdu, and Guangzhou. Spatial regression analysis showed that sociodemographic indicators such as population density, per capita gross domestic product, and number of county-level medical institutions had statistically significant positive correlations with HIV/AIDS among MSM. Assorted spatial analyses revealed an increasingly concentrated HIV epidemic among young MSM in Chinese cities, calling for targeted health education and intensive interventions at an early age. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, labile trace element contents - hence thermal histories - of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Joint fMRI analysis and subject clustering using sparse dictionary learning
NASA Astrophysics Data System (ADS)
Kim, Seung-Jun; Dontaraju, Krishna K.
2017-08-01
Multi-subject fMRI data analysis methods based on sparse dictionary learning are proposed. In addition to identifying the component spatial maps by exploiting the sparsity of the maps, clusters of the subjects are learned by postulating that the fMRI volumes admit a subspace clustering structure. Furthermore, in order to tune the associated hyper-parameters systematically, a cross-validation strategy is developed based on entry-wise sampling of the fMRI dataset. Efficient algorithms for solving the proposed constrained dictionary learning formulations are developed. Numerical tests performed on synthetic fMRI data show promising results and provide insights into the proposed technique.
Han, Bangxing; Peng, Huasheng; Yan, Hui
2016-01-01
Mugua is a common Chinese herbal medicine. There are three main medicinal places of origin in China (Xuancheng City, Anhui Province; Qijiang District, Chongqing City; and Yichang City, Hubei Province) and one place of origin suitable for food use (Linyi City, Shandong Province). The aim was to construct a qualitative analytical method to identify the origin of medicinal Mugua by near infrared spectroscopy (NIRS). A partial least squares discriminant analysis (PLSDA) model was established after the original spectra of Mugua derived from five different origins were preprocessed. Moreover, hierarchical cluster analysis was performed. The results showed that the PLSDA model was established. According to the relationship between the origin-related important scores and wavenumber, and to K-means cluster analysis, Mugua derived from different origins was effectively identified. NIRS technology can quickly and accurately identify the origin of Mugua and provides a new method and technology for the identification of Chinese medicinal materials. After preprocessing by D1+autoscale, more peaks appeared in the near infrared spectrum of Mugua. Five latent variable scores could reflect the information related to the place of origin of Mugua. Origins of Mugua were well distinguished by K-means clustering analysis. Abbreviations used: TCM: Traditional Chinese Medicine, NIRS: Near infrared spectroscopy, SG: Savitzky-Golay smoothness, D1: First derivative, D2: Second derivative, SNV: Standard normal variable transformation, MSC: Multiplicative scatter correction, PLSDA: Partial least squares discriminant analysis, LV: Latent variable, VIP scores: Important score.
fluff: exploratory analysis and visualization of high-throughput sequencing data
Georgiou, Georgios
2016-01-01
Summary. In this article we describe fluff, a software package that allows for simple exploration, clustering and visualization of high-throughput sequencing data mapped to a reference genome. The package contains three command-line tools to generate publication-quality figures in an uncomplicated manner using sensible defaults. Genome-wide data can be aggregated, clustered and visualized in a heatmap, according to different clustering methods. This includes a predefined setting to identify dynamic clusters between different conditions or developmental stages. Alternatively, clustered data can be visualized in a bandplot. Finally, fluff includes a tool to generate genomic profiles. As command-line tools, the fluff programs can easily be integrated into standard analysis pipelines. The installation is straightforward and documentation is available at http://fluff.readthedocs.org. Availability. fluff is implemented in Python and runs on Linux. The source code is freely available for download at https://github.com/simonvh/fluff. PMID:27547532
A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.
Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia
2017-10-15
In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.
Characteristic and factors of competitive maritime industry clusters in Indonesia
NASA Astrophysics Data System (ADS)
Marlyana, N.; Tontowi, A. E.; Yuniarto, H. A.
2017-12-01
Indonesia is situated in a strategic position between two oceans and is therefore identified as a maritime state. This fact opens a big opportunity to build a competitive maritime industry. However, potential factors to boost the competitive maritime industry still need to be explored. The objective of this paper is therefore to determine the main characteristics and potential factors of a competitive maritime industry cluster. Qualitative analysis based on literature review has been carried out in two aspects. First, a benchmarking analysis was conducted to distinguish the most relevant factors of maritime clusters in several countries in Europe (Norway, Spain, South West of England) and Asia (China, South Korea, Malaysia). Seven key dimensions were used for this benchmarking. Second, the competitiveness of maritime clusters in Indonesia was diagnosed through a reconceptualization of Porter’s Diamond model. Four interlinked advanced factors were found in and between companies within clusters, which can be influenced in a proactive way by government.
Knoble, Naomi B; Alderfer, Melissa A; Hossain, Md Jobayer
2016-10-01
Socioeconomic status (SES) is a complex construct of multiple indicators, known to impact cancer outcomes, but has not been adequately examined among pediatric AML patients. This study aimed to identify the patterns of co-occurrence of multiple community-level SES indicators and to explore associations between various patterns of these indicators and pediatric AML mortality risk. A nationally representative US sample of 3651 pediatric AML patients, aged 0-19 years at diagnosis was drawn from 17 Surveillance, Epidemiology, and End Results (SEER) database registries created between 1973 and 2012. Factor analysis, cluster analysis, stratified univariable and multivariable Cox proportional hazards models were used. Four SES factors accounting for 87% of the variance in SES indicators were identified: F1) economic/educational disadvantage, less immigration; F2) immigration-related features (foreign-born, language-isolation, crowding), less mobility; F3) housing instability; and, F4) absence of moving. F1 and F3 showed elevated risk of mortality, adjusted hazards ratios (aHR) (95% CI): 1.07(1.02-1.12) and 1.05(1.00-1.10), respectively. Seven SES-defined cluster groups were identified. Cluster 1 (low economic/educational disadvantage, few immigration-related features, and residential-stability) showed the minimum risk of mortality. Compared to Cluster 1, Cluster 3 (high economic/educational disadvantage, high-mobility) and Cluster 6 (moderately-high economic/educational disadvantages, housing-instability and immigration-related features) exhibited substantially greater risk of mortality, aHR(95% CI)=1.19(1.0-1.4) and 1.23 (1.1-1.5), respectively. Factors of correlated SES-indicators and their pattern-based groups demonstrated differential risks in the pediatric AML mortality indicating the need of special public-health attention in areas with economic-educational disadvantages, housing-instability and immigration-related features. Copyright © 2016 Elsevier Ltd. 
2013-10-01
correct group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on...centering of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity met...A) The 108 differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with
NASA Astrophysics Data System (ADS)
Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.
2014-04-01
A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.
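The fuzzy c-means step named above can be illustrated in one dimension. The sketch below alternates the standard membership and centroid updates; the starting centroids are passed in by hand, whereas the abstract derives them from a kernel density estimate, and the log-resistivity values are hypothetical.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy membership of value x in each cluster (memberships sum to 1)."""
    dists = [abs(x - c) for c in centers]
    if any(d == 0.0 for d in dists):
        # x sits exactly on a centroid: share full membership among those.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


def fuzzy_c_means(data, centers, m=2.0, iterations=50):
    """Alternate membership and centroid updates a fixed number of times."""
    centers = list(centers)
    for _ in range(iterations):
        u = [memberships(x, centers, m) for x in data]
        centers = [
            sum(u[k][i] ** m * data[k] for k in range(len(data)))
            / sum(u[k][i] ** m for k in range(len(data)))
            for i in range(len(centers))
        ]
    return centers, [memberships(x, centers, m) for x in data]


# Hypothetical log10 resistivities: a conductive and a resistive population.
log_resistivity = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
centers, u = fuzzy_c_means(log_resistivity, centers=[0.0, 4.0])
print([round(c, 2) for c in centers])
```

Cells whose largest membership is well below 1 lie near the boundary between populations, which is where a hard classifier would hide the uncertainty.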
Bolstad, Heather M; Botelho, Danielle J; Wood, Matthew J
2010-10-01
Fe-S cluster biogenesis is of interest to many fields, including bioenergetics and gene regulation. The CSD system is one of three Fe-S cluster biogenesis systems in E. coli and is comprised of the cysteine desulfurase CsdA, the sulfur acceptor protein CsdE, and the E1-like protein CsdL. The biological role, biochemical mechanism, and protein targets of the system remain uncharacterized. Here we present that the active site CsdE C61 has a lowered pK(a) value of 6.5, which is nearly identical to that of C51 in the homologous SufE protein and which is likely critical for its function. We observed that CsdE forms disulfide bonds with multiple proteins and identified the proteins that copurify with CsdE. The identification of Fe-S proteins and both putative and established Fe-S cluster assembly (ErpA, glutaredoxin-3, glutaredoxin-4) and sulfur trafficking (CsdL, YchN) proteins supports the two-pathway model, in which the CSD system is hypothesized to synthesize both Fe-S clusters and other sulfur-containing cofactors. We suggest that the identified Fe-S cluster assembly proteins may be the scaffold and/or shuttle proteins for the CSD system. By comparison with previous analysis of SufE, we demonstrate that there is some overlap in the CsdE and SufE interactomes.
Ye, Weimin; Robbins, R. T.
2004-01-01
Hierarchical cluster analysis based on female morphometric character means including body length, distance from vulva opening to anterior end, head width, odontostyle length, esophagus length, body width, tail length, and tail width was used to examine the morphometric relationships and create dendrograms for (i) 62 populations belonging to 9 Longidorus species from Arkansas, (ii) 137 published Longidorus species, and (iii) 137 published Longidorus species plus 86 populations of 16 Longidorus species from Arkansas and various other locations by using JMP 4.02 software (SAS Institute, Cary, NC). Cluster analysis dendrograms visually illustrated the grouping and morphometric relationships of the species and populations. The analysis provided a computerized statistical approach that helps to identify and distinguish species, indicates morphometric relationships among species, and assists with new species diagnosis. Preliminary species identification can be accomplished by running cluster analysis for unknown species together with the data matrix of known published Longidorus species. PMID:19262809
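The idea of clustering populations by their morphometric means can be sketched with a small agglomerative procedure. Average linkage is used here purely for illustration, since the abstract does not state which linkage was chosen in JMP; the population labels and the two standardized characters are hypothetical.

```python
import math
from itertools import combinations


def average_distance(points, members_a, members_b):
    """Mean Euclidean distance between all cross-cluster pairs."""
    pairs = [(i, j) for i in members_a for j in members_b]
    return sum(math.dist(points[i], points[j]) for i, j in pairs) / len(pairs)


def average_linkage(points, labels):
    """Return the merge history as (left labels, right labels, distance)."""
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda ab: average_distance(points, clusters[ab[0]], clusters[ab[1]]),
        )
        height = average_distance(points, clusters[a], clusters[b])
        merges.append((
            sorted(labels[i] for i in clusters[a]),
            sorted(labels[i] for i in clusters[b]),
            height,
        ))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Hypothetical standardized means (e.g. body length, odontostyle length).
labels = ["A1", "A2", "B1", "B2"]
points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.5, 5.0)]
merges = average_linkage(points, labels)
for left, right, height in merges:
    print(left, right, round(height, 2))
```

The merge heights are what a dendrogram plots; an unknown population added to the matrix would join the branch of the species it most resembles.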
Identifying clusters of active transportation using spatial scan statistics.
Huang, Lan; Stinchcomb, David G; Pickle, Linda W; Dill, Jennifer; Berrigan, David
2009-08-01
There is an intense interest in the possibility that neighborhood characteristics influence active transportation such as walking or biking. The purpose of this paper is to illustrate how a spatial cluster identification method can evaluate the geographic variation of active transportation and identify neighborhoods with unusually high/low levels of active transportation. Self-reported walking/biking prevalence, demographic characteristics, street connectivity variables, and neighborhood socioeconomic data were collected from respondents to the 2001 California Health Interview Survey (CHIS; N=10,688) in Los Angeles County (LAC) and San Diego County (SDC). Spatial scan statistics were used to identify clusters of high or low prevalence (with and without age-adjustment) and the quantity of time spent walking and biking. The data, a subset from the 2001 CHIS, were analyzed in 2007-2008. Geographic clusters of significantly high or low prevalence of walking and biking were detected in LAC and SDC. Structural variables such as street connectivity and shorter block lengths are consistently associated with higher levels of active transportation, but associations between active transportation and socioeconomic variables at the individual and neighborhood levels are mixed. Only one cluster with less time spent walking and biking among walkers/bikers was detected in LAC, and this was of borderline significance. Age-adjustment affects the clustering pattern of walking/biking prevalence in LAC, but not in SDC. The use of spatial scan statistics to identify significant clustering of health behaviors such as active transportation adds to the more traditional regression analysis that examines associations between behavior and environmental factors by identifying specific geographic areas with unusual levels of the behavior independent of predefined administrative units.
Identifying Clusters of Active Transportation Using Spatial Scan Statistics
Huang, Lan; Stinchcomb, David G.; Pickle, Linda W.; Dill, Jennifer; Berrigan, David
2009-01-01
Background There is an intense interest in the possibility that neighborhood characteristics influence active transportation such as walking or biking. The purpose of this paper is to illustrate how a spatial cluster identification method can evaluate the geographic variation of active transportation and identify neighborhoods with unusually high/low levels of active transportation. Methods Self-reported walking/biking prevalence, demographic characteristics, street connectivity variables, and neighborhood socioeconomic data were collected from respondents to the 2001 California Health Interview Survey (CHIS; N=10,688) in Los Angeles County (LAC) and San Diego County (SDC). Spatial scan statistics were used to identify clusters of high or low prevalence (with and without age-adjustment) and the quantity of time spent walking and biking. The data, a subset from the 2001 CHIS, were analyzed in 2007–2008. Results Geographic clusters of significantly high or low prevalence of walking and biking were detected in LAC and SDC. Structural variables such as street connectivity and shorter block lengths are consistently associated with higher levels of active transportation, but associations between active transportation and socioeconomic variables at the individual and neighborhood levels are mixed. Only one cluster with less time spent walking and biking among walkers/bikers was detected in LAC, and this was of borderline significance. Age-adjustment affects the clustering pattern of walking/biking prevalence in LAC, but not in SDC. Conclusions The use of spatial scan statistics to identify significant clustering of health behaviors such as active transportation adds to the more traditional regression analysis that examines associations between behavior and environmental factors by identifying specific geographic areas with unusual levels of the behavior independent of predefined administrative units. PMID:19589451
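The scan-statistic idea in this abstract can be sketched in a few lines. The following is a minimal illustration only, assuming a Bernoulli model (walker/biker versus not) and circular windows centred on each area; the function names, the 50% population cap, and the toy data are assumptions and not taken from the paper. Significance testing, which is normally done by Monte Carlo replication, is omitted.

```python
import math

def _xlogy(x, y):
    """Return x*log(y), treating 0*log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)

def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for one candidate zone under a Bernoulli model.

    c, n: cases and respondents inside the zone
    C, N: cases and respondents in the whole study area
    """
    if n == 0 or n == N:
        return 0.0
    def loglik(k, m):
        p = k / m
        return _xlogy(k, p) + _xlogy(m - k, 1 - p)
    return loglik(c, n) + loglik(C - c, N - n) - loglik(C, N)

def most_likely_cluster(areas, max_fraction=0.5):
    """areas: list of (x, y, cases, respondents). Returns (llr, member indices).

    Grows a circular window around each area by adding nearest neighbours
    until the window would hold more than max_fraction of all respondents.
    High- and low-prevalence zones are both scored (two-sided scan).
    """
    C = sum(a[2] for a in areas)
    N = sum(a[3] for a in areas)
    best = (0.0, [])
    for xi, yi, _, _ in areas:
        order = sorted(range(len(areas)),
                       key=lambda j: math.hypot(areas[j][0] - xi, areas[j][1] - yi))
        c = n = 0
        members = []
        for j in order:
            c += areas[j][2]
            n += areas[j][3]
            if n > max_fraction * N:
                break
            members.append(j)
            llr = bernoulli_llr(c, n, C, N)
            if llr > best[0]:
                best = (llr, sorted(members))
    return best

# Hypothetical toy data: two adjacent areas with high walking prevalence.
toy_areas = [(0, 0, 9, 10), (1, 0, 8, 10), (10, 0, 1, 10),
             (11, 0, 2, 10), (20, 0, 1, 10), (21, 0, 1, 10)]
llr, members = most_likely_cluster(toy_areas)
```

Because the windows are built from distances rather than administrative boundaries, the detected zone is independent of predefined units, which is the property the abstract emphasises.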
Phenotypes determined by cluster analysis in severe or difficult-to-treat asthma.
Schatz, Michael; Hsu, Jin-Wen Y; Zeiger, Robert S; Chen, Wansu; Dorenbaum, Alejandro; Chipps, Bradley E; Haselkorn, Tmirah
2014-06-01
Asthma phenotyping can facilitate understanding of disease pathogenesis and potential targeted therapies. The objective was to further characterize the distinguishing features of phenotypic groups in difficult-to-treat asthma. Children ages 6-11 years (n = 518) and adolescents and adults ages ≥12 years (n = 3612) with severe or difficult-to-treat asthma from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study were evaluated in this post hoc cluster analysis. Analyzed variables included sex, race, atopy, age of asthma onset, smoking (adolescents and adults), passive smoke exposure (children), obesity, and aspirin sensitivity. Cluster analysis used the hierarchical clustering algorithm with the Ward minimum variance method. The results were compared among clusters by χ² analysis; variables with significant (P < .05) differences among clusters were considered as distinguishing feature candidates. Associations among clusters and asthma-related health outcomes were assessed in multivariable analyses by adjusting for socioeconomic status, environmental exposures, and intensity of therapy. Five clusters were identified in each age stratum. Sex, atopic status, and nonwhite race were distinguishing variables in both strata; passive smoke exposure was distinguishing in children and aspirin sensitivity in adolescents and adults. Clusters were not related to outcomes in children, but 2 adult and adolescent clusters distinguished by nonwhite race and aspirin sensitivity manifested poorer quality of life (P < .0001), and the aspirin-sensitive cluster experienced more frequent asthma exacerbations (P < .0001). Distinct phenotypes appear to exist in patients with severe or difficult-to-treat asthma, and these are related to outcomes in adolescents and adults but not in children. The study of the therapeutic implications of these phenotypes is warranted. Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.
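The Ward minimum variance method named in this abstract merges, at each step, the two clusters whose union gives the smallest increase in within-cluster sum of squares. The sketch below is a naive pure-Python illustration of that criterion, assuming the patient variables have already been coded numerically (for example 0/1 indicators); the function name and the toy data are hypothetical and the study's actual software is not stated in the abstract.

```python
def ward_clusters(points, k):
    """Agglomerate points (lists of floats) into k clusters by Ward's criterion."""
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        pairs = [(merge_cost(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical toy data: two well-separated groups of three observations.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
groups = ward_clusters(toy_points, 2)
```

This version recomputes every pairwise cost at each step, so it is only suitable for small examples; production implementations update the costs incrementally.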
Brain structure and function correlates of cognitive subtypes in schizophrenia.
Geisler, Daniel; Walton, Esther; Naylor, Melissa; Roessner, Veit; Lim, Kelvin O; Charles Schulz, S; Gollub, Randy L; Calhoun, Vince D; Sponheim, Scott R; Ehrlich, Stefan
2015-10-30
Stable neuropsychological deficits may provide a reliable basis for identifying etiological subtypes of schizophrenia. The aim of this study was to identify clusters of individuals with schizophrenia based on dimensions of neuropsychological performance, and to characterize their neural correlates. We acquired neuropsychological data as well as structural and functional magnetic resonance imaging from 129 patients with schizophrenia and 165 healthy controls. We derived eight cognitive dimensions and subsequently applied a cluster analysis to identify possible schizophrenia subtypes. Analyses suggested the following four cognitive clusters of schizophrenia: (1) Diminished Verbal Fluency, (2) Diminished Verbal Memory and Poor Motor Control, (3) Diminished Face Memory and Slowed Processing, and (4) Diminished Intellectual Function. The clusters were characterized by a specific pattern of structural brain changes in areas such as Wernicke's area, lingual gyrus and occipital face area, and hippocampus as well as differences in working memory-elicited neural activity in several fronto-parietal brain regions. Separable measures of cognitive function appear to provide a method for deriving cognitive subtypes meaningfully related to brain structure and function. Because the present study identified brain-based neural correlates of the cognitive clusters, the proposed groups of individuals with schizophrenia have some external validity. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
STAR FORMATION ACROSS THE W3 COMPLEX
DOE Office of Scientific and Technical Information (OSTI.GOV)
Román-Zúñiga, Carlos G.; Ybarra, Jason E.; Tapia, Mauricio
We present a multi-wavelength analysis of the history of star formation in the W3 complex. Using deep, near-infrared ground-based images combined with images obtained with Spitzer and Chandra observatories, we identified and classified young embedded sources. We identified the principal clusters in the complex and determined their structure and extension. We constructed extinction-limited samples for five principal clusters and constructed K-band luminosity functions that we compare with those of artificial clusters with varying ages. This analysis provided mean ages and possible age spreads for the clusters. We found that IC 1795, the centermost cluster of the complex, still hosts a large fraction of young sources with circumstellar disks. This indicates that star formation was active in IC 1795 as recently as 2 Myr ago, simultaneous to the star-forming activity in the flanking embedded clusters, W3-Main and W3(OH). A comparison with carbon monoxide emission maps indicates strong velocity gradients in the gas clumps hosting W3-Main and W3(OH) and shows small receding clumps of gas at IC 1795, suggestive of rapid gas removal (faster than the T Tauri timescale) in the cluster-forming regions. We discuss one possible scenario for the progression of cluster formation in the W3 complex. We propose that early processes of gas collapse in the main structure of the complex could have defined the progression of cluster formation across the complex with relatively small age differences from one group to another. However, triggering effects could act as catalysts for enhanced efficiency of formation at a local level, in agreement with previous studies.
Chen, Wen; Zhou, Fangjing; Hall, Brian J; Wang, Yu; Latkin, Carl; Ling, Li; Tucker, Joseph D
2016-01-01
Objectives To assess associations between residence location, risky sexual behaviours and sexually transmitted diseases (STDs) among adults living in Guangzhou, China. Methods Data were obtained from 751 Chinese adults aged 18–59 years in Guangzhou, China, recruited by stratified random sampling, and were analysed with spatial epidemiological methods. Face-to-face household interviews were conducted to collect self-report data on risky sexual behaviours and diagnosed STDs. Kulldorff’s spatial scan statistic was used to detect the spatial distribution and clusters of risky sexual behaviours and STDs. The presence and location of statistically significant clusters were mapped in the study areas using ArcGIS software. Results The prevalence of self-reported risky sexual behaviours was between 5.1% and 50.0%. The self-reported lifetime prevalence of diagnosed STDs was 7.06%. Anal intercourse clustered in an area located along the border within the rural–urban continuum (p=0.001). High-rate clusters for use of alcohol or other drugs before sex (p=0.008) and for migrants who had lived in Guangzhou <1 year (p=0.007) overlapped this cluster. Excess cases of unprotected sex (p=0.031) overlapped the cluster for college students (p<0.001). Five of nine (55.6%) students who had sexual experience during the last 12 months were located in the cluster of unprotected sex. Conclusions Short-term migrants and college students reported more risky sexual behaviours. Programmes to increase safer sex within these communities to reduce the risk of STDs are warranted in Guangzhou. Spatial analysis identified geographical clusters of risky sexual behaviours, which is critical for optimising surveillance and targeting control measures for these locations in the future. PMID:26843400
[Study on HPLC fingerprint of Oldenlandia diffusa].
Chen, Yan; Yao, Zhi-Hong; Dai, Yi; Cheng, Hong; Wen, Li-Rong; Zhou, Guang-Xiong; Yao, Xin-Sheng
2012-06-01
The aim was to establish an HPLC fingerprint chromatogram of Oldenlandia diffusa, coupled with chemometric methods, for the quality control of multiple batches of medicinal material. Separation was performed on a C18 column (4.6 mm × 250 mm, 5 μm) by gradient elution with acetonitrile-water (both containing 0.1‰ (V/V) acetic acid) as the mobile phase at a flow rate of 0.8 mL/min, with a detection wavelength of 238 nm and a column temperature of 30 °C. The HPLC fingerprint chromatogram of Oldenlandia diffusa was set up and the main characteristic peaks were identified by comparison with chemical reference substances. The quality of 22 batches of medicinal material was evaluated by similarity assay as well as principal component analysis (PCA) and cluster analysis. The established HPLC fingerprint chromatogram of Oldenlandia diffusa was specific, precise, reproducible and stable. Eleven peaks were chemically identified. The similarity of the 17 batches of Oldenlandia diffusa was clearly higher than that of the 5 batches of adulterants. PCA showed that the 17 batches of Oldenlandia diffusa fell within one domain and the 5 batches of adulterants were far apart from that domain. The cluster analysis of the 22 batches of medicinal material showed that the 17 batches of Oldenlandia diffusa formed one cluster while the 5 batches of adulterants were excluded. Further cluster analysis was carried out to assess the quality consistency of the 17 batches of Oldenlandia diffusa, and accordingly they were divided into 4 clusters. Combined with chemometric methods, the HPLC fingerprint chromatogram provides a method for evaluating the authenticity and quality of Oldenlandia diffusa, which helps to improve its overall quality control.
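A similarity assay of this kind can be illustrated with a short sketch. The abstract does not say which similarity measure was used; the code below assumes the cosine of the angle between peak-area vectors, a common choice for chromatographic fingerprints, and all names and numbers are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two peak-area vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_to_reference(batches, reference):
    """batches: dict of batch name -> peak areas, in the same peak order
    as `reference`. Returns dict of batch name -> similarity."""
    return {name: cosine_similarity(areas, reference)
            for name, areas in batches.items()}

# Hypothetical peak areas for three characteristic peaks.
reference_fingerprint = [10.0, 20.0, 30.0]
toy_batches = {
    "authentic_batch": [11.0, 19.0, 31.0],   # similar profile
    "adulterant_batch": [30.0, 2.0, 1.0],    # very different profile
}
scores = similarity_to_reference(toy_batches, reference_fingerprint)
```

A batch whose peak pattern matches the reference scores close to 1 regardless of overall concentration, which is why the measure separates authentic material from adulterants rather than strong from weak extracts.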
Le Vu, Stéphane; Ratmann, Oliver; Delpech, Valerie; Brown, Alison E; Gill, O Noel; Tostevin, Anna; Fraser, Christophe; Volz, Erik M
2018-06-01
Phylogenetic clustering of HIV sequences from a random sample of patients can reveal epidemiological transmission patterns, but interpretation is hampered by limited theoretical support, and the statistical properties of clustering analysis remain poorly understood. Alternatively, source attribution methods allow fitting of HIV transmission models and thereby quantify aspects of disease transmission. A simulation study was conducted to assess error rates of clustering methods for detecting transmission risk factors. We modeled HIV epidemics among men having sex with men and generated phylogenies comparable to those that can be obtained from HIV surveillance data in the UK. Clustering and source attribution approaches were applied to evaluate their ability to identify patient attributes as transmission risk factors. We find that commonly used methods show a misleading association between cluster size or odds of clustering and covariates that are correlated with time since infection, regardless of their influence on transmission. Clustering methods usually have higher error rates and lower sensitivity than the source attribution method for identifying transmission risk factors. However, neither method provides robust estimates of transmission risk ratios. The source attribution method can alleviate drawbacks of phylogenetic clustering, but formal population genetic modeling may be required to estimate quantitative transmission risk factors. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Optimization of self-interstitial clusters in 3C-SiC with genetic algorithm
NASA Astrophysics Data System (ADS)
Ko, Hyunseok; Kaczmarowski, Amy; Szlufarska, Izabela; Morgan, Dane
2017-08-01
Under irradiation, SiC develops damage commonly referred to as black spot defects, which are speculated to be self-interstitial atom clusters. To understand the evolution of these defect clusters and their impacts (e.g., through radiation induced swelling) on the performance of SiC in nuclear applications, it is important to identify the cluster composition, structure, and shape. In this work the genetic algorithm code StructOpt was utilized to identify groundstate cluster structures in 3C-SiC. The genetic algorithm was used to explore clusters of up to ∼30 interstitials of C-only, Si-only, and Si-C mixtures embedded in the SiC lattice. We performed the structure search using Hamiltonians from both density functional theory and empirical potentials. The thermodynamic stability of clusters was investigated in terms of their composition (with a focus on Si-only, C-only, and stoichiometric) and shape (spherical vs. planar), as a function of the cluster size (n). Our results suggest that large Si-only clusters are likely unstable, and clusters are predominantly C-only for n ≤ 10 and stoichiometric for n > 10. The results imply that there is an evolution of the shape of the most stable clusters, where small clusters are stable in more spherical geometries while larger clusters are stable in more planar configurations. We also provide an estimated energy vs. size relationship, E(n), for use in future analysis.
Fetterman, Christina D; Rannala, Bruce; Walter, Michael A
2008-09-24
Members of the forkhead gene family act as transcription regulators in biological processes including development and metabolism. The evolution of forkhead genes has not been widely examined and selection pressures at the molecular level influencing subfamily evolution and differentiation have not been explored. Here, in silico methods were used to examine selection pressures acting on the coding sequence of five multi-species FOX protein subfamily clusters; FoxA, FoxD, FoxI, FoxO and FoxP. Application of site models, which estimate overall selection pressures on individual codons throughout the phylogeny, showed that the amino acid changes observed were either neutral or under negative selection. Branch-site models, which allow estimated selection pressures along specified lineages to vary as compared to the remaining phylogeny, identified positive selection along branches leading to the FoxA3 and Protostomia clades in the FoxA cluster and the branch leading to the FoxO3 clade in the FoxO cluster. Residues that may differentiate paralogs were identified in the FoxA and FoxO clusters and residues that differentiate orthologs were identified in the FoxA cluster. Neutral amino acid changes were identified in the forkhead domain of the FoxA, FoxD and FoxP clusters while positive selection was identified in the forkhead domain of the Protostomia lineage of the FoxA cluster. A series of residues under strong negative selection adjacent to the N- and C-termini of the forkhead domain were identified in all clusters analyzed suggesting a new method for refinement of domain boundaries. Extrapolation of domains among cluster members in conjunction with selection pressure information allowed prediction of residue function in the FoxA, FoxO and FoxP clusters and exclusion of known domain function in residues of the FoxA and FoxI clusters. 
Consideration of selection pressures observed in conjunction with known functional information allowed prediction of residue function and refinement of domain boundaries. Identification of residues that differentiate orthologs and paralogs provided insight into the development and functional consequences of paralogs and forkhead subfamily composition differences among species. Overall we found that after gene duplication of forkhead family members, rapid differentiation and subsequent fixation of amino acid changes through negative selection has occurred.
Scharff, Lisa; Langan, Nicole; Rotter, Nancy; Scott-Sutherland, Jennifer; Schenck, Clorinda; Tayor, Neil; McDonald-Nolan, Lori; Masek, Bruce
2005-01-01
There has been a longstanding recognition that adult patients with chronic pain are not a homogenous population and that there are subgroups of patients who report high levels of distress and interpersonal difficulties as well as subgroups of patients who report little distress and high functioning. The purpose of the present study was to attempt to identify similar subgroups in a pediatric chronic pain population. The sample consisted of 117 children with chronic pain and their parents who were assessed in a multidisciplinary pain clinic during 2001. Participants completed a set of psychologic self-report questionnaires, as well as demographic and pain characteristic information. A cluster analysis was conducted to identify 3 distinct subgroups of patients to replicate similar studies of adult chronic pain sufferers. Overall, mean scores were within population norms on measures of distress and family functioning, with somatic symptoms at a level of clinical significance. The cluster analysis identified the 3 subgroups that were strikingly similar to those identified in adult chronic pain populations: one with high levels of distress and disability, another with relatively low scores on distress and disability, and a third group that scored in between the other 2 on these measures but with marked low family cohesion. The similarity of these subgroups to the adult chronic pain population subgroups as well as implications for future studies are discussed.
A cluster analysis investigation of workaholism as a syndrome.
Aziz, Shahnaz; Zickar, Michael J
2006-01-01
Workaholism has been conceptualized as a syndrome although there have been few tests that explicitly consider its syndrome status. The authors analyzed a three-dimensional scale of workaholism developed by Spence and Robbins (1992) using cluster analysis. The authors identified three clusters of individuals, one of which corresponded to Spence and Robbins's profile of the workaholic (high work involvement, high drive to work, low work enjoyment). Consistent with previously conjectured relations with workaholism, individuals in the workaholic cluster were more likely to label themselves as workaholics, more likely to have acquaintances label them as workaholics, and more likely to have lower life satisfaction and higher work-life imbalance. The importance of considering workaholism as a syndrome and the implications for effective interventions are discussed. Copyright 2006 APA.
NASA Astrophysics Data System (ADS)
Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.
2014-04-01
An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.
Description and typology of intensive Chios dairy sheep farms in Greece.
Gelasakis, A I; Valergakis, G E; Arsenos, G; Banos, G
2012-06-01
The aim was to assess the intensified dairy sheep farming systems of the Chios breed in Greece, establishing a typology that may properly describe and characterize them. The study included the total of the 66 farms of the Chios sheep breeders' cooperative Macedonia. Data were collected using a structured direct questionnaire for in-depth interviews, including questions properly selected to obtain a general description of farm characteristics and overall management practices. A multivariate statistical analysis was used on the data to obtain the most appropriate typology. Initially, principal component analysis was used to produce uncorrelated variables (principal components), which would be used for the consecutive cluster analysis. The number of clusters was decided using hierarchical cluster analysis, whereas, the farms were allocated in 4 clusters using k-means cluster analysis. The identified clusters were described and afterward compared using one-way ANOVA or a chi-squared test. The main differences were evident on land availability and use, facility and equipment availability and type, expansion rates, and application of preventive flock health programs. In general, cluster 1 included newly established, intensive, well-equipped, specialized farms and cluster 2 included well-established farms with balanced sheep and feed/crop production. In cluster 3 were assigned small flock farms focusing more on arable crops than on sheep farming with a tendency to evolve toward cluster 2, whereas cluster 4 included farms representing a rather conservative form of Chios sheep breeding with low/intermediate inputs and choosing not to focus on feed/crop production. 
In the studied set of farms, 4 different farmer attitudes were evident: 1) farming disrupts sheep breeding; feed should be purchased and economies of scale will decrease costs (mainly cluster 1), 2) only exercise/pasture land is necessary; at least part of the feed (pasture) must be home-grown to decrease costs (clusters 1 and 4), 3) providing pasture to sheep is essential; on-farm feed production decreases costs (mainly cluster 3), and 4) large-scale farming (feed production and cash crops) does not disrupt sheep breeding; all feed must be produced on-farm to decrease costs (mainly cluster 3). Conducting a profitability analysis among different clusters, exploring and discovering the most beneficial levels of intensified management and capital investment should now be considered. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
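The final allocation step described in this abstract, k-means on principal-component scores, can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm; it assumes the PCA scores are already computed and that the caller supplies initial centroids (for example from the preceding hierarchical analysis). The function name and toy data are hypothetical.

```python
def kmeans(points, initial_centers, iterations=100):
    """Lloyd's algorithm. Returns (labels, centers).

    points: list of coordinate lists (e.g. principal-component scores)
    initial_centers: starting centroids chosen by the caller
    """
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so centroids are stable too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical scores for four farms on two principal components.
toy_scores = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels, centers = kmeans(toy_scores, [[0.0, 0.0], [5.0, 5.0]])
```

Using hierarchical clustering to pick the number of clusters and k-means to allocate the farms, as the study did, avoids having to guess k in advance while still giving every farm a single final assignment.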
Choi, S; Ryu, E
2018-01-01
People with advanced lung cancer experience later symptoms after treatment that are related to poorer psychosocial and quality of life (QOL) outcomes. The purpose of this study was to identify the effect of symptom clusters and depression on the QOL of patients with advanced lung cancer. A sample of 178 patients with advanced lung cancer at the National Cancer Center in Korea completed a demographic questionnaire, the M.D. Anderson Symptom Inventory-Lung Cancer, the Center for Epidemiological Studies Depression Scale, and the Functional Assessment of Cancer Therapy-General scale. The most frequently experienced symptom was fatigue, anguish was the most severe symptom-associated distress, and 28.9% of participants were clinically depressed. Factor analysis was used to identify symptom clusters based on the severity of patients' symptom experiences. Three symptom clusters were identified: treatment-associated, lung cancer and psychological symptom clusters. The regression model found a significant negative impact on QOL for depression and the lung cancer symptom cluster. Age, as the control variable, was found to have a significant impact on QOL. Therefore, psychological screening and appropriate intervention are an essential part of advanced cancer care. Both pharmacological and non-pharmacological approaches for alleviating depression may help to improve the QOL of lung cancer patients. © 2016 John Wiley & Sons Ltd.
Profiles of Reactivity in Cocaine-Exposed Children
Schuetze, Pamela; Molnar, Danielle S.; Eiden, Rina D.
2012-01-01
This study explored the possibility that specific, theoretically consistent profiles of reactivity could be identified in a sample of cocaine-exposed infants and whether these profiles were associated with a range of infant and/or maternal characteristics. Cluster analysis was used to identify distinct groups of infants based on physiological, behavioral and maternal reported measures of reactivity. Five replicable clusters were identified which corresponded to 1) Dysregulated/High Maternal Report Reactors, 2) Low Behavioral Reactors, 3) High Reactors, 4) Optimal Reactors and 5) Dysregulated/Low Maternal Report Reactors. These clusters were associated with differences in prenatal cocaine exposure status, birthweight, maternal depressive symptoms, and maternal negative affect during mother-infant interactions. These results support the presence of distinct reactivity profiles among high risk infants recruited on the basis of prenatal cocaine exposure and demographically similar control group infants not exposed to cocaine. PMID:23204615
Microforms in gravel bed rivers: Formation, disintegration, and effects on bedload transport
Strom, K.; Papanicolaou, A.N.; Evangelopoulos, N.; Odeh, M.
2004-01-01
This research aims to advance current knowledge on cluster formation and evolution by tackling some of the aspects associated with cluster microtopography and the effects of clusters on bedload transport. The specific objectives of the study are (1) to identify the bed shear stress range in which clusters form and disintegrate, (2) to quantitatively describe the spacing characteristics and orientation of clusters with respect to flow characteristics, (3) to quantify the effects clusters have on the mean bedload rate, and (4) to assess the effects of clusters on the pulsating nature of bedload. In order to meet the objectives of this study, two main experimental scenarios, namely, Test Series A and B (20 experiments overall) are considered in a laboratory flume under well-controlled conditions. Series A tests are performed to address objectives (1) and (2) while Series B is designed to meet objectives (3) and (4). Results show that cluster microforms develop in uniform sediment at 1.25 to 2 times the Shields parameter of an individual particle and start disintegrating at about 2.25 times the Shields parameter. It is found that during an unsteady flow event, effects of clusters on bedload transport rate can be classified in three different phases: a sink phase where clusters absorb incoming sediment, a neutral phase where clusters do not affect bedload, and a source phase where clusters release particles. Clusters also increase the magnitude of the fluctuations in bedload transport rate, showing that clusters amplify the unsteady nature of bedload transport. A fourth-order autoregressive, autoregressive integrated moving average model is employed to describe the time series of bedload and provide a predictive formula for predicting bedload at different periods. 
Finally, a change-point analysis enhanced with a binary segmentation procedure is performed to identify the abrupt changes in the bedload statistic characteristics due to the effects of clusters and detect the different phases in bedload time series using probability theory. The analysis verifies the experimental findings that three phases are detected in the bedload rate time series structure, namely, sink, neutral, and source. © ASCE / JUNE 2004.
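Binary segmentation for change-point detection can be sketched briefly. The abstract does not give the test statistic used, so the code below assumes the simplest variant: split where the reduction in squared error about the segment mean is largest, then recurse on each side. The threshold `min_gain`, the function names, and the toy series are hypothetical.

```python
def best_split(series):
    """Return (gain, index) of the split that most reduces squared error
    about the mean; index is None if no split gives a positive gain."""
    def sse(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs)
    total = sse(series)
    best = (0.0, None)
    for k in range(1, len(series)):
        gain = total - sse(series[:k]) - sse(series[k:])
        if gain > best[0]:
            best = (gain, k)
    return best

def binary_segmentation(series, min_gain, offset=0):
    """Recursively split `series`; returns change-point indices in order.

    A change point k means a new regime starts at series[k].
    """
    if len(series) < 2:
        return []
    gain, k = best_split(series)
    if k is None or gain < min_gain:
        return []
    return (binary_segmentation(series[:k], min_gain, offset)
            + [offset + k]
            + binary_segmentation(series[k:], min_gain, offset + k))

# Hypothetical bedload-rate series with three regimes (low, high, medium).
toy_bedload = [1, 1, 1, 1, 5, 5, 5, 5, 2, 2, 2, 2]
change_points = binary_segmentation(toy_bedload, min_gain=1.0)
```

Two detected change points divide a series into three segments, which is how a procedure of this type could separate phases such as the sink, neutral, and source phases reported above.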
On the Distribution of Orbital Poles of Milky Way Satellites
NASA Astrophysics Data System (ADS)
Palma, Christopher; Majewski, Steven R.; Johnston, Kathryn V.
2002-01-01
In numerous studies of the outer Galactic halo some evidence for accretion has been found. If the outer halo did form in part or wholly through merger events, we might expect to find coherent streams of stars and globular clusters following orbits similar to those of their parent objects, which are assumed to be present or former Milky Way dwarf satellite galaxies. We present a study of this phenomenon by assessing the likelihood of potential descendant ``dynamical families'' in the outer halo. We conduct two analyses: one that involves a statistical analysis of the spatial distribution of all known Galactic dwarf satellite galaxies (DSGs) and globular clusters, and a second, more specific analysis of those globular clusters and DSGs for which full phase space dynamical data exist. In both cases our methodology is appropriate only to members of descendant dynamical families that retain nearly aligned orbital poles today. Since the Sagittarius dwarf (Sgr) is considered a paradigm for the type of merger/tidal interaction event for which we are searching, we also undertake a case study of the Sgr system and identify several globular clusters that may be members of its extended dynamical family. In our first analysis, the distribution of possible orbital poles for the entire sample of outer (Rgc>8 kpc) halo globular clusters is tested for statistically significant associations among globular clusters and DSGs. Our methodology for identifying possible associations is similar to that used by Lynden-Bell & Lynden-Bell, but we put the associations on a more statistical foundation. Moreover, we study the degree of possible dynamical clustering among various interesting ensembles of globular clusters and satellite galaxies. Among the ensembles studied, we find the globular cluster subpopulation with the highest statistical likelihood of association with one or more of the Galactic DSGs to be the distant, outer halo (Rgc>25 kpc), second-parameter globular clusters. 
The results of our orbital pole analysis are supported by the great circle cell count methodology of Johnston, Hernquist, & Bolte. The space motions of the clusters Pal 4, NGC 6229, NGC 7006, and Pyxis are predicted to be among those most likely to show the clusters to be following stream orbits, since these clusters are responsible for the majority of the statistical significance of the association between outer halo, second-parameter globular clusters and the Milky Way DSGs. In our second analysis, we study the orbits of the 41 globular clusters and six Milky Way-bound DSGs having measured proper motions to look for objects with both coplanar orbits and similar angular momenta. Unfortunately, the majority of globular clusters with measured proper motions are inner halo clusters that are less likely to retain memory of their original orbit. Although four potential globular cluster/DSG associations are found, we believe three of these associations involving inner halo clusters to be coincidental. While the present sample of objects with complete dynamical data is small and does not include many of the globular clusters that are more likely to have been captured by the Milky Way, the methodology we adopt will become increasingly powerful as more proper motions are measured for distant Galactic satellites and globular clusters, and especially as results from the Space Interferometry Mission (SIM) become available.
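The orbital-pole comparison in the second analysis can be illustrated with a minimal sketch. It assumes Galactocentric Cartesian positions and velocities are already available, and it takes the orbital pole to be the direction of the specific angular momentum L = r x v. The function names and the use of a plain angular separation are illustrative assumptions, not the authors' code; the first analysis, which works with great circles of possible poles for objects lacking proper motions, is not covered here.

```python
import math

def orbital_pole(r, v):
    """Unit vector along the specific angular momentum L = r x v.

    r and v are 3-tuples in any consistent Galactocentric Cartesian
    frame (hypothetical inputs; units cancel in the normalisation).
    """
    lx = r[1] * v[2] - r[2] * v[1]
    ly = r[2] * v[0] - r[0] * v[2]
    lz = r[0] * v[1] - r[1] * v[0]
    norm = math.sqrt(lx * lx + ly * ly + lz * lz)
    if norm == 0.0:
        raise ValueError("purely radial orbit: pole is undefined")
    return (lx / norm, ly / norm, lz / norm)

def pole_separation_deg(p, q):
    """Angle in degrees between two unit pole vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))
```

Two objects on nearly coplanar orbits with the same sense of rotation would give a small separation; comparing angular-momentum magnitudes as well would be a separate step.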
Naturally selected hepatitis C virus polymorphisms confer broad neutralizing antibody resistance.
Bailey, Justin R; Wasilewski, Lisa N; Snider, Anna E; El-Diwany, Ramy; Osburn, William O; Keck, Zhenyong; Foung, Steven K H; Ray, Stuart C
2015-01-01
For hepatitis C virus (HCV) and other highly variable viruses, broadly neutralizing mAbs are an important guide for vaccine development. The development of resistance to anti-HCV mAbs is poorly understood, in part due to a lack of neutralization testing against diverse, representative panels of HCV variants. Here, we developed a neutralization panel expressing diverse, naturally occurring HCV envelopes (E1E2s) and used this panel to characterize neutralizing breadth and resistance mechanisms of 18 previously described broadly neutralizing anti-HCV human mAbs. The observed mAb resistance could not be attributed to polymorphisms in E1E2 at known mAb-binding residues. Additionally, hierarchical clustering analysis of neutralization resistance patterns revealed relationships between mAbs that were not predicted by prior epitope mapping, identifying 3 distinct neutralization clusters. Using this clustering analysis and envelope sequence data, we identified polymorphisms in E2 that confer resistance to multiple broadly neutralizing mAbs. These polymorphisms, which are not at mAb contact residues, also conferred resistance to neutralization by plasma from HCV-infected subjects. Together, our method of neutralization clustering with sequence analysis reveals that polymorphisms at noncontact residues may be a major immune evasion mechanism for HCV, facilitating viral persistence and presenting a challenge for HCV vaccine development.
Analysis of correlated mutations in HIV-1 protease using spectral clustering.
Liu, Ying; Eyal, Eran; Bahar, Ivet
2008-05-15
The ability of human immunodeficiency virus-1 (HIV-1) protease to develop mutations that confer multi-drug resistance (MDR) has been a major obstacle in designing rational therapies against HIV. Resistance is usually imparted by a cooperative mechanism that can be elucidated by a covariance analysis of sequence data. Identification of such correlated substitutions of amino acids may be obscured by evolutionary noise. HIV-1 protease sequences from patients subjected to different specific treatments (set 1), and from untreated patients (set 2) were subjected to sequence covariance analysis by evaluating the mutual information (MI) between all residue pairs. Spectral clustering of the resulting covariance matrices disclosed two distinctive clusters of correlated residues: the first, observed in set 1 but absent in set 2, contained residues involved in MDR acquisition; and the second, included those residues differentiated in the various HIV-1 protease subtypes, shortly referred to as the phylogenetic cluster. The MDR cluster occupies sites close to the central symmetry axis of the enzyme, which overlap with the global hinge region identified from coarse-grained normal-mode analysis of the enzyme structure. The phylogenetic cluster, on the other hand, occupies solvent-exposed and highly mobile regions. This study demonstrates (i) the possibility of distinguishing between the correlated substitutions resulting from neutral mutations and those induced by MDR upon appropriate clustering analysis of sequence covariance data and (ii) a connection between global dynamics and functional substitution of amino acids.
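The mutual information between a pair of alignment columns, as used in the covariance analysis above, can be sketched as follows. This is a minimal illustration that assumes each column is given as a sequence of residue letters, one per aligned sequence, and that probabilities are estimated from raw counts. The function name is hypothetical, and any noise corrections the authors may have applied are not reproduced.

```python
import math
from collections import Counter

def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold the residue observed at each position for the
    same ordered set of sequences (assumed input layout).
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), joint in count_ab.items():
        p_joint = joint / n
        # p(a,b) / (p(a) p(b)) written with counts to limit rounding error
        mi += p_joint * math.log2(joint * n / (count_a[a] * count_b[b]))
    return mi
```

Evaluating this for all residue pairs gives a symmetric matrix, which is the kind of input a spectral clustering step would take.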
NASA Astrophysics Data System (ADS)
Effendi, Hefni; Wardiatno, Yusli; Kawaroe, Mujizat; Mursalin; Fauzia Lestari, Dea
2017-01-01
Surface sediments from the western part of the Java Sea were analyzed to evaluate the spatial distribution and ecological risk potential of heavy metals (Hg, As, Cd, Cr, Cu, Pb, Zn and Ni). Samples were taken from surface sediment (<0.5 m) at water depths of 26 m to 80 m with an Ekman grab. The average composition of the sediment samples was clay (9.86%), sand (8.57%) and mud sand (81.57%). The analysis showed that Pb (11.2%), Cd (49.7%), and Ni (59.5%) exceeded the Probable Effect Level (PEL). Based on the ecological risk analysis, Cd ($E_r^i$ = 300.64) and Cr ($E_r^i$ = 0.02) were categorized as high risk and low risk, respectively. The ecological risk potential of the metals ranked Cd>Hg>Pb>Ni>Cu>As>Zn>Cr. Furthermore, multivariate statistical analysis showed positive correlations among heavy metals (As/Ni, Cd/Ni, and Cu/Zn) and between heavy metals and the Risk Index (Cd/Ri and Ni/Ri) at a significance level of p<0.05. Factor analysis yielded 3 factors (eigenvalues >1) that together explained 80.04% of the total variance. In the cluster analysis, Cd, Ni and Pb were identified as having a fairly high contamination level (cluster 1), Hg a moderate contamination level (cluster 2), and Cu, Zn and Cr a lower contamination level (cluster 3).
Lifestyle Patterns and Weight Status in Spanish Adults: The ANIBES Study.
Pérez-Rodrigo, Carmen; Gianzo-Citores, Marta; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier
2017-06-14
Limited knowledge is available on lifestyle patterns in Spanish adults. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, sleep time, and smoking in Spanish adults aged 18-64 years and their association with obesity. Analysis was based on a subsample (n = 1617) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, sleep time, and smoking. Logistic regression analysis was used to explore the association between the cluster solutions and obesity. Factor analysis identified four dietary patterns, "Traditional DP", "Mediterranean DP", "Snack DP" and "Dairy-sweet DP". Dietary patterns, physical activity behaviors, sedentary behaviors, sleep time, and smoking in Spanish adults aggregated into three different clusters of lifestyle patterns: "Mixed diet-physically active-low sedentary lifestyle pattern", "Not poor diet-low physical activity-low sedentary lifestyle pattern" and "Poor diet-low physical activity-sedentary lifestyle pattern". A higher proportion of people aged 18-30 years was classified into the "Poor diet-low physical activity-sedentary lifestyle pattern". The prevalence odds ratio for obesity in men in the "Mixed diet-physically active-low sedentary lifestyle pattern" was significantly lower compared to those in the "Poor diet-low physical activity-sedentary lifestyle pattern". Those behavior patterns are helpful to identify specific issues in population subgroups and inform intervention strategies. The findings in this study underline the importance of designing and implementing interventions that address multiple health risk practices, considering lifestyle patterns and associated determinants.
Classification of cassava genotypes based on qualitative and quantitative data.
Oliveira, E J; Oliveira Filho, O S; Santos, V S
2015-02-02
We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative traits (continuous). We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura; we evaluated these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative trait and joint analysis, respectively. The smaller number of groups identified based on joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects in the phenotype expression; this results in the absence of genetic differences, thereby contributing to greater differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data implied that analysis of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when there are several phenotypic traits, such as in the case of genetic resources and breeding programs.
Cholera epidemic in Guinea-Bissau (2008): the importance of "place".
Luquero, Francisco J; Banga, Cunhate Na; Remartínez, Daniel; Palma, Pedro Pablo; Baron, Emanuel; Grais, Rebeca F
2011-05-04
As resources are limited when responding to cholera outbreaks, knowledge about where to orient interventions is crucial. We describe the cholera epidemic affecting Guinea-Bissau in 2008 focusing on the geographical spread in order to guide prevention and control activities. We conducted two studies: 1) a descriptive analysis of the cholera epidemic in Guinea-Bissau focusing on its geographical spread (country level and within the capital); and 2) a cross-sectional study to measure the prevalence of houses with at least one cholera case in the most affected neighbourhood of the capital (Bairro Bandim) to detect clustering of households with cases (cluster analysis). All cholera cases attending the cholera treatment centres in Guinea-Bissau who fulfilled a modified World Health Organization clinical case definition during the epidemic were included in the descriptive study. For the cluster analysis, a sample of houses was selected from a satellite photo (Google Earth™); 140 houses (and the four closest houses) were assessed from the 2,202 identified structures. We applied K-functions and Kernel smoothing to detect clustering. We confirmed the clustering using Kulldorff's spatial scan statistic. A total of 14,222 cases and 225 deaths were reported in the country (AR = 0.94%, CFR = 1.64%). The more affected regions were Biombo, Bijagos and Bissau (the capital). Bairro Bandim was the most affected neighborhood of the capital (AR = 4.0). We found at least one case in 22.7% of the houses (95%CI: 19.5-26.2) in this neighborhood. The cluster analysis identified two areas within Bairro Bandim at highest risk: a market and an intersection where runoff accumulates waste (p<0.001). Our analysis allowed for the identification of the most affected regions in Guinea-Bissau during the 2008 cholera outbreak, and the most affected areas within the capital. 
This information was essential for making decisions on where to reinforce treatment and to guide control and prevention activities.
Successful ageing: A study of the literature using citation network analysis.
Kusumastuti, Sasmita; Derks, Marloes G M; Tellier, Siri; Di Nucci, Ezio; Lund, Rikke; Mortensen, Erik Lykke; Westendorp, Rudi G J
2016-11-01
Ageing is accompanied by an increased risk of disease and a loss of functioning on several bodily and mental domains and some argue that maintaining health and functioning is essential for a successful old age. Paradoxically, studies have shown that overall wellbeing follows a curvilinear pattern with the lowest point at middle age but increases thereafter up to very old age. To shed further light on this paradox, we reviewed the existing literature on how scholars define successful ageing and how they weigh the contribution of health and functioning to define success. We performed a novel, hypothesis-free and quantitative analysis of citation networks exploring the literature on successful ageing that exists in the Web of Science Core Collection Database using the CitNetExplorer software. Outcomes were visualized using timeline-based citation patterns. The clusters and sub-clusters of citation networks identified were starting points for in-depth qualitative analysis. Within the literature from 1902 through 2015, two distinct citation networks were identified. The first cluster had 1146 publications and 3946 citation links. It focused on successful ageing from the perspective of older persons themselves. Analysis of the various sub-clusters emphasized the importance of coping strategies, psycho-social engagement, and cultural differences. The second cluster had 609 publications and 1682 citation links and viewed successful ageing based on the objective measurements as determined by researchers. Subsequent sub-clustering analysis pointed to different domains of functioning and various ways of assessment. In the current literature two mutually exclusive concepts of successful ageing are circulating that depend on whether the individual himself or an outsider judges the situation. These different points of view help to explain the disability paradox, as successful ageing lies in the eyes of the beholder. Copyright © 2016 The Authors. 
Published by Elsevier Ireland Ltd. All rights reserved.
Cholera Epidemic in Guinea-Bissau (2008): The Importance of “Place”
Luquero, Francisco J.; Banga, Cunhate Na; Remartínez, Daniel; Palma, Pedro Pablo; Baron, Emanuel; Grais, Rebeca F.
2011-01-01
Background As resources are limited when responding to cholera outbreaks, knowledge about where to orient interventions is crucial. We describe the cholera epidemic affecting Guinea-Bissau in 2008 focusing on the geographical spread in order to guide prevention and control activities. Methodology/Principal Findings We conducted two studies: 1) a descriptive analysis of the cholera epidemic in Guinea-Bissau focusing on its geographical spread (country level and within the capital); and 2) a cross-sectional study to measure the prevalence of houses with at least one cholera case in the most affected neighbourhood of the capital (Bairro Bandim) to detect clustering of households with cases (cluster analysis). All cholera cases attending the cholera treatment centres in Guinea-Bissau who fulfilled a modified World Health Organization clinical case definition during the epidemic were included in the descriptive study. For the cluster analysis, a sample of houses was selected from a satellite photo (Google Earth™); 140 houses (and the four closest houses) were assessed from the 2,202 identified structures. We applied K-functions and Kernel smoothing to detect clustering. We confirmed the clustering using Kulldorff's spatial scan statistic. A total of 14,222 cases and 225 deaths were reported in the country (AR = 0.94%, CFR = 1.64%). The more affected regions were Biombo, Bijagos and Bissau (the capital). Bairro Bandim was the most affected neighborhood of the capital (AR = 4.0). We found at least one case in 22.7% of the houses (95%CI: 19.5–26.2) in this neighborhood. The cluster analysis identified two areas within Bairro Bandim at highest risk: a market and an intersection where runoff accumulates waste (p<0.001). Conclusions/Significance Our analysis allowed for the identification of the most affected regions in Guinea-Bissau during the 2008 cholera outbreak, and the most affected areas within the capital. 
This information was essential for making decisions on where to reinforce treatment and to guide control and prevention activities. PMID:21572530
Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.
Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L
2008-08-15
This study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.
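For readers unfamiliar with the k-means step mentioned above, the following is a minimal sketch of the standard Lloyd iteration in plain Python. It assumes samples are given as numeric tuples (for example, expression values over a few genes) and that initial centroids are supplied by the caller so the run is deterministic. The function name and the toy data in the usage note are hypothetical, and the procedure the authors used to choose the number of clusters is not shown.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids).

    points    -- list of equal-length numeric tuples
    centroids -- initial centroids, one per desired cluster (assumed input)
    """
    centroids = [tuple(float(x) for x in c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

With the toy input `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])`, the first two points fall in one cluster and the last two in the other.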
Barczak, Amy K; Avraham, Roi; Singh, Shantanu; Luo, Samantha S; Zhang, Wei Ran; Bray, Mark-Anthony; Hinman, Amelia E; Thompson, Matthew; Nietupski, Raymond M; Golas, Aaron; Montgomery, Paul; Fitzgerald, Michael; Smith, Roger S; White, Dylan W; Tischler, Anna D; Carpenter, Anne E; Hung, Deborah T
2017-05-01
A key to the pathogenic success of Mycobacterium tuberculosis (Mtb), the causative agent of tuberculosis, is the capacity to survive within host macrophages. Although several factors required for this survival have been identified, a comprehensive knowledge of such factors and how they work together to manipulate the host environment to benefit bacterial survival are not well understood. To systematically identify Mtb factors required for intracellular growth, we screened an arrayed, non-redundant Mtb transposon mutant library by high-content imaging to characterize the mutant-macrophage interaction. Based on a combination of imaging features, we identified mutants impaired for intracellular survival. We then characterized the phenotype of infection with each mutant by profiling the induced macrophage cytokine response. Taking a systems-level approach to understanding the biology of identified mutants, we performed a multiparametric analysis combining pathogen and host phenotypes to predict functional relationships between mutants based on clustering. Strikingly, mutants defective in two well-known virulence factors, the ESX-1 protein secretion system and the virulence lipid phthiocerol dimycocerosate (PDIM), clustered together. Building upon the shared phenotype of loss of the macrophage type I interferon (IFN) response to infection, we found that PDIM production and export are required for coordinated secretion of ESX-1-substrates, for phagosomal permeabilization, and for downstream induction of the type I IFN response. Multiparametric clustering also identified two novel genes that are required for PDIM production and induction of the type I IFN response. Thus, multiparametric analysis combining host and pathogen infection phenotypes can be used to identify novel functional relationships between genes that play a role in infection.
A novel symptom cluster analysis among ambulatory HIV/AIDS patients in Uganda.
Namisango, Eve; Harding, Richard; Katabira, Elly T; Siegert, Richard J; Powell, Richard A; Atuhaire, Leonard; Moens, Katrien; Taylor, Steve
2015-01-01
Symptom clusters are gaining importance given HIV/AIDS patients experience multiple, concurrent symptoms. This study aimed to: determine clusters of patients with similar symptom combinations; describe symptom combinations distinguishing the clusters; and evaluate the clusters regarding patient socio-demographic, disease and treatment characteristics, quality of life (QOL) and functional performance. This was a cross-sectional study of 302 adult HIV/AIDS outpatients consecutively recruited at two teaching and referral hospitals in Uganda. Socio-demographic and seven-day period symptom prevalence and distress data were self-reported using the Memorial Symptom Assessment Schedule. QOL was assessed using the Medical Outcome Scale and functional performance using the Karnofsky Performance Scale. Symptom clusters were established using hierarchical cluster analysis with squared Euclidean distances using Ward's clustering methods based on symptom occurrence. Analysis of variance compared clusters on mean QOL and functional performance scores. Patient subgroups were categorised based on symptom occurrence rates. Five symptom occurrence clusters were identified: Cluster 1 (n=107), high-low for sensory discomfort and eating difficulties symptoms; Cluster 2 (n=47), high-low for psycho-gastrointestinal symptoms; Cluster 3 (n=71), high for pain and sensory disturbance symptoms; Cluster 4 (n=35), all high for general HIV/AIDS symptoms; and Cluster 5 (n=48), all low for mood-cognitive symptoms. The all high occurrence cluster was associated with worst functional status, poorest QOL scores and highest symptom-associated distress. Use of antiretroviral therapy was associated with all high symptom occurrence rate (Fisher's exact=4, P<0.001). CD4 count group below 200 was associated with the all high occurrence rate symptom cluster (Fisher's exact=41, P<0.001). 
Symptom clusters have a differential effect on HIV/AIDS patients' self-reported outcomes, with the subgroup experiencing high symptom occurrence rates having a higher risk of poorer outcomes. Identification of symptom clusters could provide insights into commonly co-occurring symptoms that should be jointly targeted for management in patients with multiple complaints.
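The hierarchical clustering described above (Ward's method on squared Euclidean distances between symptom-occurrence profiles) can be sketched in a few lines of plain Python. The sketch assumes each patient is a row of 0/1 symptom indicators and merges, at every step, the pair of clusters whose fusion gives the smallest increase in within-cluster sum of squares, n_a n_b / (n_a + n_b) times the squared distance between centroids. The function name and toy data are hypothetical; this is not the statistical package the authors used.

```python
def ward_clusters(rows, n_clusters):
    """Agglomerative clustering with Ward's criterion.

    rows       -- list of equal-length numeric rows (e.g. 0/1 symptom flags)
    n_clusters -- number of clusters at which to stop merging
    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    if n_clusters < 1 or n_clusters > len(rows):
        raise ValueError("n_clusters must be between 1 and len(rows)")
    # Each cluster is (member indices, centroid).
    clusters = [([i], [float(x) for x in row]) for i, row in enumerate(rows)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                sq = sum((a - b) ** 2 for a, b in zip(ci, cj))
                # Ward merge cost: increase in within-cluster sum of squares.
                cost = len(mi) * len(mj) / (len(mi) + len(mj)) * sq
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        n = len(mi) + len(mj)
        merged = (mi + mj, [(len(mi) * a + len(mj) * b) / n for a, b in zip(ci, cj)])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)
```

For four toy patients with profiles `[1,1,0,0]`, `[1,1,0,1]`, `[0,0,1,1]`, `[0,0,1,0]` and `n_clusters=2`, the first two and the last two are grouped together. The pairwise search is quadratic per merge, which is adequate for a sample of a few hundred patients but not for large datasets.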
Guo, Bing; Greenwood, Paul L; Cafe, Linda M; Zhou, Guanghong; Zhang, Wangang; Dalrymple, Brian P
2015-03-13
This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types. The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for "cell cycle" and "ECM (extracellular matrix) organization" Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the "cell cycle" and "ECM" signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified. 
Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.
Messier 35 (NGC 2168) DANCe. I. Membership, proper motions, and multiwavelength photometry
NASA Astrophysics Data System (ADS)
Bouy, H.; Bertin, E.; Barrado, D.; Sarro, L. M.; Olivares, J.; Moraux, E.; Bouvier, J.; Cuillandre, J.-C.; Ribas, Á.; Beletsky, Y.
2015-03-01
Context. Messier 35 (NGC 2168) is an important young nearby cluster. Its age, richness and relative proximity make it an ideal target for stellar evolution studies. The Kepler K2 mission recently observed it and provided a high accuracy photometric time series of a large number of sources in this area of the sky. Identifying the cluster's members is therefore of high importance to optimize the interpretation and analysis of the Kepler K2 data. Aims: We aim to identify the cluster's members by deriving membership probabilities for the sources within 1° of the cluster's center, which is farther away than equivalent previous studies. Methods: We measure accurate proper motions and multiwavelength (optical and near-infrared) photometry using ground-based archival images of the cluster. We use these measurements to compute membership probabilities. The list of candidate members from the literature is used as a training set to identify the cluster's locus in a multidimensional space made of proper motions, luminosities, and colors. Results: The final catalog includes 338 892 sources with multiwavelength photometry. Approximately half (194 452) were detected at more than two epochs and we measured their proper motion and used it to derive membership probability. A total of 4349 candidate members with membership probabilities greater than 50% are found in this sample in the luminosity range between 10 mag and 22 mag. The slow proper motion of the cluster and the overlap of its sequence with the field and background sequences in almost all color-magnitude and color-color diagrams complicate the analysis and the contamination level is expected to be significant. Our study, nevertheless, provides a coherent and quantitative membership analysis of Messier 35 based on a large fraction of the best ground-based data sets obtained over the past 18 years. As such, it represents a valuable input for follow-up studies using, in particular, the Kepler K2 photometric time series. 
Table 3 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/575/A120
STAR CLUSTERS IN A NUCLEAR STAR FORMING RING: THE DISAPPEARING STRING OF PEARLS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Väisänen, Petri; Barway, Sudhanshu; Randriamanakoto, Zara, E-mail: petri@saao.ac.za
2014-12-20
An analysis of the star cluster population in a low-luminosity early-type galaxy, NGC 2328, is presented. The clusters are found in a tight star forming nuclear spiral/ring pattern and we also identify a bar from structural two-dimensional decomposition. These massive clusters are forming very efficiently in the circumnuclear environment and they are young, possibly all less than 30 Myr of age. The clusters indicate an azimuthal age gradient, consistent with a "pearls-on-a-string" formation scenario, suggesting bar-driven gas inflow. The cluster mass function has a robust down turn at low masses at all age bins. Assuming clusters are born with a power-law distribution, this indicates extremely rapid disruption at timescales of just several million years. If found to be typical, it means that clusters born in dense circumnuclear rings do not survive to become old globular clusters in non-interacting systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Ling; Harley, Robert A.; Brown, Nancy J.
Cluster analysis was applied to daily 8 h ozone maxima modeled for a summer season to characterize meteorology-induced variations in the spatial distribution of ozone. Principal component analysis was employed to form a reduced-dimension set to describe and interpret ozone spatial patterns. The first three principal components (PCs) capture approximately 85% of total variance, with PC1 describing a general spatial trend, and PC2 and PC3 each describing a spatial contrast. Six clusters were identified for California's San Joaquin Valley (SJV) with two low, three moderate, and one high-ozone cluster. The moderate ozone clusters are distinguished by elevated ozone levels in different parts of the valley: northern, western, and eastern, respectively. The SJV ozone clusters have stronger coupling with the San Francisco Bay area (SFB) than with the Sacramento Valley (SV). Variations in ozone spatial distributions induced by anthropogenic emission changes are small relative to the overall variations in ozone anomalies observed for the whole summer. Ozone regimes identified here are mostly determined by the direct and indirect meteorological effects. Existing measurement sites are sufficiently representative to capture ozone spatial patterns in the SFB and SV, but the western side of the SJV is under-sampled.
Socioeconomic Status (SES) and Childhood Acute Myeloid Leukemia (AML) Mortality
Knoble, Naomi B.; Alderfer, Melissa A.; Hossain, Md Jobayer
2016-01-01
Socioeconomic status (SES) is a complex construct of multiple indicators, known to impact cancer outcomes, but has not been adequately examined among pediatric AML patients. This study aimed to identify the patterns of co-occurrence of multiple community-level SES indicators and to explore associations between various patterns of these indicators and pediatric AML mortality risk. A nationally representative US sample of 3,651 pediatric AML patients, aged 0–19 years at diagnosis was drawn from 17 Surveillance, Epidemiology, and End Results (SEER) database registries created between 1973 and 2012. Factor analysis, cluster analysis, stratified univariable and multivariable Cox proportional hazards models were used. Four SES factors accounting for 87% of the variance in SES indicators were identified: F1) economic/educational disadvantage, less immigration; F2) immigration-related features (foreign-born, language-isolation, crowding), less mobility; F3) housing instability; and, F4) absence of moving. F1 and F3 showed elevated risk of mortality, adjusted hazards ratios (aHR) (95% CI): 1.07(1.02–1.12) and 1.05(1.00–1.10), respectively. Seven SES-defined cluster groups were identified. Cluster 1: (low economic/educational disadvantage, few immigration-related features, and residential-stability) showed the minimum risk of mortality. Compared to Cluster 1, Cluster 3: (high economic/educational disadvantage, high-mobility) and Cluster 6: (moderately-high economic/educational disadvantages, housing-instability and immigration-related features) exhibited substantially greater risk of mortality, aHR(95% CI) = 1.19(1.0–1.4) and 1.23 (1.1–1.5), respectively. Factors of correlated SES-indicators and their pattern-based groups demonstrated differential risks of pediatric AML mortality, indicating the need for special public-health attention in areas with economic-educational disadvantages, housing-instability and immigration-related features. PMID:27543948
Xia, Shang; Xue, Jing-Bo; Zhang, Xia; Hu, He-Hua; Abe, Eniola Michael; Rollinson, David; Bergquist, Robert; Zhou, Yibiao; Li, Shi-Zhu; Zhou, Xiao-Nong
2017-04-26
The prevalence of schistosomiasis remains a key public health issue in China. Jiangling County in Hubei Province is a typical lake and marshland endemic area. Pattern analysis of schistosomiasis prevalence in Jiangling County is important for promoting schistosomiasis surveillance and control in similar endemic areas. The dataset was constructed from the annual schistosomiasis surveillance as well as the socio-economic data in Jiangling County covering the years from 2009 to 2013. A village clustering method modified from the K-means algorithm was used to identify different types of endemic villages. For these identified village clusters, a matrix-based predictive model was developed using a one-step backward temporal correlation inference algorithm, aiming to estimate the predictive correlations of schistosomiasis prevalence among different years. Field sampling of faeces from domestic animals, as an indicator of potential schistosomiasis prevalence, was carried out and the results were used to validate the results of the proposed models and methods. The prevalence of schistosomiasis in Jiangling County declined year by year. The total of 198 endemic villages in Jiangling County can be divided into four clusters with reference to the 5 years' occurrences of schistosomiasis in human, cattle and snail populations. For each identified village cluster, a predictive matrix was generated to characterize the relationships of schistosomiasis prevalence with the historic infection level as well as their associated impact factors. Furthermore, the results of sampling faeces from the front field agreed with the results of the identified clusters of endemic villages. The results of village clusters and the predictive matrix can be regarded as the basis for conducting targeted measures for schistosomiasis surveillance and control. 
Furthermore, the proposed models and methods can be modified to investigate the schistosomiasis prevalence in other regions as well as be used for investigating other parasitic diseases.
ERIC Educational Resources Information Center
Spreen, Otfried; Haaf, Robert G.
1986-01-01
Test scores of two groups of learning disabled children (N=63 and N=96) were submitted to cluster analysis in an attempt to replicate previously described subtypes. All three subtypes (visuo-perceptual, linguistic, and articulo-graphomotor types) were identified along with minimally and severely impaired subtypes. Similar clusters in the same…
Rosychuk, Rhonda J; Johnson, David W; Urichuk, Liana; Dong, Kathryn; Newton, Amanda S
2016-07-11
Clustering of adolescent self-harming behaviours in the context of health care utilization has not been studied. We identified geographic areas with higher numbers of adolescents who (1) presented to an emergency department (ED) for self-harm, and (2) were without a physician follow-up visit for mental health within 14 days post-ED visit. We extracted a population-based cohort of adolescents aged 15-17 years (n = 3,927) with ED visits during 2002-2011 in Alberta, Canada. We defined the case as an individual with one or more ED presentations for self-harm in the fiscal year of the analysis. Crude case rates were calculated and clusters were identified using a spatial scan. The rates decreased over time for ED visits for self-harm (differences: girls -199.6/100,000; p < 0.01; boys -58.8/100,000; p < 0.01), and for adolescents without a follow-up visit within 14 days following an ED visit for self-harm (differences: girls -108.3/100,000; p < 0.01; boys -61.9/100,000; p < 0.01). Two space-time clusters were identified: (1) a North zone cluster during 2002-2006 (p < 0.01) and (2) a South zone cluster during 2003-2007 (p < 0.01). These clusters had higher numbers of adolescents who presented to the ED for self-harm (relative risks [RRs]: 1.58 for cluster 1, 3.54 for cluster 2) and were without a 14-day physician follow-up (RRs: 1.78 for cluster 1, 4.17 for cluster 2). In 2010/2011, clusters in the North, Edmonton, and Central zones were identified for adolescents with and without a follow-up visit within 14 days following an ED visit for self-harm (p < 0.01). The rates for ED visits for adolescents who self-harm and rates of adolescents without a 14-day physician follow-up visit following emergency care for self-harm decreased during the study period. The space-time clusters identified the areas and years where visits to the ED by adolescents for self-harm were statistically higher than expected. 
These clusters can be used to identify locations where adolescents are potentially not receiving follow-up and the mental health support needed after emergency-based care. The 2010/2011 geographic cluster suggests that the northern part of the province still has elevated numbers of adolescents visiting the ED for self-harm. Prospective research is needed to determine outcomes associated with adolescents who receive physician follow-up following ED-based care for self-harm compared to those who do not.
NASA Technical Reports Server (NTRS)
Goldenberg, Stanley B.; Houze, Robert A., Jr.; Churchill, Dean D.
1990-01-01
The horizontal precipitation structure of cloud clusters observed over the South China Sea during the Winter Monsoon Experiment (WMONEX) is analyzed using a convective-stratiform technique (CST) developed by Adler and Negri (1988). The technique was modified by altering the method for identifying convective cells in the satellite data, accounting for the extremely cold cloud tops characteristic of the WMONEX region, and modifying the threshold infrared temperature for the boundary of the stratiform rain area. The precipitation analysis was extended to the entire history of the cloud cluster by applying the modified CST to IR imagery from geosynchronous-satellite observations. The ship and aircraft data from the later period of the cluster's lifetime make it possible to check the locations of convective and stratiform precipitation identified by the CST using in situ observations. The extended CST is considered to be effective for determining the climatology of the convective-stratiform structure of tropical cloud clusters.
Effects of age and body mass index on breast characteristics: A cluster analysis.
Coltman, Celeste E; Steele, Julie R; McGhee, Deirdre E
2018-05-24
Limited research has quantified variation in the characteristics of the breasts among women and determined how these breast characteristics are influenced by age and body mass. The aim of this study was to classify the breasts of women in the community into different categories based on comprehensive and objective measurements of the characteristics of their breasts and torsos, and to determine the effect of age and body mass index (BMI) on the prevalence of these breast categories. Four breast characteristic clusters were identified (X-Large, Very-ptotic & Splayed; Large, Ptotic & Splayed; Medium & Mildly-ptotic; and Small & Non-ptotic), with age and BMI shown to significantly affect the breast characteristic clusters. These results highlight the difference in breast characteristics exhibited among women and how these clusters are affected by age and BMI. The breast characteristic clusters identified in this study could be used as a basis for future bra designs and sizing systems in order to improve bra fit for women.
Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.
Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A
2011-04-08
To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
Xue, Ling; Scoglio, Caterina; McVey, D Scott; Boone, Rebecca; Cohnstaedt, Lee W
2015-09-01
Lyme disease has become the most prevalent vector-borne disease in the United States and results in morbidity in humans, especially children. We used historical case distributions to explain vector-borne disease introductions and subsequent geographic expansion in the absence of disease vector data. We used geographic information system analysis of publicly available Connecticut Department of Public Health case data from 1984, 1985, and 1991 to 2012 for the 169 towns in Connecticut to identify the yearly clusters of Lyme disease cases. Our analysis identified the spatial and temporal origins of two separate introductions of Lyme disease into Connecticut and identified the subsequent direction and rate of spread. We defined both epidemic clusters of cases using significant long-term spatial autocorrelation. The incidence-weighted geographic mean analysis indicates a northern trend of geographic expansion for both epidemic clusters. In eastern Connecticut, as the epidemic progressed, the yearly shift in the geographic mean (rate of epidemic expansion) decreased each year until spatial equilibrium was reached in 2007. The equilibrium indicates a transition from epidemic Lyme disease spread to stable endemic transmission, and we associate this with a reduction in incidence. In western Connecticut, the parabolic distribution of the yearly geographic mean indicates that following the establishment of Lyme disease (1988) the epidemic quickly expanded northward and established equilibrium in 2009.
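The incidence-weighted geographic mean described in the abstract above can be illustrated with a short sketch. This is not the authors' GIS workflow; it assumes planar town coordinates (for example kilometres east and north), and the function names and the three-town data set are hypothetical. The yearly shift is taken here as the straight-line distance between consecutive yearly means.

```python
def weighted_geographic_mean(towns):
    """Return the (x, y) centroid of town coordinates weighted by incidence.

    `towns` is a list of (x, y, incidence) tuples; coordinates are assumed
    to be planar (projected), not raw latitude/longitude.
    """
    total = sum(inc for _, _, inc in towns)
    if total == 0:
        raise ValueError("total incidence is zero; the mean is undefined")
    mean_x = sum(x * inc for x, _, inc in towns) / total
    mean_y = sum(y * inc for _, y, inc in towns) / total
    return mean_x, mean_y


def yearly_shift(mean_a, mean_b):
    """Euclidean distance between two yearly geographic means."""
    dx = mean_b[0] - mean_a[0]
    dy = mean_b[1] - mean_a[1]
    return (dx ** 2 + dy ** 2) ** 0.5


# Hypothetical data: (x_km, y_km, incidence) for three towns in two years.
year_1 = [(0.0, 0.0, 10.0), (0.0, 10.0, 10.0), (10.0, 0.0, 0.0)]
year_2 = [(0.0, 0.0, 5.0), (0.0, 10.0, 15.0), (10.0, 0.0, 0.0)]

m1 = weighted_geographic_mean(year_1)
m2 = weighted_geographic_mean(year_2)
shift = yearly_shift(m1, m2)  # northward movement of the weighted centre
```

In this toy case the weighted centre moves north as incidence concentrates in the northern town; a shift shrinking toward zero over successive years would correspond to the spatial equilibrium the abstract describes.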
Factors influencing the quality of life of haemodialysis patients according to symptom cluster.
Shim, Hye Yeung; Cho, Mi-Kyoung
2018-05-01
To identify the characteristics in each symptom cluster and factors influencing the quality of life of haemodialysis patients in Korea according to cluster. Despite developments in renal replacement therapy, haemodialysis still restricts the activities of daily living due to pain and impairs physical functioning induced by the disease and its complications. Descriptive survey. Two hundred and thirty dialysis patients aged >18 years. They completed self-administered questionnaires of Dialysis Symptom Index and Kidney Disease Quality of Life instrument-Short Form 1.3. To determine the optimal number of clusters, the collected data were analysed using polytomous variable latent class analysis in R software (poLCA) to estimate the latent class models and the latent class regression models for polytomous outcome variables. Differences in characteristics, symptoms and QOL according to the symptom cluster of haemodialysis patients were analysed using the independent t test and chi-square test. The factors influencing the QOL according to symptom cluster were identified using hierarchical multiple regression analysis. Physical and emotional symptoms were significantly more severe, and the QOL was significantly worse in Cluster 1 than in Cluster 2. The factors influencing the QOL were spouse, job, insurance type and physical and emotional symptoms in Cluster 1, with these variables having an explanatory power of 60.9%. Physical and emotional symptoms were the only influencing factors in Cluster 2, and they had an explanatory power of 37.4%. Mitigating the symptoms experienced by haemodialysis patients and improving their QOL require educational and therapeutic symptom management interventions that are tailored according to the characteristics and symptoms in each cluster. 
The findings of this study are expected to lead to practical guidelines for addressing the symptoms experienced by haemodialysis patients, and they provide basic information for developing nursing interventions to manage these symptoms and improve the QOL of these patients. © 2017 John Wiley & Sons Ltd.
Toyomaki, Atsuhito; Koga, Minori; Okada, Emiko; Nakai, Yukiei; Miyazaki, Akane; Tamakoshi, Akiko; Kiso, Yoshinobu; Kusumi, Ichiro
2017-01-01
Several studies indicate that dietary habits are associated with mental health. We are interested in identifying not a specific single nutrient/food group but the population preferring specific food combinations that can be related to mental health. Very few studies have examined relationships between dietary patterns and multifaceted mental states using cluster analysis. The purpose of this study was to investigate population-level dietary patterns associated with mental state using cluster analysis. We focused on depressive state, sleep quality, subjective well-being, and impulsive behaviors using rating scales. Two hundred and seventy-nine Japanese middle-aged people participated in the present study. Dietary pattern was estimated using a brief self-administered diet-history questionnaire (the BDHQ). We conducted K-means cluster analysis using thirteen BDHQ food groups: milk, meat, fish, egg, pulses, potatoes, green and yellow vegetables, other vegetables, mushrooms, seaweed, sweets, fruits, and grain. We identified three clusters characterized as “vegetable and fruit dominant,” “grain dominant,” and “low grain tendency” subgroups. The vegetable and fruit dominant group showed increases in several aspects of subjective well-being demonstrated by the SF-8. Differences in mean subject characteristics across clusters were tested using ANOVA. The low frequency intake of grain group showed higher impulsive behavior, demonstrated by BIS-11 deliberation and sum scores. The present study demonstrated that traditional Japanese dietary patterns, such as eating rice, can help with beneficial changes in mental health. PMID:28704469
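As a rough illustration of the K-means step mentioned in the abstract above, the following sketch implements Lloyd's algorithm in plain Python. It is not the study's analysis: the two-feature intake profiles are hypothetical (the study used thirteen BDHQ food groups and three clusters), and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical intake profiles: (vegetable score, grain score) per participant.
profiles = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.8, 1.2), (9.2, 8.8)]
centroids, labels = kmeans(profiles, k=2)
```

In practice, intake variables would usually be standardised before clustering so that no single food group dominates the distance, and several random initialisations would be compared.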
Deschamps, Kevin; Matricali, Giovanni Arnoldo; Roosen, Philip; Desloovere, Kaat; Bruyninckx, Herman; Spaepen, Pieter; Nobels, Frank; Tits, Jos; Flour, Mieke; Staes, Filip
2013-01-01
Background The aim of this study was to identify groups of subjects with similar patterns of forefoot loading and verify if specific groups of patients with diabetes could be isolated from non-diabetics. Methodology/Principal Findings Ninety-seven patients with diabetes and 33 control participants between 45 and 70 years were prospectively recruited in two Belgian Diabetic Foot Clinics. Barefoot plantar pressure measurements were recorded and subsequently analysed using a semi-automatic total mapping technique. K-means cluster analysis was applied on relative regional impulses of six forefoot segments in order to pursue a classification for the control group separately, the diabetic group separately and both groups together. Cluster analysis led to identification of three distinct groups when considering only the control group. For the diabetic group, and the computation considering both groups together, four distinct groups were isolated. Compared to the cluster analysis of the control group an additional forefoot loading pattern was identified. This group comprised diabetic feet only. The relevance of the reported clusters was supported by ANOVA statistics indicating significant differences between different regions of interest and different clusters. Conclusions/Significance A new era seems to be emerging in diabetic foot medicine, one that embraces the classification of diabetic patients according to their biomechanical profile. Classification of the plantar pressure distribution has the potential to provide a means to determine mechanical interventions for the prevention and/or treatment of the diabetic foot. PMID:24278219
Wan, B; Yarbrough, J W; Schultz, T W
2008-01-01
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. One-methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Jadhav, Rohit R; Ye, Zhenqing; Huang, Rui-Lan; Liu, Joseph; Hsu, Pei-Yin; Huang, Yi-Wen; Rangel, Leticia B; Lai, Hung-Cheng; Roa, Juan Carlos; Kirma, Nameer B; Huang, Tim Hui-Ming; Jin, Victor X
2015-01-01
Recent genome-wide analysis has shown that DNA methylation spans long stretches of chromosome regions consisting of clusters of contiguous CpG islands or gene families. Hypermethylation of various gene clusters has been reported in many types of cancer. In this study, we conducted methyl-binding domain capture (MBDCap) sequencing (MBD-seq) analysis on a breast cancer cohort consisting of 77 patients and 10 normal controls, as well as a panel of 38 breast cancer cell lines. Bioinformatics analysis determined seven gene clusters with a significant difference in overall survival (OS) and further revealed a distinct feature that the conservation of a large gene cluster (approximately 70 kb) metallothionein-1 (MT1) among 45 species is much lower than the average of all RefSeq genes. Furthermore, we found that DNA methylation is an important epigenetic regulator contributing to gene repression of MT1 gene cluster in both ERα positive (ERα+) and ERα negative (ERα-) breast tumors. In silico analysis revealed much lower gene expression of this cluster in The Cancer Genome Atlas (TCGA) cohort for ERα + tumors. To further investigate the role of estrogen, we conducted 17β-estradiol (E2) and demethylating agent 5-aza-2'-deoxycytidine (DAC) treatment in various breast cancer cell types. Cell proliferation and invasion assays suggested MT1F and MT1M may play an anti-oncogenic role in breast cancer. Our data suggests that DNA methylation in large contiguous gene clusters can be potential prognostic markers of breast cancer. Further investigation of these clusters revealed that estrogen mediates epigenetic repression of MT1 cluster in ERα + breast cancer cell lines. In all, our studies identify thousands of breast tumor hypermethylated regions for the first time, in particular, discovering seven large contiguous hypermethylated gene clusters.
MANPRINT Methods Monograph: Aiding the Development of Manpower-Based System Evaluation
1989-06-01
zone below tree level where threats are known to be (the actual number of threats may vary). Weather conditions are VFR. The helicopter pops up to...12.0 Replace 13.3 Bearing, connecting Inspect 6.2 Replace 6.2 0105 Camshaft Inspect 7.2 Replace 7.2 Cover, cylinder head Inspect .2 (valve cover...matrix to analyze the data and identify task clusters. . Outputs and Use of Cluster Analysis 1. Hierarchical cluster tree (taxonomy) of system tasks will
Freitas-Vilela, Ana Amélia; Smith, Andrew D A C; Kac, Gilberto; Pearson, Rebecca M; Heron, Jon; Emond, Alan; Hibbeln, Joseph R; Castro, Maria Beatriz Trindade; Emmett, Pauline M
2017-04-01
Little is known about how dietary patterns of mothers and their children track over time. The objectives of this study are to obtain dietary patterns in pregnancy using cluster analysis, to examine women's mean nutrient intakes in each cluster and to compare the dietary patterns of mothers to those of their children. Pregnant women (n = 12 195) from the Avon Longitudinal Study of Parents and Children reported their frequency of consumption of 47 foods and food groups. These data were used to obtain dietary patterns during pregnancy by cluster analysis. The absolute and energy-adjusted nutrient intakes were compared between clusters. Women's dietary patterns were compared with previously derived clusters of their children at 7 years of age. Multinomial logistic regression was performed to evaluate relationships comparing maternal and offspring clusters. Three maternal clusters were identified: 'fruit and vegetables', 'meat and potatoes' and 'white bread and coffee'. After energy adjustment, women in the 'fruit and vegetables' cluster had the highest mean nutrient intakes. Mothers in the 'fruit and vegetables' cluster were more likely than mothers in 'meat and potatoes' (adjusted odds ratio [OR]: 2.00; 95% Confidence Interval [CI]: 1.69-2.36) or 'white bread and coffee' (OR: 2.18; 95% CI: 1.87-2.53) clusters to have children in a 'plant-based' cluster. However, the majority of children were in clusters unrelated to their mother's dietary pattern. Three distinct dietary patterns were obtained in pregnancy, the 'fruit and vegetables' pattern being the most nutrient dense. Mothers' dietary patterns were associated with but did not dominate offspring dietary patterns. © 2016 The Authors. Maternal & Child Nutrition published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Haghighi, Babak; Choi, Jiwoong; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long
2017-11-01
Accurate modeling of small airway diameters in patients with chronic obstructive pulmonary disease (COPD) is a crucial step toward patient-specific CFD simulations of regional airflow and particle transport. We proposed to use computed tomography (CT) imaging-based cluster membership to identify structural characteristics of airways in each cluster and use them to develop cluster-specific airway diameter models. We analyzed 284 COPD smokers with airflow limitation, and 69 healthy controls. We used multiscale imaging-based cluster analysis (MICA) to classify smokers into 4 clusters. With representative cluster patients and healthy controls, we performed multiple regressions to quantify variation of airway diameters by generation as well as by cluster. Clusters 2 and 4 showed a larger decrease in diameter with increasing generation than the other clusters. Cluster 4 had more rapid decreases of airway diameters in the upper lobes, while cluster 2 had more rapid decreases in the lower lobes. We then used these regression models to estimate airway diameters in CT unresolved regions to obtain pressure-volume hysteresis curves using a 1D resistance model. These 1D flow solutions can be used to provide the patient-specific boundary conditions for 3D CFD simulations in COPD patients. Support for this study was provided, in part, by NIH Grants U01-HL114494, R01-HL112986 and S10-RR022421.
Fuzzy cluster analysis of high-field functional MRI data.
Windischberger, Christian; Barth, Markus; Lamm, Claus; Schroeder, Lee; Bauer, Herbert; Gur, Ruben C; Moser, Ewald
2003-11-01
Functional magnetic resonance imaging (fMRI) based on blood-oxygen level dependent (BOLD) contrast is today an established brain research method and is quickly gaining acceptance for complementary clinical diagnosis. However, neither are the basic mechanisms, like coupling between neuronal activation and haemodynamic response, known exactly, nor can the various artifacts be predicted or controlled. Thus, modeling functional signal changes is non-trivial and exploratory data analysis (EDA) may be rather useful. In particular, identification and separation of artifacts as well as quantification of expected, i.e. stimulus correlated, and novel information on brain activity is important for both new insights in neuroscience and future developments in functional MRI of the human brain. After an introduction on fuzzy clustering and very high-field fMRI we present several examples where fuzzy cluster analysis (FCA) of fMRI time series helps to identify and locally separate various artifacts. We also present and discuss applications and limitations of fuzzy cluster analysis in very high-field functional MRI: differentiating temporal patterns in MRI using (a) a test object with static and dynamic parts, (b) artifacts due to gross head motion. Using a synthetic fMRI data set we quantitatively examine the influences of relevant FCA parameters on clustering results in terms of receiver-operator characteristics (ROC) and compare them with a commonly used model-based correlation analysis (CA) approach. The application of FCA in analyzing in vivo fMRI data is shown for (a) a motor paradigm, (b) data from multi-echo imaging, and (c) an fMRI study using mental rotation of three-dimensional cubes. We found that differentiation of true "neural" from false "vascular" activation is possible based on echo time dependence and specific activation levels, as well as based on their signal time-course. 
Exploratory data analysis methods in general and fuzzy cluster analysis in particular may help to identify artifacts and add novel and unexpected information valuable for interpretation, classification and characterization of functional MRI data which can be used to design new data acquisition schemes, stimulus presentations, neuro(physio)logical paradigms, as well as to improve quantitative biophysical models.
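To make the idea of fuzzy (rather than hard) cluster assignment concrete, the sketch below computes the standard fuzzy c-means membership of a single observation in each cluster, u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)). It is an illustration only: the abstract does not state which FCA variant, distance measure or fuzziness exponent was used, so Euclidean distance, m = 2 and the toy "time course" values are all assumptions.

```python
def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one observation in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the Euclidean
    distance to centroid i and m > 1 is the fuzziness exponent.
    """
    if m <= 1.0:
        raise ValueError("the fuzziness exponent m must be greater than 1")
    dists = [
        sum((p - c) ** 2 for p, c in zip(point, centroid)) ** 0.5
        for centroid in centroids
    ]
    # An observation lying exactly on a centroid belongs fully to that cluster
    # (shared equally if several centroids coincide with it).
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical two-sample "time course" and two cluster centroids.
voxel = [1.0, 0.0]
cluster_centroids = [[0.0, 0.0], [4.0, 0.0]]
memberships = fuzzy_memberships(voxel, cluster_centroids)
```

The memberships sum to one, so a time course that partly resembles two patterns is shared between them instead of being forced into a single cluster.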
Nurses' beliefs about nursing diagnosis: A study with cluster analysis.
D'Agostino, Fabio; Pancani, Luca; Romero-Sánchez, José Manuel; Lumillo-Gutierrez, Iris; Paloma-Castro, Olga; Vellone, Ercole; Alvaro, Rosaria
2018-06-01
To identify clusters of nurses in relation to their beliefs about nursing diagnosis among two populations (Italian and Spanish); to investigate differences among clusters of nurses in each population considering the nurses' socio-demographic data, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and actual behaviours in making nursing diagnosis. Nurses' beliefs concerning nursing diagnosis can influence its use in practice but this is still unclear. A cross-sectional design. A convenience sample of nurses in Italy and Spain was enrolled. Data were collected between 2014-2015 using tools, that is, a socio-demographic questionnaire and behavioural, normative and control beliefs, attitudes, intentions and behaviours scales. The sample included 499 nurses (272 Italians & 227 Spanish). Of these, 66.5% of the Italian and 90.7% of the Spanish sample were female. The mean age was 36.5 and 45.2 years old in the Italian and Spanish sample respectively. Six clusters of nurses were identified in Spain and four in Italy. Three clusters were similar among the two populations. Similar significant associations between age, years of work, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and behaviours in making nursing diagnosis and cluster membership in each population were identified. Belief profiles identified unique subsets of nurses that have distinct characteristics. Categorizing nurses by belief patterns may help administrators and educators to tailor interventions aimed at improving nursing diagnosis use in practice. © 2018 John Wiley & Sons Ltd.
Identification of stable QTLs causing chalk in rice grains in nine environments.
Zhao, Xiangqian; Daygon, Venea D; McNally, Kenneth L; Hamilton, Ruaraidh Sackville; Xie, Fangming; Reinke, Russell F; Fitzgerald, Melissa A
2016-01-01
A novel QTL cluster for chalkiness on Chr04 was identified using single environment analysis and joint mapping across 9 environments in Asia and South America. QTL NILs showed that each had a significant effect on chalk. Chalk in rice grains leads to a significant loss in the proportion of marketable grains in a harvested crop, leading to a significant financial loss to rice farmers and traders. To identify the genetic basis of chalkiness, two sets of recombinant inbred lines (RILs) derived from reciprocal crosses between Lemont and Teqing were used to find stable QTLs for chalkiness. The RILs were grown in seven locations in Asia and Latin America and in two controlled environments in phytotrons. A total of 32 (21) and 46 (22) QTLs for DEC and PGWC, most of them explaining more than 10% of phenotypic variation, were detected based on single environment analysis in the T/L (L/T) population, respectively. Seven (2) and 7 (3) QTLs for DEC and PGWC were identified in the T/L (L/T) population using joint analysis across all environments, respectively. Six major QTL clusters were found on five chromosomes: 1, 2, 4, 5 and 11. The biggest cluster, at id4007289-RM252 on Chr04, was novel and included 16 and 4 QTLs detected by single environment analysis and joint mapping across all environments, respectively. The detected digenic epistatic QTLs explained up to 13% of phenotypic variation, suggesting that epistasis plays an important role in the genetic control of chalkiness in rice. QTL NILs showed that each QTL cluster had a significant effect on chalk. These chromosomal regions could be targets for MAS, fine mapping and map-based cloning for low chalkiness breeding.
Onda, Kyle; Crocker, Jonny; Kayser, Georgia Lyn; Bartram, Jamie
2013-01-01
The fields of global health and international development commonly cluster countries by geography and income to target resources and describe progress. For any given sector of interest, a range of relevant indicators can serve as a more appropriate basis for classification. We create a new typology of country clusters specific to the water and sanitation (WatSan) sector based on similarities across multiple WatSan-related indicators. After a literature review and consultation with experts in the WatSan sector, nine indicators were selected. Indicator selection was based on relevance to and suggested influence on national water and sanitation service delivery, and to maximize data availability across as many countries as possible. A hierarchical clustering method and a gap statistic analysis were used to group countries into a natural number of relevant clusters. Two stages of clustering resulted in five clusters, representing 156 countries or 6.75 billion people. The five clusters were not well explained by income or geography, and were unique from existing country clusters used in international development. Analysis of these five clusters revealed that they were more compact and well separated than United Nations and World Bank country clusters. This analysis and resulting country typology suggest that previous geography- or income-based country groupings can be improved upon for applications in the WatSan sector by utilizing globally available WatSan-related indicators. Potential applications include guiding and discussing research, informing policy, improving resource targeting, describing sector progress, and identifying critical knowledge gaps in the WatSan sector. PMID:24054545
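The grouping step described in this abstract can be illustrated with a minimal sketch of agglomerative (bottom-up) hierarchical clustering. This is not the authors' code: the average-linkage rule, the two made-up indicator values per country, and the hand-picked `n_clusters` are all assumptions for illustration. The study chose the number of clusters with a gap statistic, which is omitted here.

```python
import math

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering.

    points: list of equal-length numeric tuples (one per country).
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def avg_dist(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return [sorted(c) for c in clusters]

# Hypothetical, already-normalized indicator values for five countries.
indicators = [(0.9, 0.8), (0.85, 0.9), (0.2, 0.1), (0.25, 0.15), (0.5, 0.95)]
groups = agglomerative(indicators, n_clusters=2)
```

With these made-up values, the two low-scoring countries form one group and the remaining three the other.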
Lorenz, L; Bachem, R C; Maercker, A
2016-10-01
Adjustment disorder (AjD) is a transient mental health condition emerging after stressful life events. Its diagnostic criteria have recently been under revision, which led to the development of the Adjustment Disorder–New Module 20 (ADNM-20) as a self-report assessment. The aim was to identify a threshold value for people at high risk for AjD. As part of a randomized controlled trial evaluating a self-help manual for burglary victims, the baseline data of all participants (n=80) were analyzed. Besides the ADNM-20, participants answered self-report questionnaires regarding the external variables post-traumatic stress disorder symptomatology, depression, anxiety, and stress levels. We used cluster analysis and ROC analysis to identify the most appropriate cut-off value. The cluster analysis identified three different subgroups. They differed in their level of AjD symptomatology from low to high symptom severity. The same pattern of impairment was found for the external variables. The ROC analysis, testing the ADNM-20 sum score against the theory-based diagnostic algorithm, revealed an optimal cut-off score at 47.5 to distinguish between people at high risk for AjD and people at low risk. The ADNM-20 distinguishes between people with low, moderate, and high symptomatology. The recommendation for a cut-off score at 47.5 facilitates the use of the ADNM-20 in research and practice.
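One common way to derive a cut-off from a ROC analysis is to maximize Youden's J (sensitivity + specificity - 1) over candidate thresholds. The sketch below assumes that criterion. The abstract does not say which criterion the authors used, and the scores and labels are invented. Placing candidate thresholds at midpoints between observed integer scores is also an assumption, though it would explain how a half-point value such as 47.5 can arise.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    scores: questionnaire sum scores.
    labels: 1 = high risk according to the reference algorithm, 0 = low risk.
    A case is classified as high risk when its score exceeds the cut-off.
    """
    distinct = sorted(set(scores))
    # Candidate cut-offs: midpoints between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for c in candidates:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= c)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical sum scores and reference classifications.
scores = [30, 35, 40, 44, 47, 48, 52, 55, 60, 66]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(scores, labels)
```

In this toy data the two groups separate perfectly between 47 and 48, so the selected cut-off is the midpoint.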
Yang, Guang; Raschke, Felix; Barrick, Thomas R; Howe, Franklyn A
2015-09-01
To investigate whether nonlinear dimensionality reduction improves unsupervised classification of ¹H MRS brain tumor data compared with a linear method. In vivo single-voxel ¹H magnetic resonance spectroscopy (55 patients) and ¹H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With ¹H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of ¹H MRSI data after cluster analysis. © 2014 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Schaefer, A. M.; Daniell, J. E.; Wenzel, F.
2014-12-01
Earthquake clustering tends to be an increasingly important part of general earthquake research especially in terms of seismic hazard assessment and earthquake forecasting and prediction approaches. The distinct identification and definition of foreshocks, aftershocks, mainshocks and secondary mainshocks is taken into account using a point based spatio-temporal clustering algorithm originating from the field of classic machine learning. This can be further applied for declustering purposes to separate background seismicity from triggered seismicity. The results are interpreted and processed to assemble 3D-(x,y,t) earthquake clustering maps which are based on smoothed seismicity records in space and time. In addition, multi-dimensional Gaussian functions are used to capture clustering parameters for spatial distribution and dominant orientations. Clusters are further processed using methodologies originating from geostatistics, which have been mostly applied and developed in mining projects during the last decades. A 2.5D variogram analysis is applied to identify spatio-temporal homogeneity in terms of earthquake density and energy output. The results are mitigated using Kriging to provide an accurate mapping solution for clustering features. As a case study, seismic data of New Zealand and the United States is used, covering events since the 1950s, from which an earthquake cluster catalogue is assembled for most of the major events, including a detailed analysis of the Landers and Christchurch sequences.
Semi-supervised clustering methods.
Bair, Eric
2013-01-01
Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable, nor is anything known about the relationship between the observations in the data set. In many situations, however, information about the clusters is available in addition to the values of the features. For example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. In other cases, one may wish to identify clusters that are associated with a particular outcome variable. This review describes several clustering algorithms (known as "semi-supervised clustering" methods) that can be applied in these situations. The majority of these methods are modifications of the popular k-means clustering method, and several of them will be described in detail. A brief description of some other semi-supervised clustering algorithms is also provided.
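As an illustration of the case where some cluster labels are known, the following sketch shows one simple modification of k-means: centroids are initialized from the labeled observations, and those observations keep their known labels throughout. It is a generic sketch, not necessarily one of the specific algorithms covered in the review. The function name `seeded_kmeans`, its parameters, and the data are hypothetical.

```python
import math

def _mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))

def seeded_kmeans(points, seeds, k, iterations=20):
    """k-means with partial labels.

    points: list of numeric tuples.
    seeds: dict mapping point index -> known cluster id in range(k).
           Assumption: every cluster has at least one seed.
    Returns (assignment, centroids).
    """
    # Initialize each centroid as the mean of its labeled seed points.
    centroids = [
        _mean([points[i] for i, lab in seeds.items() if lab == c])
        for c in range(k)
    ]
    assignment = [None] * len(points)
    for _ in range(iterations):
        new_assignment = []
        for i, p in enumerate(points):
            if i in seeds:
                new_assignment.append(seeds[i])  # known labels stay fixed
            else:
                nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
                new_assignment.append(nearest)
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = _mean(members)
    return assignment, centroids

# Hypothetical data: two well-separated groups, one labeled point in each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
assignment, centroids = seeded_kmeans(points, seeds={0: 0, 3: 1}, k=2)
```

Fixing the seed labels corresponds to treating the known labels as hard constraints. Dropping the `if i in seeds` branch would use them only for initialization.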
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of data, type of variables, and purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are examined using several medical examples. We present two clustering examples for ordinal variables, a more challenging variable type to analyze, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
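The performance figures quoted above (sensitivity, specificity, accuracy) follow from comparing cluster assignments with the gold-standard labels. A minimal sketch of that computation, using invented labels rather than the WBCD data, and assuming the clusters have already been mapped to "malignant" (1) and "benign" (0):

```python
def diagnostic_metrics(predicted, truth):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    predicted: cluster-derived labels (1 = malignant, 0 = benign).
    truth: gold-standard labels in the same order.
    """
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    sensitivity = tp / (tp + fn)   # share of true malignant cases found
    specificity = tn / (tn + fp)   # share of true benign cases found
    accuracy = (tp + tn) / len(truth)
    return sensitivity, specificity, accuracy

# Hypothetical labels for eight patients.
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
truth = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec, acc = diagnostic_metrics(predicted, truth)
```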
Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia
2016-04-01
Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
Caesar, Lindsay K; Kvalheim, Olav M; Cech, Nadja B
2018-08-27
Mass spectral data sets often contain experimental artefacts, and data filtering prior to statistical analysis is crucial to extract reliable information. This is particularly true in untargeted metabolomics analyses, where the analyte(s) of interest are not known a priori. It is often assumed that chemical interferents (i.e. solvent contaminants such as plasticizers) are consistent across samples, and can be removed by background subtraction from blank injections. On the contrary, it is shown here that chemical contaminants may vary in abundance across each injection, potentially leading to their misidentification as relevant sample components. With this metabolomics study, we demonstrate the effectiveness of hierarchical cluster analysis (HCA) of replicate injections (technical replicates) as a methodology to identify chemical interferents and reduce their contaminating contribution to metabolomics models. Pools of metabolites with varying complexity were prepared from the botanical Angelica keiskei Koidzumi and spiked with known metabolites. Each set of pools was analyzed in triplicate and at multiple concentrations using ultraperformance liquid chromatography coupled to mass spectrometry (UPLC-MS). Before filtering, HCA failed to cluster replicates in the data sets. To identify contaminant peaks, we developed a filtering process that evaluated the relative peak area variance of each variable within triplicate injections. These interferent peaks were found across all samples, but did not show consistent peak area from injection to injection, even when evaluating the same chemical sample. This filtering process identified 128 ions that appear to originate from the UPLC-MS system. Data sets collected for a high number of pools with comparatively simple chemical composition were highly influenced by these chemical interferents, as were samples that were analyzed at a low concentration. 
When chemical interferent masses were removed, technical replicates clustered in all data sets. This work highlights the importance of technical replication in mass spectrometry-based studies, and presents a new application of HCA as a tool for evaluating the effectiveness of data filtering prior to statistical analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
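The filtering idea in this abstract, flagging ions whose peak area is inconsistent across replicate injections of the same sample, can be sketched as follows. The abstract does not give the exact statistic or threshold. The relative standard deviation (RSD) and the 0.3 cut-off used here are assumptions, and the ion names and peak areas are invented.

```python
from statistics import mean, stdev

def flag_inconsistent_ions(peak_areas, max_rsd=0.3):
    """Return ions whose relative standard deviation exceeds max_rsd.

    peak_areas: dict mapping ion label -> list of peak areas from
                replicate injections of the same sample.
    max_rsd: hypothetical threshold on stdev / mean.
    """
    flagged = []
    for ion, areas in peak_areas.items():
        m = mean(areas)
        if m == 0:
            continue  # ion not detected in any replicate
        rsd = stdev(areas) / m
        if rsd > max_rsd:
            flagged.append(ion)
    return sorted(flagged)

# Hypothetical triplicate injections of one sample.
triplicates = {
    "ion_A": [1000, 1020, 990],    # stable: likely a real sample component
    "ion_B": [200, 1500, 50],      # erratic: candidate interferent
    "ion_C": [5000, 5100, 4950],   # stable
}
suspect = flag_inconsistent_ions(triplicates)
```

Ions flagged this way would then be removed before running hierarchical cluster analysis on the replicates.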
NASA Astrophysics Data System (ADS)
Kong, Xiangzhen; He, Wei; Qin, Ning; He, Qishuang; Yang, Bin; Ouyang, Huiling; Wang, Qingmei; Xu, Fuliu
2013-03-01
Trajectory cluster analysis, including the two-stage cluster method based on Euclidean metrics and the one-stage clustering method based on Mahalanobis metrics and self-organizing maps (SOM), was applied and compared to identify the transport pathways of PM10 for the cities of Chaohu and Hefei, both located near Lake Chaohu in China. The two-stage cluster method was modified to further investigate the long trajectories in the second stage in order to eliminate the observed disaggregation among them. Twelve trajectory clusters were identified for both cities. The one-stage clustering method based on Mahalanobis metrics gives the best performance regarding the variances within clusters. The results showed that local PM10 emission was one of the most important sources in both cities and that the local emission in Hefei was higher than in Chaohu. In addition, Chaohu suffered greater effects from the eastern region (Yangtze River Delta, YRD) than Hefei. On the other hand, the long-range transportation from the northwestern pathway had a higher influence on the PM10 level in Hefei. Receptor models, including potential source contribution function (PSCF) and residence time weighted concentrations (RTWC), were utilized to identify the potential source locations of PM10 for both cities. However, the combined PSCF and RTWC results for the two cities provided PM10 source locations that were more consistent with the results of transport pathways and the total anthropogenic PM10 emission inventory. This indicates that the combined method's ability to identify the source regions is superior to that of the individual PSCF or RTWC methods. Henan and Shanxi Provinces and the YRD were important PM10 source regions for the two cities, but the Henan and Shanxi area was more important for Hefei than for Chaohu, while the YRD region was less important. 
In addition, the PSCF, RTWC and the combined results all had higher correlation coefficients with PM10 emission from traffic than from industry, electricity generation or residential sources, suggesting the relatively higher contribution of traffic emissions to the PM10 pollution in Lake Chaohu.
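For readers unfamiliar with the potential source contribution function, its standard definition is PSCF = m / n per grid cell, where n is the number of back-trajectory endpoints falling in the cell and m is the number of those endpoints belonging to trajectories that arrived when the receptor concentration exceeded a threshold. The sketch below implements only that ratio. The grid size, threshold, and endpoint data are invented, and the weighting schemes often applied to cells with few endpoints are omitted.

```python
import math
from collections import defaultdict

def pscf(endpoints, threshold, cell_size=1.0):
    """Compute PSCF per grid cell.

    endpoints: iterable of (lon, lat, receptor_concentration), one entry per
               trajectory endpoint, tagged with the concentration measured at
               the receptor when that trajectory arrived.
    threshold: concentration above which a trajectory counts as "polluted".
    Returns dict mapping (lon_index, lat_index) -> m / n.
    """
    n = defaultdict(int)  # all endpoints per cell
    m = defaultdict(int)  # endpoints from polluted trajectories per cell
    for lon, lat, conc in endpoints:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        n[cell] += 1
        if conc > threshold:
            m[cell] += 1
    return {cell: m[cell] / n[cell] for cell in n}

# Hypothetical endpoints (degrees east, degrees north, PM10 at the receptor).
endpoints = [
    (117.2, 31.8, 150), (117.6, 31.1, 90), (117.9, 31.5, 160),
    (113.5, 34.7, 170), (113.1, 34.2, 180),
]
result = pscf(endpoints, threshold=120)
```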
Hyde, Jonathan M; DaCosta, Gérald; Hatzoglou, Constantinos; Weekes, Hannah; Radiguet, Bertrand; Styman, Paul D; Vurpillot, Francois; Pareige, Cristelle; Etienne, Auriane; Bonny, Giovanni; Castin, Nicolas; Malerba, Lorenzo; Pareige, Philippe
2017-04-01
Irradiation of reactor pressure vessel (RPV) steels causes the formation of nanoscale microstructural features (termed radiation damage), which affect the mechanical properties of the vessel. A key tool for characterizing these nanoscale features is atom probe tomography (APT), due to its high spatial resolution and the ability to identify different chemical species in three dimensions. Microstructural observations using APT can underpin development of a mechanistic understanding of defect formation. However, with atom probe analyses there are currently multiple methods for analyzing the data. This can result in inconsistencies between results obtained from different researchers and unnecessary scatter when combining data from multiple sources. This makes interpretation of results more complex and calibration of radiation damage models challenging. In this work simulations of a range of different microstructures are used to directly compare different cluster analysis algorithms and identify their strengths and weaknesses.
Wang, Zhao-Xin; Li, Shu-Ming; Heide, Lutz
2000-01-01
The biosynthetic gene cluster of the aminocoumarin antibiotic coumermycin A1 was cloned by screening of a cosmid library of Streptomyces rishiriensis DSM 40489 with heterologous probes from a dTDP-glucose 4,6-dehydratase gene, involved in deoxysugar biosynthesis, and from the aminocoumarin resistance gyrase gene gyrBr. Sequence analysis of a 30.8-kb region upstream of gyrBr revealed the presence of 28 complete open reading frames (ORFs). Fifteen of the identified ORFs showed, on average, 84% identity to corresponding ORFs in the biosynthetic gene cluster of novobiocin, another aminocoumarin antibiotic. Possible functions of 17 ORFs in the biosynthesis of coumermycin A1 could be assigned by comparison with sequences in GenBank. Experimental proof for the function of the identified gene cluster was provided by an insertional gene inactivation experiment, which resulted in an abolishment of coumermycin A1 production. PMID:11036020
Odoi, Agricola; Wray, Ron; Emo, Marion; Birch, Stephen; Hutchison, Brian; Eyles, John; Abernathy, Tom
2005-01-01
Background Population health planning aims to improve the health of the entire population and to reduce health inequities among population groups. Socioeconomic factors are increasingly being recognized as major determinants of many aspects of health and causes of health inequities. Knowledge of socioeconomic characteristics of neighbourhoods is necessary to identify their unique health needs and enhance identification of socioeconomically disadvantaged populations. Careful integration of this knowledge into health planning activities is necessary to ensure that health planning and service provision are tailored to unique neighbourhood population health needs. In this study, we identify unique neighbourhood socioeconomic characteristics and classify the neighbourhoods based on these characteristics. Principal components analysis (PCA) of 18 socioeconomic variables was used to identify the principal components explaining most of the variation in socioeconomic characteristics across the neighbourhoods. Cluster analysis was used to classify neighbourhoods based on their socioeconomic characteristics. Results Results of the PCA and cluster analysis were similar but the latter were more objective and easier to interpret. Five neighbourhood types with distinguishing socioeconomic and demographic characteristics were identified. The methodology provides a more complete picture of the neighbourhood socioeconomic characteristics than when a single variable (e.g. income) is used to classify neighbourhoods. Conclusion Cluster analysis is useful for generating neighbourhood population socioeconomic and demographic characteristics that can be useful in guiding neighbourhood health planning and service provision. 
This study is the first of a series of studies designed to investigate health inequalities at the neighbourhood level with a view to providing evidence-base for health planners, service providers and policy makers to help address health inequity issues at the neighbourhood level. Subsequent studies will investigate inequalities in health outcomes both within and across the neighbourhood types identified in the current study. PMID:16092969
Carvalho, Carolina Abreu de; Fonsêca, Poliana Cristina de Almeida; Nobre, Luciana Neri; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2016-01-01
The objective of this study is to provide guidance for identifying dietary patterns using the a posteriori approach, and to analyze the methodological aspects of the studies conducted in Brazil that identified the dietary patterns of children. Articles were selected from the Latin American and Caribbean Literature on Health Sciences, Scientific Electronic Library Online and Pubmed databases. The key words were: Dietary pattern; Food pattern; Principal Components Analysis; Factor analysis; Cluster analysis; Reduced rank regression. We included studies that identified dietary patterns of children using the a posteriori approach. Seven studies published between 2007 and 2014 were selected, six of which were cross-sectional and one a cohort study. Five studies used the food frequency questionnaire for dietary assessment; one used a 24-hour dietary recall and the other a food list. The exploratory method used in most publications was principal components factor analysis, followed by cluster analysis. The sample size of the studies ranged from 232 to 4231, the values of the Kaiser-Meyer-Olkin test from 0.524 to 0.873, and Cronbach's alpha from 0.51 to 0.69. Few Brazilian studies identified dietary patterns of children using the a posteriori approach, and principal components factor analysis was the technique most used.
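Cronbach's alpha, reported above as an internal-consistency measure, has the standard form alpha = k/(k-1) × (1 - Σ item variances / variance of total score) for k items. A minimal sketch with invented responses follows. How the reviewed studies grouped food items before computing alpha is not stated in the abstract.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: list of k lists, each holding one item's scores for the same
           respondents in the same order (sample variance is used).
    """
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    totals = [sum(resp) for resp in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical intake scores: three food items, five respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(items)
```

These made-up items are almost perfectly correlated, so alpha comes out close to 1. The values of 0.51 to 0.69 reported in the review indicate much weaker consistency.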
antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters
Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko
2015-01-01
Abstract Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579
HIV Transmission Networks in the San Diego–Tijuana Border Region
Mehta, Sanjay R.; Wertheim, Joel O.; Brouwer, Kimberly C.; Wagner, Karla D.; Chaillon, Antoine; Strathdee, Steffanie; Patterson, Thomas L.; Rangel, Maria G.; Vargas, Mlenka; Murrell, Ben; Garfein, Richard; Little, Susan J.; Smith, Davey M.
2015-01-01
Background HIV sequence data can be used to reconstruct local transmission networks. Along international borders, like the San Diego–Tijuana region, understanding the dynamics of HIV transmission across reported risks, racial/ethnic groups, and geography can help direct effective prevention efforts on both sides of the border. Methods We gathered sociodemographic, geographic, clinical, and viral sequence data from HIV infected individuals participating in ten studies in the San Diego–Tijuana border region. Phylogenetic and network analysis was performed to infer putative relationships between HIV sequences. Correlates of identified clusters were evaluated and spatiotemporal relationships were explored using Bayesian phylogeographic analysis. Findings After quality filtering, 843 HIV sequences with associated demographic data and 263 background sequences from the region were analyzed, and 138 clusters were inferred (2–23 individuals). Overall, the rate of clustering did not differ by ethnicity, residence, or sex, but bisexuals were less likely to cluster than heterosexuals or men who have sex with men (p = 0.043), and individuals identifying as white (p ≤ 0.01) were more likely to cluster than other races. Clustering individuals were also 3.5 years younger than non-clustering individuals (p < 0.001). Although the sampled San Diego and Tijuana epidemics were phylogenetically compartmentalized, five clusters contained individuals residing on both sides of the border. Interpretation This study sampled ~ 7% of HIV infected individuals in the border region, and although the sampled networks on each side of the border were largely separate, there was evidence of persistent bidirectional cross-border transmissions that linked risk groups, thus highlighting the importance of the border region as a “melting pot” of risk groups. Funding NIH, VA, and Pendleton Foundation. PMID:26629540
Equivalent damage validation by variable cluster analysis
NASA Astrophysics Data System (ADS)
Drago, Carlo; Ferlito, Rachele; Zucconi, Maria
2016-06-01
The main aim of this work is to perform a cluster analysis of the damage surveyed in the old center of L'Aquila after the earthquake of April 6, 2009, and to validate an Indicator of Equivalent Damage (ED) that summarizes the information reported on the AeDES card regarding the level of damage and its extent over the surface of the buildings. In particular, we used a sample of 13442 masonry buildings located in an area characterized by a Macroseismic Intensity equal to 8 [1]. The aim is to ensure coherence between the clusters, and their hierarchy, identified in the surveyed damage data and in the computed ED data.
A Detailed Survey of Pulsating Variables in Five Globular Clusters (Abstract)
NASA Astrophysics Data System (ADS)
Murphy, B. W.
2016-12-01
(Abstract only) Globular clusters are ideal laboratories for conducting a stellar census. Of particular interest are pulsating variables, which provide astronomers with a tool to probe the properties of the stars and the cluster. We observed each of five globular clusters hundreds to thousands of times over a time span ranging from 2 to 4 years in B, V, and I filters using the SARA 0.6-meter telescope located at Cerro Tololo Interamerican Observatory and the 0.9-meter telescope located at Kitt Peak, Arizona. The images were analyzed using difference image analysis to identify and produce light curves of all variables found in each cluster. In total we identified 377 variables, 140 of them newly discovered, increasing the number of known variable stars in these clusters by 60%. Of the total we have identified 319 RR Lyrae variables (193 RR0, 18 RR01, 101 RR1, 7 RR2), 9 SX Phe stars, 5 Cepheid variables, 11 eclipsing variables, and 33 long period variables. For IC4499 we identified 64 RR0, 18 RR01, 14 RR1, 4 RR2, 1 SX Phe, 1 eclipsing binary, and 2 long period variables. For NGC4833 we identified 10 RR0, 7 RR1, 3 RR2, 6 SX Phe, 5 eclipsing binaries, and 9 long period variables. For NGC6171 (M107) we identified 14 RR0, 7 RR1, and 1 SX Phe. For NGC6402 (M14) we identified 55 RR0, 57 RR1, 1 RR2, 1 SX Phe, 6 Cepheids, 1 eclipsing binary, and 15 long period variables. For NGC6584 we identified 50 RR0, 16 RR1, 4 eclipsing binaries, and 7 long period variables. From our extensive data set we were able to obtain sufficient temporal and complete phase coverage of the RR Lyrae variables. This has allowed us not only to properly classify each of the RR Lyrae variables but also to use Fourier decomposition of the B, V, and I light curves to further analyze the properties of the variable stars and hence the physical properties of each globular cluster.
Spatiotemporal analysis of dengue fever in Nepal from 2010 to 2014.
Acharya, Bipin Kumar; Cao, ChunXiang; Lakes, Tobia; Chen, Wei; Naeem, Shahid
2016-08-22
Due to recent emergence, dengue is becoming one of the major public health problems in Nepal. The numbers of reported dengue cases in general and the area with reported dengue cases are both continuously increasing in recent years. However, spatiotemporal patterns and clusters of dengue have not been investigated yet. This study aims to fill this gap by analyzing spatiotemporal patterns based on monthly surveillance data aggregated at district. Dengue cases from 2010 to 2014 at district level were collected from the Nepal government's health and mapping agencies respectively. GeoDa software was used to map crude incidence, excess hazard and spatially smoothed incidence. Cluster analysis was performed in SaTScan software to explore spatiotemporal clusters of dengue during the above-mentioned time period. Spatiotemporal distribution of dengue fever in Nepal from 2010 to 2014 was mapped at district level in terms of crude incidence, excess risk and spatially smoothed incidence. Results show that the distribution of dengue fever was not random but clustered in space and time. Chitwan district was identified as the most likely cluster and Jhapa district was the first secondary cluster in both spatial and spatiotemporal scan. July to September of 2010 was identified as a significant temporal cluster. This study assessed and mapped for the first time the spatiotemporal pattern of dengue fever in Nepal. Two districts namely Chitwan and Jhapa were found highly affected by dengue fever. The current study also demonstrated the importance of geospatial approach in epidemiological research. The initial result on dengue patterns and risk of this study may assist institutions and policy makers to develop better preventive strategies.
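Two of the mapped quantities, crude incidence and excess risk, are simple ratios. The sketch below assumes the usual definitions: crude incidence is cases divided by population, and excess risk is observed cases divided by the cases expected if every district had the overall rate. District names, case counts, and populations are invented, and the spatial smoothing and SaTScan scan statistics mentioned in the abstract are not reproduced.

```python
def incidence_and_excess_risk(cases, population, per=100_000):
    """Return {district: (crude incidence per `per` people, excess risk)}.

    cases, population: dicts keyed by district name.
    Excess risk = observed cases / expected cases, where expected cases
    assume the overall (all-district) rate applies everywhere.
    """
    overall_rate = sum(cases.values()) / sum(population.values())
    result = {}
    for district, observed in cases.items():
        crude = observed / population[district] * per
        expected = overall_rate * population[district]
        result[district] = (crude, observed / expected)
    return result

# Hypothetical district totals over the study period.
cases = {"District A": 30, "District B": 10, "District C": 0}
population = {"District A": 100_000, "District B": 200_000, "District C": 100_000}
rates = incidence_and_excess_risk(cases, population)
```

An excess risk above 1 marks a district with more cases than the overall rate would predict, and a value below 1 marks one with fewer.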
Çağlar, Emine; Aşçı, F. Hülya
2010-01-01
The primary purpose of the present study was to identify motivational profiles of adolescent athletes using cluster analysis in a non-Western culture. A second purpose was to examine relationships between physical self-perception differences of adolescent athletes and motivational profiles. One hundred and thirty-six male (Mage = 17.46, SD = 1.25 years) and 80 female adolescent athletes (Mage = 17.61, SD = 1.19 years) from a variety of team sports including basketball, soccer, volleyball, and handball volunteered to participate in this study. The Sport Motivation Scale (SMS) and Physical Self-Perception Profile (PSPP) were administered to all participants. Hierarchical cluster analysis revealed a four-cluster solution for this sample: amotivated, low motivated, moderately motivated, and highly motivated. A 4 x 5 (Cluster x PSPP Subscales) MANOVA revealed no significant main effect of motivational clusters on physical self-perception levels (p > 0.05). As a result, findings of the present study showed that motivational types of the adolescent athletes constituted four different motivational clusters. Highly and moderately motivated athletes consistently scored higher than amotivated athletes on the perceived sport competence, physical condition, and physical self-worth subscales of PSPP. This study identified motivational profiles of competitive youth-sport participants. Key points: Highly motivated athletes tend to perceive themselves as competent in psychomotor domains compared with amotivated athletes. As athletes feel more competent in the psychomotor domain, they are more intrinsically motivated. The information about motivational profiles of adolescent athletes could be used for developing strategies and interventions designed to improve the strength and quality of sport participants’ motivation. PMID:24149690
Voigt, Andrew P.; Brodersen, Lisa Eidenschink; Alonzo, Todd A.; Gerbing, Robert B.; Menssen, Andrew J.; Wilson, Elisabeth R.; Kahwash, Samir; Raimondi, Susana C.; Hirsch, Betsy A.; Gamis, Alan S.; Meshinchi, Soheil; Wells, Denise A.; Loken, Michael R.
2017-01-01
Diagnostic biomarkers can be used to determine relapse risk in acute myeloid leukemia, and certain genetic aberrancies have prognostic relevance. A diagnostic immunophenotypic expression profile, which quantifies the amounts of distinct gene products, not just their presence or absence, was established in order to improve outcome prediction for patients with acute myeloid leukemia. The immunophenotypic expression profile, which defines each patient’s leukemia as a location in 15-dimensional space, was generated for 769 patients enrolled in the Children’s Oncology Group AAML0531 protocol. Unsupervised hierarchical clustering grouped patients with similar immunophenotypic expression profiles into eleven patient cohorts, demonstrating high associations among phenotype, genotype, morphology, and outcome. Of 95 patients with inv(16), 79% segregated in Cluster A. Of 109 patients with t(8;21), 92% segregated in Clusters A and B. Of 152 patients with 11q23 alterations, 78% segregated in Clusters D, E, F, G, or H. For both inv(16) and 11q23 abnormalities, differential phenotypic expression identified patient groups with different survival characteristics (P<0.05). Clinical outcome analysis revealed that Cluster B (predominantly t(8;21)) was associated with favorable outcome (P<0.001) and Clusters E, G, H, and K were associated with adverse outcomes (P<0.05). Multivariable regression analysis revealed that Clusters E, G, H, and K were independently associated with worse survival (P range <0.001 to 0.008). The Children’s Oncology Group AAML0531 trial: clinicaltrials.gov Identifier: 00372593. PMID:28883080
Replicating cluster subtypes for the prevention of adolescent smoking and alcohol use.
Babbin, Steven F; Velicer, Wayne F; Paiva, Andrea L; Brick, Leslie Ann D; Redding, Colleen A
2015-01-01
Substance abuse interventions tailored to the individual level have produced effective outcomes for a wide variety of behaviors. One approach to enhancing tailoring involves using cluster analysis to identify prevention subtypes that represent different attitudes about substance use. This study applied this approach to better understand tailored interventions for smoking and alcohol prevention. Analyses were performed on a sample of sixth graders from 20 New England middle schools involved in a 36-month tailored intervention study. Most adolescents reported being in the Acquisition Precontemplation (aPC) stage at baseline: not smoking or not drinking and not planning to start in the next six months. For smoking (N=4059) and alcohol (N=3973), each sample was randomly split into five subsamples. Cluster analysis was performed within each subsample based on three variables: Pros and Cons (from Decisional Balance Scales), and Situational Temptations. Across all subsamples for both smoking and alcohol, the following four clusters were identified: (1) Most Protected (MP; low Pros, high Cons, low Temptations); (2) Ambivalent (AM; high Pros, average Cons and Temptations); (3) Risk Denial (RD; average Pros, low Cons, average Temptations); and (4) High Risk (HR; high Pros, low Cons, and very high Temptations). Finding the same four clusters within aPC for both smoking and alcohol, replicating the results across the five subsamples, and demonstrating hypothesized relations among the clusters with additional external validity analyses provide strong evidence of the robustness of these results. These clusters demonstrate evidence of validity and can provide a basis for tailoring interventions. Copyright © 2014. Published by Elsevier Ltd.
Replicating cluster subtypes for the prevention of adolescent smoking and alcohol use
Babbin, Steven F.; Velicer, Wayne F.; Paiva, Andrea L.; Brick, Leslie Ann D.; Redding, Colleen A.
2015-01-01
Introduction Substance abuse interventions tailored to the individual level have produced effective outcomes for a wide variety of behaviors. One approach to enhancing tailoring involves using cluster analysis to identify prevention subtypes that represent different attitudes about substance use. This study applied this approach to better understand tailored interventions for smoking and alcohol prevention. Methods Analyses were performed on a sample of sixth graders from 20 New England middle schools involved in a 36-month tailored intervention study. Most adolescents reported being in the Acquisition Precontemplation (aPC) stage at baseline: not smoking or not drinking and not planning to start in the next six months. For smoking (N= 4059) and alcohol (N= 3973), each sample was randomly split into five subsamples. Cluster analysis was performed within each subsample based on three variables: Pros and Cons (from Decisional Balance Scales), and Situational Temptations. Results Across all subsamples for both smoking and alcohol, the following four clusters were identified: (1) Most Protected (MP; low Pros, high Cons, low Temptations); (2) Ambivalent (AM; high Pros, average Cons and Temptations); (3) Risk Denial (RD; average Pros, low Cons, average Temptations); and (4) High Risk (HR; high Pros, low Cons, and very high Temptations). Conclusions Finding the same four clusters within aPC for both smoking and alcohol, replicating the results across the five subsamples, and demonstrating hypothesized relations among the clusters with additional external validity analyses provide strong evidence of the robustness of these results. These clusters demonstrate evidence of validity and can provide a basis for tailoring interventions. PMID:25222849
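The replication design described here, a random split into five subsamples with a separate cluster analysis in each, can be sketched as follows. The function name, the seed, and the use of integer ids are assumptions for illustration; the abstract does not say how the split was implemented.

```python
import random


def split_into_subsamples(ids, k=5, seed=0):
    """Shuffle participant ids and deal them into k disjoint subsamples."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    # Taking every k-th id keeps subsample sizes within one of each other.
    return [shuffled[i::k] for i in range(k)]


# Integer ids stand in for the N=4059 participants of the smoking sample.
subsamples = split_into_subsamples(range(4059), k=5)
sizes = [len(s) for s in subsamples]
```

Each subsample would then be clustered independently, and finding the same four profiles in every subsample is what supports the claim of replication.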
Zakharov, A.; Vitale, C.; Kilinc, E.; Koroleva, K.; Fayuk, D.; Shelukhina, I.; Naumenko, N.; Skorinkin, A.; Khazipov, R.; Giniatullin, R.
2015-01-01
Trigeminal nerves in meninges are implicated in generation of nociceptive firing underlying migraine pain. However, the neurochemical mechanisms of nociceptive firing in meningeal trigeminal nerves are little understood. In this study, using suction electrode recordings from peripheral branches of the trigeminal nerve in isolated rat meninges, we analyzed spontaneous and capsaicin-induced orthodromic spiking activity. In control, biphasic single spikes with variable amplitude and shapes were observed. Application of the transient receptor potential vanilloid 1 (TRPV1) agonist capsaicin to meninges dramatically increased firing whereas the amplitudes and shapes of spikes remained essentially unchanged. This effect was antagonized by the specific TRPV1 antagonist capsazepine. Using the clustering approach, several groups of uniform spikes (clusters) were identified. The clustering approach combined with capsaicin application allowed us to detect and to distinguish “responder” (65%) from “non-responder” clusters (35%). Notably, responders fired spikes at frequencies exceeding 10 Hz, high enough to provide postsynaptic temporal summation of excitation at brainstem and spinal cord level. Almost all spikes were suppressed by tetrodotoxin (TTX) suggesting an involvement of the TTX-sensitive sodium channels in nociceptive signaling at the peripheral branches of trigeminal neurons. Our analysis also identified transient (desensitizing) and long-lasting (slowly desensitizing) responses to the continuous application of capsaicin. Thus, the persistent activation of nociceptors in capsaicin-sensitive nerve fibers shown here may be involved in trigeminal pain signaling and plasticity along with the release of migraine-related neuropeptides from TRPV1 positive neurons. Furthermore, cluster analysis could be widely used to characterize the temporal and neurochemical profiles of other pain transducers likely implicated in migraine. PMID:26283923
Evolution of coding and non-coding genes in HOX clusters of a marsupial.
Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B
2012-06-18
The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and of non-protein coding genes, including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS, which play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified, and one new candidate microRNA with a typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The microRNA target prediction analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.
Evolution of coding and non-coding genes in HOX clusters of a marsupial
2012-01-01
Background The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and of non-protein coding genes, including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS, which play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified, and one new candidate microRNA with a typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The microRNA target prediction analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672
Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.
Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J
2017-12-01
Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. 
Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the limitations, trade-offs, and considerations for the sensitivities of variables and the biological interpretations of results. The Cluster app is available as a free download for Apple computers at iTunes, with a link to a user guide website.
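The randomization idea behind the test of spatial randomness can be sketched in a few lines. This is a simplified illustration, not the Cluster app's implementation: it assumes a rectangular frame, ignores cluster size and orientation, and reports a two-sided Monte Carlo p value instead of the t test named in the abstract. The centroid coordinates are hypothetical.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of centroids."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def randomization_test(observed, width, height, n_sim=1000, seed=0):
    """Compare the observed mean inter-centroid distance with random maps.

    Returns (observed mean, mean of simulated means, two-sided Monte Carlo p).
    """
    rng = random.Random(seed)
    obs = mean_pairwise_distance(observed)
    sims = []
    for _ in range(n_sim):
        # Place the same number of centroids uniformly inside the frame.
        random_map = [(rng.uniform(0, width), rng.uniform(0, height))
                      for _ in observed]
        sims.append(mean_pairwise_distance(random_map))
    sim_mean = sum(sims) / n_sim
    # Count simulated maps at least as far from the simulated mean
    # as the observed map is.
    extreme = sum(abs(s - sim_mean) >= abs(obs - sim_mean) for s in sims)
    p_value = (extreme + 1) / (n_sim + 1)
    return obs, sim_mean, p_value


# Hypothetical centroids packed into one corner of a 100 x 100 frame.
centroids = [(1, 1), (2, 3), (3, 1), (4, 4), (2, 2), (5, 3)]
obs, sim_mean, p = randomization_test(centroids, 100, 100)
```

An observed mean distance far below the simulated mean, with a small p value, argues against the null hypothesis that the clusters are randomly distributed within the frame.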
Spatial autocorrelation analysis of health care hotspots in Taiwan in 2006
2009-01-01
Background Spatial analytical techniques and models are often used in epidemiology to identify spatial anomalies (hotspots) in disease regions. These analytical approaches can be used to not only identify the location of such hotspots, but also their spatial patterns. Methods In this study, we utilize spatial autocorrelation methodologies, including Global Moran's I and Local Getis-Ord statistics, to describe and map spatial clusters, and areas in which these are situated, for the 20 leading causes of death in Taiwan. In addition, we use the fit to a logistic regression model to test the characteristics of similarity and dissimilarity by gender. Results Gender is compared in efforts to formulate the common spatial risk. The mean found by local spatial autocorrelation analysis is utilized to identify spatial cluster patterns. There is naturally great interest in discovering the relationship between the leading causes of death and well-documented spatial risk factors. For example, in Taiwan, we found the geographical distribution of clusters where there is a prevalence of tuberculosis to closely correspond to the location of aboriginal townships. Conclusions Cluster mapping helps to clarify issues such as the spatial aspects of both internal and external correlations for leading health care events. This is of great aid in assessing spatial risk factors, which in turn facilitates the planning of the most advantageous types of health care policies and implementation of effective health care services. PMID:20003460
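Global Moran's I, one of the two autocorrelation statistics named in this abstract, can be computed directly from its standard definition. The sketch below assumes a binary contiguity weights matrix and uses hypothetical values for four areas in a row; it is not the Taiwan mortality data, and the study's choice of weights is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a spatial weights matrix.

    weights[i][j] > 0 when regions i and j are neighbours, 0 otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four hypothetical townships in a row; each borders only its neighbours.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
smooth = morans_i([1, 2, 3, 4], chain)       # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)  # neighbours always differ
```

Positive values indicate that similar rates sit next to each other, which is the global signal that a local statistic such as Getis-Ord then resolves into individual hotspots.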
Zhang, Min; Jia, Dijing; Li, Hanping; Gui, Tao; Jia, Lei; Wang, Xiaolin; Li, Tianyi; Liu, Yongjian; Bao, Zuoyi; Liu, Siyang; Zhuang, Daomin; Li, Jingyun; Li, Lin
2017-10-01
CRF07_BC originally formed in Yunnan province of China in the 1980s and spread quickly among injecting drug users (IDUs). In recent years, it has been introduced into men who have sex with men (MSM) and has become the dominant strain in China. In this study, we performed a comprehensive phylodynamic analysis of CRF07_BC sequences from China. All CRF07_BC sequences identified in China were retrieved from a database. Additional sequences obtained in our laboratory were added to make the dataset more representative. A maximum-likelihood (ML) tree was constructed with PhyML 3.0. The maximum clade credibility (MCC) tree and effective population size were estimated using the Markov chain Monte Carlo sampling method in BEAST software. A total of 610 CRF07_BC sequences covering 1,473 bp of the gag gene (positions 817 to 2,289 according to the HXB2 calculator) were included in the dataset. Three epidemic clusters were identified; two clusters comprised sequences from IDUs, while one cluster mainly contained sequences from MSM. The time of the most recent common ancestor of the cluster composed of sequences from MSM was estimated to be 2000. Two waves of rapid growth in the effective population size of CRF07_BC infections were identified in the skyline plot. The second wave coincided with the expansion of the MSM cluster. The results indicate that controlling CRF07_BC infections in MSM would help to reduce the epidemic in China.
Serratia marcescens Bacteremia: Nosocomial Cluster Following Narcotic Diversion.
Schuppener, Leah M; Pop-Vicas, Aurora E; Brooks, Erin G; Duster, Megan N; Crnich, Christopher J; Sterkel, Alana K; Webb, Aaron P; Safdar, Nasia
2017-09-01
OBJECTIVE To describe the investigation and control of a cluster of Serratia marcescens bacteremia in a 505-bed tertiary-care center. METHODS Cluster cases were defined as all patients with S. marcescens bacteremia between March 2 and April 7, 2014, who were found to have identical or related blood isolates determined by molecular typing with pulsed-field gel electrophoresis. Cases were compared using bivariate analysis with controls admitted at the same time and to the same service as the cases, in a 4:1 ratio. RESULTS In total, 6 patients developed S. marcescens bacteremia within 48 hours after admission within the above period. Of these, 5 patients had identical Serratia isolates determined by molecular typing, and were included in a case-control study. Exposure to the post-anesthesia care unit was a risk factor identified in bivariate analysis. Evidence of tampered opioid-containing syringes on several hospital units was discovered soon after the initial cluster case presented, and a full narcotic diversion investigation was conducted. A nurse working in the post-anesthesia care unit was identified as the employee responsible for the drug diversion and was epidemiologically linked to all 5 patients in the cluster. No further cases were identified once the implicated employee's job was terminated. CONCLUSION Illicit drug use by healthcare workers remains an important mechanism for the development of bloodstream infections in hospitalized patients. Active mechanisms and systems should remain in place to prevent, detect, and control narcotic drug diversions and associated patient harm in the healthcare setting. Infect Control Hosp Epidemiol 2017;38:1027-1031.
Cao, Zhen; Wang, Zhenjie; Shang, Zhonglin; Zhao, Jiancheng
2017-01-01
Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to distinguish Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples in the range from 4000 cm-1 to 600 cm-1 were obtained. The second-derivative transformation was used to resolve small and closely spaced absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on spectral similarity. Principal component analysis (PCA) was used to classify the species of the six moss samples. Cluster analysis combined with PCA was able to identify different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.
Diversity in Older Adults' Use of the Internet: Identifying Subgroups Through Latent Class Analysis.
van Boekel, Leonieke C; Peek, Sebastiaan Tm; Luijkx, Katrien G
2017-05-24
As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults' use of the Internet has mainly focused on users versus nonusers and on the consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may imply that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can influence the benefits the Internet has for them. The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and determine whether diversity is related to social or health-related variables. We used data from a nationally representative Internet panel in the Netherlands. Panel members aged 65 years and older who had access to and used the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported spending time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Four clusters were distinguished. Cluster 1 was labeled as the "practical users" (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the "minimizers" (32.23%, n=457), reported the lowest frequency of most Internet activities, were older (mean age 73 years), and spent the least time on the Internet. Cluster 3 was labeled as the "maximizers" (17.77%, n=252); these respondents used the Internet for various activities, spent the most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the "social users," mainly used the Internet for social and leisure-related activities such as gaming and social network sites. 
The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance of looking beyond use versus nonuse when studying older adults' Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth interventions to specific segments of the older population. ©Leonieke C van Boekel, Sebastiaan TM Peek, Katrien G Luijkx. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 24.05.2017.
Diversity in Older Adults’ Use of the Internet: Identifying Subgroups Through Latent Class Analysis
van Boekel, Leonieke C; Peek, Sebastiaan TM; Luijkx, Katrien G
2017-01-01
Background As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults’ use of the Internet has mainly focused on users versus nonusers and on the consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may imply that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can influence the benefits the Internet has for them. Objective The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and determine whether diversity is related to social or health-related variables. Methods We used data from a nationally representative Internet panel in the Netherlands. Panel members aged 65 years and older who had access to and used the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported spending time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Results Four clusters were distinguished. Cluster 1 was labeled as the “practical users” (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the “minimizers” (32.23%, n=457), reported the lowest frequency of most Internet activities, were older (mean age 73 years), and spent the least time on the Internet. Cluster 3 was labeled as the “maximizers” (17.77%, n=252); these respondents used the Internet for various activities, spent the most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the “social users,” mainly used the Internet for social and leisure-related activities such as gaming and social network sites. 
The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Conclusions Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance of looking beyond use versus nonuse when studying older adults’ Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth interventions to specific segments of the older population. PMID:28539302
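The omega-squared effect sizes reported alongside the ANOVA results can be reproduced from group data with the standard one-way formula, (SS_between - (k - 1) * MS_within) / (SS_total + MS_within). The sketch below is a generic illustration on hypothetical scores; it is not the authors' analysis code and does not use the panel data.

```python
def omega_squared(groups):
    """Omega-squared effect size for a one-way ANOVA over `groups`."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_within = ss_within / (n - k)
    ss_total = ss_between + ss_within
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


# Hypothetical scores for three small groups, not the panel data.
effect = omega_squared([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Unlike a P value, this quantity does not grow with sample size, which is why the abstract can report significant differences whose effect sizes are nonetheless small.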
A simple algorithm for the identification of clinical COPD phenotypes.
Burgel, Pierre-Régis; Paillasseur, Jean-Louis; Janssens, Wim; Piquet, Jacques; Ter Riet, Gerben; Garcia-Aymerich, Judith; Cosio, Borja; Bakke, Per; Puhan, Milo A; Langhammer, Arnulf; Alfageme, Inmaculada; Almagro, Pere; Ancochea, Julio; Celli, Bartolome R; Casanova, Ciro; de-Torres, Juan P; Decramer, Marc; Echazarreta, Andrés; Esteban, Cristobal; Gomez Punter, Rosa Mar; Han, MeiLan K; Johannessen, Ane; Kaiser, Bernhard; Lamprecht, Bernd; Lange, Peter; Leivseth, Linda; Marin, Jose M; Martin, Francis; Martinez-Camblor, Pablo; Miravitlles, Marc; Oga, Toru; Sofia Ramírez, Ana; Sin, Don D; Sobradillo, Patricia; Soler-Cataluña, Juan J; Turner, Alice M; Verdu Rivera, Francisco Javier; Soriano, Joan B; Roche, Nicolas
2017-11-01
This study aimed to identify simple rules for allocating chronic obstructive pulmonary disease (COPD) patients to clinical phenotypes identified by cluster analyses. Data from 2409 COPD patients of French/Belgian COPD cohorts were analysed using cluster analysis resulting in the identification of subgroups, for which clinical relevance was determined by comparing 3-year all-cause mortality. Classification and regression trees (CARTs) were used to develop an algorithm for allocating patients to these subgroups. This algorithm was tested in 3651 patients from the COPD Cohorts Collaborative International Assessment (3CIA) initiative. Cluster analysis identified five subgroups of COPD patients with different clinical characteristics (especially regarding severity of respiratory disease and the presence of cardiovascular comorbidities and diabetes). The CART-based algorithm indicated that the variables relevant for patient grouping differed markedly between patients with isolated respiratory disease (FEV1, dyspnoea grade) and those with multi-morbidity (dyspnoea grade, age, FEV1 and body mass index). Application of this algorithm to the 3CIA cohorts confirmed that it identified subgroups of patients with different clinical characteristics, mortality rates (median, from 4% to 27%) and age at death (median, from 68 to 76 years). A simple algorithm, integrating respiratory characteristics and comorbidities, allowed the identification of clinically relevant COPD phenotypes. Copyright ©ERS 2017.
Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions
2012-01-01
Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163
Jentsch, Franziska; Allen, Jennifer; Fuchs, Judith; von der Lippe, Elena
2017-04-04
Modifiable health risk factors (MHRFs) significantly affect morbidity and mortality rates and frequently occur in specific combinations or risk clusters. Using five MHRFs (smoking, high-risk alcohol consumption, physical inactivity, low intake of fruits and vegetables, and obesity) this study investigates the extent to which risk clusters are observed in a representative sample of women aged 65 and older in Germany. Additionally, the structural composition of the clusters is systematically compared with data and findings from other countries. A pooled data set of Germany's representative cross-sectional surveys GEDA09 and GEDA10 was used. The cohort comprised 4,617 women aged 65 and older. Specific risk clusters based on five MHRFs are identified, using hierarchical cluster analysis. The MHRFs were defined as current smoking (daily or occasionally), risk alcohol consumption (according to the Alcohol Use Disorders Identification Test, a sum score of 4 or more points), physical inactivity (less active than 5 days per week for at least 30 min and lack of sports-related activity in the last three months), low intake of fruits and vegetables (less than one serving of fruits and one of vegetables per day), and obesity (a body mass index equal to or greater than 30). A total of 4,292 cases with full information on these factors are included in the cluster analysis. Extended analyses were also performed to include the number of chronic diseases by age and socioeconomic status of group members. A total of seven risk clusters were identified. In a comparison with data from international studies, the seven risk clusters were found to be stable with a high degree of structural equivalency. Evidence of the stability of risk clusters across various study populations provides a useful starting point for long-term targeted health interventions. The structural clusters provide information through which various MHRFs can be evaluated simultaneously.
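The five risk-factor definitions given in this abstract translate into binary indicators, which is the kind of input a hierarchical cluster analysis of risk profiles would use. The sketch below is an illustration only: the function name, parameter names, and input coding are hypothetical, and the fruit-and-vegetable rule follows one assumed reading of the definition (failing to reach one daily serving of either).

```python
def risk_profile(smoking, audit_score, active_days, recent_sport,
                 fruit_servings, veg_servings, bmi):
    """Return the five modifiable health risk factors as booleans."""
    return {
        # Current smoking, daily or occasional.
        "smoking": smoking in ("daily", "occasionally"),
        # Risk alcohol consumption: sum score of 4 or more points.
        "alcohol": audit_score >= 4,
        # Active on fewer than 5 days per week AND no sport in the
        # last three months.
        "inactivity": active_days < 5 and not recent_sport,
        # Assumed reading: below one daily serving of fruit or of vegetables.
        "low_fruit_veg": fruit_servings < 1 or veg_servings < 1,
        # Obesity: body mass index of 30 or more.
        "obesity": bmi >= 30,
    }


# Hypothetical respondent, not taken from GEDA09 or GEDA10.
profile = risk_profile("never", 2, 3, False, 2, 0.5, 31.0)
n_risks = sum(profile.values())
```

Respondents with any missing input would be dropped before clustering, which matches the reduction from 4,617 women to 4,292 complete cases reported above.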
Kindschuh, Sarah R.; Cain, James W.; Daniel, David; Peyton, Mark A.
2016-01-01
The capacity to describe and quantify predation by large carnivores expanded considerably with the advent of GPS technology. Analyzing clusters of GPS locations formed by carnivores facilitates the detection of predation events by identifying characteristics which distinguish predation sites. We present a performance assessment of GPS cluster analysis as applied to the predation and scavenging of an omnivore, the American black bear (Ursus americanus), on ungulate prey and carrion. Through field investigations of 6854 GPS locations from 24 individual bears, we identified 54 sites where black bears formed a cluster of locations while predating or scavenging elk (Cervus elaphus), mule deer (Odocoileus hemionus), or cattle (Bos spp.). We developed models for three data sets to predict whether a GPS cluster was formed at a carnivory site vs. a non-carnivory site (e.g., bed sites or non-ungulate foraging sites). Two full-season data sets contained GPS locations logged at either 3-h or 30-min intervals from April to November, and a third data set contained 30-min interval data from April through July corresponding to the calving period for elk. Longer fix intervals resulted in the detection of fewer carnivory sites. Clusters were more likely to be carnivory sites if they occurred in open or edge habitats, if they occurred in the early season, if the mean distance between all pairs of GPS locations within the cluster was less, and if the cluster endured for a longer period of time. Clusters were less likely to be carnivory sites if they were initiated in the morning or night compared to the day. The top models for each data set performed well and successfully predicted 71–96% of field-verified carnivory events, 55–75% of non–carnivory events, and 58–76% of clusters overall. Refinement of this method will benefit from further application across species and ecological systems.
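One of the predictors named in this abstract, the mean distance between all pairs of GPS locations within a cluster, is straightforward to compute. The sketch below is illustrative only: the function names are hypothetical, and the use of the haversine great-circle formula with a mean Earth radius of 6371 km is an assumption, since the abstract does not say how distances were measured.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_pairwise_distance_km(points):
    """Mean distance over all unordered pairs; 0.0 if fewer than two points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(haversine_km(p, q) for p, q in pairs) / len(pairs)
```

A tighter cluster yields a smaller value, which the abstract associates with a higher likelihood of a carnivory site.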
Cluster analysis of obesity and asthma phenotypes.
Sutherland, E Rand; Goleva, Elena; King, Tonya S; Lehman, Erik; Stevens, Allen D; Jackson, Leisa P; Stream, Amanda R; Fahy, John V; Leung, Donald Y M
2012-01-01
Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC). Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized a hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype. In a cohort of clinical trial participants (n = 250), minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα) and induction of MAP kinase phosphatase-1 (MKP-1) expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m(2)) and severity of asthma symptoms (AEQ score) the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively). Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ) and control (ACQ), exhaled nitric oxide concentration (F(E)NO) and airway hyperresponsiveness (methacholine PC(20)) but were similar with regard to measures of lung function (FEV(1) (%) and FEV(1)/FVC), airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP). Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasone. Obesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals. Reduced expression of the dominant functional isoform of the GCR may mediate GC insensitivity in obese asthmatics.
Shahar, Tal; Granit, Avital; Zrihan, Daniel; Canello, Tamar; Charbit, Hanna; Einstein, Ofira; Rozovski, Uri; Elgavish, Sharona; Ram, Zvi; Siegal, Tali; Lavon, Iris
2016-12-01
The 54 microRNAs (miRNAs) within the DLK-DIO3 genomic region on chromosome 14q32.31 (cluster-14-miRNAs) are organized into sub-clusters 14A and 14B. These miRNAs are downregulated in glioblastomas and might have a tumor suppressive role. Any association between the expression levels of cluster-14-miRNAs with overall survival (OS) is undetermined. We randomly selected miR-433, belonging to sub-cluster 14A and miR-323a-3p and miR-369-3p, belonging to sub-cluster 14B, and assessed their role in glioblastomas in vitro and in vivo. We also determined the expression level of cluster-14-miRNAs in 27 patients with newly diagnosed glioblastoma, and analyzed the association between their level of expression and OS. Overexpression of miR-323a-3p and miR-369-3p, but not miR-433, in glioblastoma cells inhibited their proliferation and migration in vitro. Mice implanted with glioblastoma cells overexpressing miR-323a-3p and miR-369-3p, but not miR-433, exhibited prolonged survival compared to controls (P = .003). Bioinformatics analysis identified 13 putative target genes of cluster-14-miRNAs, and real-time RT-PCR validated these findings. Pathway analysis of the putative target genes identified neuregulin as the most enriched pathway. The expression level of cluster-14-miRNAs correlated with patients' OS. The median OS was 8.5 months for patients with low expression levels and 52.7 months for patients with high expression levels (HR 0.34; 95% CI 0.12-0.59, P = .003). The expression level of cluster-14-miRNAs correlates directly with OS, suggesting a role for this cluster in promoting aggressive behavior of glioblastoma, possibly through ErbB/neuregulin signaling.
Modica, Maddalena; Carabalona, Roberta; Spezzaferri, Rosa; Tavanelli, Monica; Torri, A; Ripamonti, Vittorino; Castiglioni, Paolo; De Maria, Renata; Ferratini, Maurizio
2012-03-01
To evaluate the psychological characteristics of coronary heart disease (CHD) patients after coronary artery bypass grafting (CABG) by cluster analysis of Minnesota Multiphasic Personality Inventory (MMPI-2) questionnaires and to assess the impact of the profiles obtained on long-term outcome. 229 CHD patients admitted to cardiac rehabilitation filled in self-administered MMPI-2 questionnaires early after CABG. We assessed the relation between MMPI-2 profiles derived by cluster analysis, clinical characteristics and outcome at 3-year follow-up. Among the 215 patients (76% men, median age 66 years) with valid criteria in control scales, we identified 3 clusters (G) with homogenous psychological characteristics: G1 patients (N = 75) presented somatoform complaints but overall minimal psychological distress. G2 patients (N=72) presented type D personality traits. G3 subjects (N=68) showed a trend to cynicism, mild increases in anger, social introversion and hostility. Clusters overlapped for clinical characteristics such as smoking (G1 21%, G2 24%, G3 24%, p ns), previous myocardial infarction (G1 43%, G2 47%, G3 49% p ns), LV ejection fraction (G1 60 [51-60]; G2 58 [49-60]; G3 60 [55-60], p ns), 3-vessel-disease prevalence (G1 69%, G2 65%, G3 71%, p ns). Three-year event rates were comparable (G1 15%; G2 18%; G3 15%) and Kaplan-Meier curves overlapped among clusters (p ns). After CABG, the interpretation of MMPI-2 by cluster analysis is useful for the psychological and personological diagnosis to direct psychological assistance. Conversely, results from cluster analysis of MMPI-2 do not seem helpful to the clinician to predict long term outcome.
Banelli, Barbara; Brigati, Claudio; Di Vinci, Angela; Casciano, Ida; Forlani, Alessandra; Borzì, Luana; Allemanni, Giorgio; Romani, Massimo
2012-03-01
Epigenetic alterations are hallmarks of cancer and powerful biomarkers, whose clinical utilization is made difficult by the absence of standardization and of common methods of data interpretation. The coordinate methylation of many loci in cancer is defined as 'CpG island methylator phenotype' (CIMP) and identifies clinically distinct groups of patients. In neuroblastoma (NB), CIMP is defined by a methylation signature, which includes different loci, but its predictive power on outcome is entirely recapitulated by the PCDHB cluster only. We have developed a robust and cost-effective pyrosequencing-based assay that could facilitate the clinical application of CIMP in NB. This assay permits the unbiased simultaneous amplification and sequencing of 17 out of 19 genes of the PCDHB cluster for quantitative methylation analysis, taking into account all the sequence variations. As some of these variations were at CpG doublets, we bypassed the data interpretation conducted by the methylation analysis software to assign the corrected methylation value at these sites. The final result of the assay is the mean methylation level of 17 gene fragments in the protocadherin B (PCDHB) cluster. We have utilized this assay to compare the methylation levels of the PCDHB cluster between high-risk and very low-risk NB patients, confirming the predictive value of CIMP. Our results demonstrate that the pyrosequencing-based assay described here is a powerful instrument for the analysis of this gene cluster that may simplify the data comparison between different laboratories and, in the future, could facilitate its clinical application. Furthermore, our results demonstrate that, in principle, pyrosequencing can be efficiently utilized for the methylation analysis of gene clusters with high internal homologies.
Dental Laboratory Career Ladder (AFSC 4Y1X1)
1994-08-01
analysis identified one job cluster and seven jobs: Base Dental Lab cluster, Orthodontic Appliance Fabricator job, Fixed Restoration Fabricator job, Crown...reline and repair, removable partial denture construction, crown and fixed partial denture construction, fabrication of orthodontic appliances, and...specialized prostheses. Preventive maintenance and safety precautions for dental laboratory equipment are also stressed . Entry into the career ladder
The Application of Clustering Techniques to Citation Data. Research Reports Series B No. 6.
ERIC Educational Resources Information Center
Arms, William Y.; Arms, Caroline
This report describes research carried out as part of the Design of Information Systems in the Social Sciences (DISISS) project. Cluster analysis techniques were applied to a machine readable file of bibliographic data in the form of cited journal titles in order to identify groupings which could be used to structure bibliographic files. Practical…
A Typology of Teacher-Rated Child Behavior: Revisiting Subgroups over 10 Years Later
ERIC Educational Resources Information Center
DiStefano, Christine A.; Kamphaus, Randy W.; Mindrila, Diana L.
2010-01-01
The purpose of this article was to examine a typology of child behavior using the Behavioral Assessment System for Children, Teacher Rating Scale (BASC TRS-C, 2nd edition; Reynolds & Kamphaus, 2004). The typology was compared with the solution identified from the 1992 BASC TRS-C norm dataset. Using cluster analysis, a seven-cluster solution…
Advanced Solar Observatory (ASO) accommodations requirements study
NASA Technical Reports Server (NTRS)
1989-01-01
Results of an accommodations analysis for the Advanced Solar Observatory on Space Station Freedom are reported. Concepts for the High Resolution Telescope Cluster, Pinhole/Occulter Facility, and High Energy Cluster were developed which can be accommodated on Space Station Freedom. It is shown that workable accommodations concepts are possible. Areas of emphasis for the next stage of engineering development are identified.
Patterns of psychological responses in parents of children that underwent stem cell transplantation.
Riva, Roberto; Forinder, Ulla; Arvidson, Johan; Mellgren, Karin; Toporski, Jacek; Winiarski, Jacek; Norberg, Annika Lindahl
2014-11-01
Hematopoietic stem cell transplantation (HSCT) is curative in several life-threatening pediatric diseases but may affect children and their families inducing depression, anxiety, burnout symptoms, and post-traumatic stress symptoms, as well as post-traumatic growth (PTG). The aim of this study was to investigate the co-occurrence of different aspects of such responses in parents of children that had undergone HSCT. Questionnaires were completed by 260 parents (146 mothers and 114 fathers) 11-198 months after HSCT: the Hospital Anxiety and Depression Scale, the Shirom-Melamed Burnout Questionnaire, the post-traumatic stress disorders checklist, civilian version, and the PTG inventory. Additional variables were also investigated: perceived support, time elapsed since HSCT, job stress, partner-relationship satisfaction, trauma appraisal, and the child's health problems. A hierarchical cluster analysis and a k-means cluster analysis were used to identify patterns of psychological responses. Four clusters of parents with different psychological responses were identified. One cluster (n = 40) significantly differed from the other groups and reported levels of depression, anxiety, burnout symptoms, and post-traumatic stress symptoms above the cut-off. In contrast, another cluster (n = 66) reported higher levels of PTG than the other groups did. This study shows a subgroup of parents maintaining high levels of several aspects of distress years after HSCT. Differences between clusters might be explained by differences in perceived support, the child's health problems, job stress, and partner-relationship satisfaction. Copyright © 2014 John Wiley & Sons, Ltd.
M Weerasekera, Manjula; H Sissons, Chris; Wong, Lisa; A Anderson, Sally; R Holmes, Ann; D Cannon, Richard
2017-10-01
The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeast present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean±SD=29.3±4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive with concentrations up to 10^3 cfu/mL. Candida albicans was the predominant species in saliva samples although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Demographic but not geographic insularity in HIV transmission among young black MSM.
Oster, Alexandra M; Pieniazek, Danuta; Zhang, Xinjian; Switzer, William M; Ziebell, Rebecca A; Mena, Leandro A; Wei, Xierong; Johnson, Kendra L; Singh, Sonita K; Thomas, Peter E; Elmore, Kimberlee A; Heffelfinger, James D
2011-11-13
To understand patterns of HIV transmission among young black MSM and others in Mississippi. Phylogenetic analysis of HIV-1 polymerase (pol) sequences from 799 antiretroviral-naive persons newly diagnosed with HIV infection in Mississippi during 2005-2008, 130 (16%) of whom were black MSM aged 16-25 years. We identified phylogenetic clusters and used surveillance data to evaluate demographic attributes and risk factors of all persons in clusters that included black MSM aged 16-25 years. We identified 82 phylogenetic clusters, 21 (26%) of which included HIV strains from at least one young black MSM. Of the 69 persons in these clusters, 59 were black MSM and seven were black men with unknown transmission category; the remaining three were MSM of white or Hispanic race/ethnicity. Of these 21 clusters, 10 included residents of one geographic region of Mississippi, whereas 11 included residents of multiple regions or outside of the state. Phylogenetic clusters involving HIV-infected young black MSM were homogeneous with respect to demographic and risk characteristics, suggesting insularity of this population with respect to HIV transmission, but were geographically heterogeneous. Reducing HIV transmission among young black MSM in Mississippi may require prevention strategies that are tailored to young black MSM and those in their sexual networks, and prevention interventions should be delivered in a manner to reach young black MSM throughout the state. Phylogenetic analysis can be a tool for local jurisdictions to understand the transmission dynamics in their areas.
Polymorphisms and linkage analysis for ICAM-1 and the selectin gene cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vora, D.K.; Rosenbloom, C.L.; Cottingham, R.W.
1994-06-01
Genetic polymorphisms in leukocyte and endothelial cell adhesion molecules may be important variables with regard to susceptibility to multifactorial disease processes that include an inflammatory component. For this reason, polymorphisms were sought for intercellular adhesion molecule-1 (ICAM-1; gene symbol ICAM1) and for the three genes in the selectin cluster, P-selectin, L-selectin, and E-selectin (gene symbols SELP, SELL, and SELE, respectively). Two amino acid polymorphisms were identified for ICAM-1: Gly or Arg at codon 241 and Lys or Glu at codon 469. Dinucleotide repeat polymorphisms were identified in the 3′-untranslated region for ICAM-1 and in intron 9 for P-selectin. Restriction fragment length polymorphisms were found using cDNAs for each of the three selectin genes as probes: E-selectin with BglII, P-selectin with ScaI, and L-selectin with HincII. Linkage analysis was performed for the selectin gene cluster and for ICAM-1 using the CEPH families; ICAM-1 is very tightly linked to the LDL receptor on chromosome 19, and the selectin cluster is linked to markers at chromosome 1q23. 41 refs., 2 tabs.
Vasylkivska, Veronika S.; Huerta, Nicolas J.
2017-06-24
Determining the spatiotemporal characteristics of natural and induced seismic events holds the opportunity to gain new insights into why these events occur. Linking the seismicity characteristics with other geologic, geographic, natural, or anthropogenic factors could help to identify the causes and suggest mitigation strategies that reduce the risk associated with such events. The nearest-neighbor approach utilized in this work represents a practical first step toward identifying statistically correlated clusters of recorded earthquake events. Detailed study of the Oklahoma earthquake catalog's inherent errors, empirical model parameters, and model assumptions is presented. We found that the cluster analysis results are stable with respect to empirical parameters (e.g., fractal dimension) but were sensitive to epicenter location errors and seismicity rates. Most critically, we show that the patterns in the distribution of earthquake clusters in Oklahoma are primarily defined by spatial relationships between events. This observation is a stark contrast to California (also known for induced seismicity) where a comparable cluster distribution is defined by both spatial and temporal interactions between events. These results highlight the difficulty in understanding the mechanisms and behavior of induced seismicity but provide insights for future work.
Spatial analysis of malaria in Anhui province, China
Zhang, Wenyi; Wang, Liping; Fang, Liqun; Ma, Jiaqi; Xu, Youfu; Jiang, Jiafu; Hui, Fengming; Wang, Jianjun; Liang, Song; Yang, Hong; Cao, Wuchun
2008-01-01
Background Malaria has re-emerged in Anhui Province, China, and this province was the most seriously affected by malaria during 2005–2006. It is necessary to understand the spatial distribution of malaria cases and to identify highly endemic areas for future public health planning and resource allocation in Anhui Province. Methods The annual average incidence at the county level was calculated using malaria cases reported between 2000 and 2006 in Anhui Province. GIS-based spatial analyses were conducted to detect spatial distribution and clustering of malaria incidence at the county level. Results The spatial distribution of malaria cases in Anhui Province from 2000 to 2006 was mapped at the county level to show crude incidence, excess hazard and spatial smoothed incidence. Spatial cluster analysis suggested 10 and 24 counties were at increased risk for malaria (P < 0.001) with the maximum spatial cluster sizes at < 50% and < 25% of the total population, respectively. Conclusion The application of GIS, together with spatial statistical techniques, provides a means to quantify explicit malaria risks and to further identify environmental factors responsible for the re-emerged malaria risks. Future public health planning and resource allocation in Anhui Province should be focused on the maximum spatial cluster region. PMID:18847489