Stopka, Thomas J; Goulart, Michael A; Meyers, David J; Hutcheson, Marga; Barton, Kerri; Onofrey, Shauna; Church, Daniel; Donahue, Ashley; Chui, Kenneth K H
2017-04-20
Hepatitis C virus (HCV) infections have increased during the past decade but little is known about geographic clustering patterns. We used a unique analytical approach, combining geographic information systems (GIS), spatial epidemiology, and statistical modeling to identify and characterize HCV hotspots, statistically significant clusters of census tracts with elevated HCV counts and rates. We compiled sociodemographic and HCV surveillance data (n = 99,780 cases) for Massachusetts census tracts (n = 1464) from 2002 to 2013. We used a five-step spatial epidemiological approach, calculating incremental spatial autocorrelations and Getis-Ord Gi* statistics to identify clusters. We conducted logistic regression analyses to determine factors associated with the HCV hotspots. We identified nine HCV clusters, with the largest in Boston, New Bedford/Fall River, Worcester, and Springfield (p < 0.05). In multivariable analyses, we found that HCV hotspots were independently and positively associated with the percent of the population that was Hispanic (adjusted odds ratio [AOR]: 1.07; 95% confidence interval [CI]: 1.04, 1.09) and the percent of households receiving food stamps (AOR: 1.83; 95% CI: 1.22, 2.74). HCV hotspots were independently and negatively associated with the percent of the population that were high school graduates or higher (AOR: 0.91; 95% CI: 0.89, 0.93) and the percent of the population in the "other" race/ethnicity category (AOR: 0.88; 95% CI: 0.85, 0.91). We identified locations where HCV clusters were a concern, and where enhanced HCV prevention, treatment, and care can help combat the HCV epidemic in Massachusetts. GIS, spatial epidemiological and statistical analyses provided a rigorous approach to identify hotspot clusters of disease, which can inform public health policy and intervention targeting. 
Further studies that incorporate spatiotemporal cluster analyses, Bayesian spatial and geostatistical models, spatially weighted regression analyses, and assessment of associations between HCV clustering and the built environment are needed to expand upon our combined spatial epidemiological and statistical methods.
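The Getis-Ord Gi* statistic named in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard Gi* z-score formula, not the authors' GIS workflow: the function name `getis_ord_gi_star`, the binary contiguity weights (each tract counts itself and its listed neighbors), and the toy chain of nine "tracts" are all assumptions made for the example.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """Return a Gi* z-score per location, using binary weights (assumed).

    values    -- one observed count or rate per location
    neighbors -- neighbors[i] lists the indices adjacent to location i
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation as used in the Gi* formula.
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        # Gi* (unlike Gi) includes the focal location itself.
        idx = set(neighbors[i]) | {i}
        w_sum = float(len(idx))  # binary weights: sum(w) == sum(w**2)
        local_sum = sum(values[j] for j in idx)
        numerator = local_sum - mean * w_sum
        # Undefined if s == 0 or if a location neighbors every other one.
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical data: a chain of nine tracts with elevated counts in the middle.
counts = [1, 1, 1, 10, 10, 10, 1, 1, 1]
chain = [[j for j in (i - 1, i + 1) if 0 <= j < len(counts)]
         for i in range(len(counts))]
z_scores = getis_ord_gi_star(counts, chain)
```

In this toy chain the middle tract receives a z-score of about 2.83, above the 1.96 threshold that corresponds to p < 0.05 for a two-sided test, while the end tracts score below zero. A real analysis would use a spatial weights matrix derived from tract geometry and would need to consider multiple-testing corrections.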
Is Technology-Mediated Parental Monitoring Related to Adolescent Substance Use?
Rudi, Jessie; Dworkin, Jodi
2018-01-03
Prevention researchers have identified parental monitoring leading to parental knowledge to be a protective factor against adolescent substance use. In today's digital society, parental monitoring can occur using technology-mediated communication methods, such as text messaging, email, and social networking sites. The current study aimed to identify patterns, or clusters, of in-person and technology-mediated monitoring behaviors, and examine differences between the patterns (clusters) in adolescent substance use. Cross-sectional survey data were collected from 289 parents of adolescents using Facebook and Amazon Mechanical Turk (MTurk). Cluster analyses were computed to identify patterns of in-person and technology-mediated monitoring behaviors, and chi-square analyses were computed to examine differences in substance use between the identified clusters. Three monitoring clusters were identified: a moderate in-person and moderate technology-mediated monitoring cluster (moderate-moderate), a high in-person and high technology-mediated monitoring cluster (high-high), and a high in-person and low technology-mediated monitoring cluster (high-low). Higher frequency of technology-mediated parental monitoring was not associated with lower levels of substance use. Results show that higher levels of technology-mediated parental monitoring may not be associated with adolescent substance use.
Deckersbach, Thilo; Peters, Amy T.; Sylvia, Louisa G.; Gold, Alexandra K.; da Silva Magalhaes, Pedro Vieira; Henry, David B.; Frank, Ellen; Otto, Michael W.; Berk, Michael; Dougherty, Darin D.; Nierenberg, Andrew A.; Miklowitz, David J.
2016-01-01
Background We sought to address how predictors and moderators of psychotherapy for bipolar depression – identified individually in prior analyses – can inform the development of a metric for prospectively classifying treatment outcome in intensive psychotherapy (IP) versus collaborative care (CC) adjunctive to pharmacotherapy in the Systematic Treatment Enhancement Program (STEP-BD) study. Methods We conducted post-hoc analyses on 135 STEP-BD participants using cluster analysis to identify subsets of participants with similar clinical profiles and investigated this combined metric as a moderator and predictor of response to IP. We used agglomerative hierarchical cluster analyses and k-means clustering to determine the content of the clinical profiles. Logistic regression and Cox proportional hazard models were used to evaluate whether the resulting clusters predicted or moderated likelihood of recovery or time until recovery. Results The cluster analysis yielded a two-cluster solution: 1) “less-recurrent/severe” and 2) “chronic/recurrent.” Rates of recovery in IP were similar for less-recurrent/severe and chronic/recurrent participants. Less-recurrent/severe patients were more likely than chronic/recurrent patients to achieve recovery in CC (p = .040, OR = 4.56). IP yielded a faster recovery for chronic/recurrent participants, whereas CC led to recovery sooner in the less-recurrent/severe cluster (p = .034, OR = 2.62). Limitations Cluster analyses require list-wise deletion of cases with missing data so we were unable to conduct analyses on all STEP-BD participants. Conclusions A well-powered, parametric approach can distinguish patients based on illness history and provide clinicians with symptom profiles of patients that confer differential prognosis in CC vs. IP. PMID:27289316
Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer.
Robertson, A Gordon; Kim, Jaegil; Al-Ahmadie, Hikmat; Bellmunt, Joaquim; Guo, Guangwu; Cherniack, Andrew D; Hinoue, Toshinori; Laird, Peter W; Hoadley, Katherine A; Akbani, Rehan; Castro, Mauro A A; Gibb, Ewan A; Kanchi, Rupa S; Gordenin, Dmitry A; Shukla, Sachet A; Sanchez-Vega, Francisco; Hansel, Donna E; Czerniak, Bogdan A; Reuter, Victor E; Su, Xiaoping; de Sa Carvalho, Benilton; Chagas, Vinicius S; Mungall, Karen L; Sadeghi, Sara; Pedamallu, Chandra Sekhar; Lu, Yiling; Klimczak, Leszek J; Zhang, Jiexin; Choo, Caleb; Ojesina, Akinyemi I; Bullman, Susan; Leraas, Kristen M; Lichtenberg, Tara M; Wu, Catherine J; Schultz, Nicholaus; Getz, Gad; Meyerson, Matthew; Mills, Gordon B; McConkey, David J; Weinstein, John N; Kwiatkowski, David J; Lerner, Seth P
2017-10-19
We report a comprehensive analysis of 412 muscle-invasive bladder cancers characterized by multiple TCGA analytical platforms. Fifty-eight genes were significantly mutated, and the overall mutational load was associated with APOBEC-signature mutagenesis. Clustering by mutation signature identified a high-mutation subset with 75% 5-year survival. mRNA expression clustering refined prior clustering analyses and identified a poor-survival "neuronal" subtype in which the majority of tumors lacked small cell or neuroendocrine histology. Clustering by mRNA, long non-coding RNA (lncRNA), and miRNA expression converged to identify subsets with differential epithelial-mesenchymal transition status, carcinoma in situ scores, histologic features, and survival. Our analyses identified 5 expression subtypes that may stratify response to different treatments. Copyright © 2017 Elsevier Inc. All rights reserved.
Identifying sighting clusters of endangered taxa with historical records.
Duffy, Karl J
2011-04-01
The probability and time of extinction of taxa is often inferred from statistical analyses of historical records. Many of these analyses require the exclusion of multiple records within a unit of time (i.e., a month or a year). Nevertheless, spatially explicit, temporally aggregated data may be useful for identifying clusters of sightings (i.e., sighting clusters) in space and time. Identification of sighting clusters highlights changes in the historical recording of endangered taxa. I used two methods to identify sighting clusters in historical records: the Ederer-Myers-Mantel (EMM) test and the space-time permutation scan (STPS). I applied these methods to the spatially explicit sighting records of three species of orchids that are listed as endangered in the Republic of Ireland under the Wildlife Act (1976): Cephalanthera longifolia, Hammarbya paludosa, and Pseudorchis albida. Results with the EMM test were strongly affected by the choice of the time interval, and thus the number of temporal samples, used to examine the records. For example, sightings of P. albida clustered when the records were partitioned into 20-year temporal samples, but not when they were partitioned into 22-year temporal samples. Because the statistical power of EMM was low, it will not be useful when data are sparse. Nevertheless, the STPS identified regions that contained sighting clusters because it uses a flexible scanning window (defined by cylinders of varying size that move over the study area and evaluate the likelihood of clustering) to detect them, and it identified regions with high and regions with low rates of orchid sightings. The STPS analyses can be used to detect sighting clusters of endangered species that may be related to regions of extirpation and may assist in the categorization of threat status. ©2010 Society for Conservation Biology.
Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael
2018-04-27
Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.
Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.
Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc
2018-01-01
In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.
Kopelman, Naama M; Mayzel, Jonathan; Jakobsson, Mattias; Rosenberg, Noah A; Mayrose, Itay
2015-09-01
The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modelling assumptions, compares results across different predetermined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the postprocessing of results of model-based population structure analyses. For analysing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology. © 2015 John Wiley & Sons Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neupane, Ghanashyam; McLing, Travis; Mattson, Earl
The presented database includes water chemistry data and structural rating values for various geothermal features used for performing principal component (PC) and cluster analyses work to identify promising KGRAs and IHRAs in southern Idaho and southeastern Oregon. A brief note on various KGRAs/IHRAs is also included herewith. Results of PC and cluster analyses are presented as a separate paper (Lindsey et al., 2017) that is, as of the time of this submission, in 'revision' status.
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.
van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim
2017-01-01
In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed a poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
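The "silhouette measure" reported in the abstract above summarizes how much closer each observation lies to its own cluster than to the nearest other cluster. The sketch below shows one common way to compute the mean silhouette with Euclidean distances; the function name `mean_silhouette` and the toy one-dimensional data are assumptions for illustration, and the study itself may have used different software and a different distance measure.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette width over all points (requires at least two clusters)."""

    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Convention: a singleton cluster contributes a silhouette of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical, clearly separated groups on a line.
well_separated = mean_silhouette(
    [(0.0,), (1.0,), (10.0,), (11.0,)], [0, 0, 1, 1]
)
```

For these well-separated toy points the mean silhouette is about 0.90. Values near 0.2, as in the abstract, arise when clusters overlap heavily, which is consistent with the authors' reading that their patients form a continuum.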
Jabson, Jennifer M.; Bowen, Deborah; Weinberg, Janice; Kroenke, Candyce; Luo, Juhua; Messina, Catherine; Shumaker, Sally; Tindle, Hilary A.
2016-01-01
BACKGROUND Strategies for identifying the most relevant psychosocial predictors in studies of racial/ethnic minority women’s health are limited because they largely exclude cultural influences and they assume that psychosocial predictors are independent. This paper proposes and tests an empirical solution. METHODS Hierarchical cluster analysis, conducted with data from 140,652 Women’s Health Initiative participants, identified clusters among individual psychosocial predictors. Multivariable analyses tested associations between clusters and health outcomes. RESULTS A Social Cluster and a Stress Cluster were identified. The Social Cluster was positively associated with well-being and inversely associated with chronic disease index, and the Stress Cluster was inversely associated with well-being and positively associated with chronic disease index. As hypothesized, the magnitude of association between clusters and outcomes differed by race/ethnicity. CONCLUSIONS By identifying psychosocial clusters and their associations with health, we have taken an important step toward understanding how individual psychosocial predictors interrelate and how empirically formed Stress and Social clusters relate to health outcomes. This study has also demonstrated important insight about differences in associations between these psychosocial clusters and health among racial/ethnic minorities. These differences could signal the best pathways for intervention modification and tailoring. PMID:27279761
ERIC Educational Resources Information Center
DiStefano, Christine; Kamphaus, R. W.
2006-01-01
Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…
2009-01-01
Background Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. Results To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, as well as specific functional clusters delineated comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer) and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA- and redox protection, maintenance and protein recycling. Specific regulatory elements regulate tardigrade mRNA stability such as lox P DICE elements whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade specific adaption are rapidly identified by sequence and/or pattern search on the web-tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The work-bench offers nucleotide pattern analysis for promotor and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments including species-specific repositories of all analysed data. Conclusion Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences. PMID:19821996
Förster, Frank; Liang, Chunguang; Shkumatov, Alexander; Beisser, Daniela; Engelmann, Julia C; Schnölzer, Martina; Frohme, Marcus; Müller, Tobias; Schill, Ralph O; Dandekar, Thomas
2009-10-12
Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, as well as specific functional clusters delineated comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer) and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA- and redox protection, maintenance and protein recycling. Specific regulatory elements regulate tardigrade mRNA stability such as lox P DICE elements whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade specific adaption are rapidly identified by sequence and/or pattern search on the web-tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The work-bench offers nucleotide pattern analysis for promotor and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments including species-specific repositories of all analysed data. Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences.
Density-based clustering analyses to identify heterogeneous cellular sub-populations
NASA Astrophysics Data System (ADS)
Heaster, Tiffany M.; Walsh, Alex J.; Landman, Bennett A.; Skala, Melissa C.
2017-02-01
Autofluorescence microscopy of NAD(P)H and FAD provides functional metabolic measurements at the single-cell level. Here, density-based clustering algorithms were applied to metabolic autofluorescence measurements to identify cell-level heterogeneity in tumor cell cultures. The performance of the density-based clustering algorithm, DENCLUE, was tested in samples with known heterogeneity (co-cultures of breast carcinoma lines). DENCLUE was found to better represent the distribution of cell clusters compared to Gaussian mixture modeling. Overall, DENCLUE is a promising approach to quantify cell-level heterogeneity, and could be used to understand single cell population dynamics in cancer progression and treatment.
Studt, Lena; Niehaus, Eva-Maria; Espino, Jose J.; Huß, Kathleen; Michielse, Caroline B.; Albermann, Sabine; Wagner, Dominik; Bergner, Sonja V.; Connolly, Lanelle R.; Fischer, Andreas; Reuter, Gunter; Kleigrewe, Karin; Bald, Till; Wingfield, Brenda D.; Ophir, Ron; Freeman, Stanley; Hippler, Michael; Smith, Kristina M.; Brown, Daren W.; Proctor, Robert H.; Münsterkötter, Martin; Freitag, Michael; Humpf, Hans-Ulrich; Güldener, Ulrich; Tudzynski, Bettina
2013-01-01
The fungus Fusarium fujikuroi causes “bakanae” disease of rice due to its ability to produce gibberellins (GAs), but it is also known for producing harmful mycotoxins. However, the genetic capacity for the whole arsenal of natural compounds and their role in the fungus' interaction with rice remained unknown. Here, we present a high-quality genome sequence of F. fujikuroi that was assembled into 12 scaffolds corresponding to the 12 chromosomes described for the fungus. We used the genome sequence along with ChIP-seq, transcriptome, proteome, and HPLC-FTMS-based metabolome analyses to identify the potential secondary metabolite biosynthetic gene clusters and to examine their regulation in response to nitrogen availability and plant signals. The results indicate that expression of most but not all gene clusters correlate with proteome and ChIP-seq data. Comparison of the F. fujikuroi genome to those of six other fusaria revealed that only a small number of gene clusters are conserved among these species, thus providing new insights into the divergence of secondary metabolism in the genus Fusarium. Noteworthy, GA biosynthetic genes are present in some related species, but GA biosynthesis is limited to F. fujikuroi, suggesting that this provides a selective advantage during infection of the preferred host plant rice. Among the genome sequences analyzed, one cluster that includes a polyketide synthase gene (PKS19) and another that includes a non-ribosomal peptide synthetase gene (NRPS31) are unique to F. fujikuroi. The metabolites derived from these clusters were identified by HPLC-FTMS-based analyses of engineered F. fujikuroi strains overexpressing cluster genes. In planta expression studies suggest a specific role for the PKS19-derived product during rice infection. Thus, our results indicate that combined comparative genomics and genome-wide experimental analyses identified novel genes and secondary metabolites that contribute to the evolutionary success of F. fujikuroi as a rice pathogen. PMID:23825955
Clustering of self-organizing map identifies five distinct medulloblastoma subgroups.
Cao, Changjun; Wang, Wei; Jiang, Pucha
2016-01-01
Medulloblastoma is one of the most malignant paediatric brain tumours. Molecular subgrouping of these medulloblastomas will not only help identify specific cohorts for certain treatments but also improve confidence in prognostic prediction. Currently, there is a consensus on the existence of four distinct subtypes of medulloblastoma. We proposed a novel bioinformatics method, clustering of self-organizing map, to determine the subgroups and their molecular diversity. Microarray expression profiles of 46 medulloblastoma samples were analysed and five clusters with distinct demographics, clinical outcome and transcriptional profiles were identified. The previously reported Wnt subgroup was identified as expected. Three other novel subgroups were proposed for later investigation. Our findings underscore the value of SOM clustering for discovering the medulloblastoma subgroups. When the suggested subdivision has been confirmed in large cohorts, this method should serve as a part of routine classification of clinical samples.
Rayward, Anna T; Duncan, Mitch J; Brown, Wendy J; Plotnikoff, Ronald C; Burton, Nicola W
2017-08-01
This study aimed to identify how different patterns of physical activity, sleep duration and sleep quality cluster together, and to examine how the identified clusters differ in terms of socio-demographic and health characteristics. Participants were adults from Brisbane, Australia, aged 42-72 years who reported their physical activity, sleep duration, sleep quality, socio-demographic and health characteristics in 2011 (n=5854). Two-step Cluster Analyses were used to identify clusters. Cluster differences in socio-demographic and health characteristics were examined using chi square tests (p<0.05). Four clusters were identified: 'Poor Sleepers' (31.2%), 'Moderate Sleepers' (30.7%), 'Mixed Sleepers/Highly Active' (20.5%), and 'Excellent Sleepers/Mixed Activity' (17.6%). The 'Poor Sleepers' cluster had the highest proportion of participants with less-than-recommended sleep duration and poor sleep quality, had the poorest health characteristics and a high proportion of participants with low physical activity. Physical activity, sleep duration and sleep quality cluster together in distinct patterns and clusters of poor behaviours are associated with poor health status. Multiple health behaviour change interventions which target both physical activity and sleep should be prioritised to improve health outcomes in mid-aged adults. Copyright © 2017 Elsevier B.V. All rights reserved.
Detection of protein complex from protein-protein interaction network using Markov clustering
NASA Astrophysics Data System (ADS)
Ochieng, P. J.; Kusuma, W. A.; Haryanto, T.
2017-05-01
Detection of complexes, or groups of functionally related proteins, is an important challenge when analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on the Markov clustering algorithm to identify protein complexes within highly interconnected protein-protein interaction networks. A protein-protein interaction network was first constructed to develop a geometrical network; the network was then partitioned using Markov clustering to detect protein complexes. The interest of the proposed method was illustrated by its application to human proteins associated with type II diabetes mellitus. Flow simulation of the MCL algorithm was initially performed and topological properties of the resultant network were analysed for detection of the protein complexes. The results indicated that the proposed method successfully detected a total of 34 complexes, with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed that MCL outperformed the AP, MCODE and SCPS algorithms, with a high clustering coefficient (0.751), network density and modularity index (0.630). This demonstrated that MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks.
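The flow simulation behind Markov clustering (MCL), as used in the abstract above, alternates two matrix steps: expansion (squaring the column-stochastic matrix, which spreads flow along longer paths) and inflation (raising entries to a power and renormalizing, which strengthens strong flows and starves weak ones). The pure-Python sketch below illustrates that loop on a toy graph; the function names, the inflation value of 2.0, the added self-loops, and the six-node example graph are assumptions for illustration rather than the settings used in the study.

```python
def normalize_columns(matrix):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def multiply(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Return clusters (sorted lists of node indices) found by an MCL sketch."""
    n = len(adjacency)
    # Self-loops are a common addition that stabilizes the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = multiply(m, m)                         # expansion
        inflated = normalize_columns(                     # inflation
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows that keep non-zero flow are attractors; their columns form clusters.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical network: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
graph = [[0] * 6 for _ in range(6)]
for u, v in edges:
    graph[u][v] = graph[v][u] = 1
complexes = markov_cluster(graph)
```

On this toy graph the sketch separates the two triangles, cutting the single bridging edge. Larger inflation values generally give finer partitions, so the parameter would need tuning on a real protein-protein interaction network, and a production analysis would use sparse matrices rather than nested lists.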
Is It Feasible to Identify Natural Clusters of TSC-Associated Neuropsychiatric Disorders (TAND)?
Leclezio, Loren; Gardner-Lubbe, Sugnet; de Vries, Petrus J
2018-04-01
Tuberous sclerosis complex (TSC) is a genetic disorder with multisystem involvement. The lifetime prevalence of TSC-Associated Neuropsychiatric Disorders (TAND) is in the region of 90% in an apparently unique, individual pattern. This "uniqueness" poses significant challenges for diagnosis, psycho-education, and intervention planning. To date, no studies have explored whether there may be natural clusters of TAND. The purpose of this feasibility study was (1) to investigate the practicability of identifying natural TAND clusters, and (2) to identify appropriate multivariate data analysis techniques for larger-scale studies. TAND Checklist data were collected from 56 individuals with a clinical diagnosis of TSC (n = 20 from South Africa; n = 36 from Australia). Using R, the open-source statistical platform, mean squared contingency coefficients were calculated to produce a correlation matrix, and various cluster analyses and exploratory factor analysis were examined. Ward's method rendered six TAND clusters with good face validity and significant convergence with a six-factor exploratory factor analysis solution. The "bottom-up" data-driven strategies identified a "scholastic" cluster of TAND manifestations, an "autism spectrum disorder-like" cluster, a "dysregulated behavior" cluster, a "neuropsychological" cluster, a "hyperactive/impulsive" cluster, and a "mixed/mood" cluster. These feasibility results suggest that a combination of cluster analysis and exploratory factor analysis methods may be able to identify clinically meaningful natural TAND clusters. Findings require replication and expansion in larger datasets, and could include quantification of cluster or factor scores at an individual level. Copyright © 2018 Elsevier Inc. All rights reserved.
Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji
2016-07-01
To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.
Response to "Comparison and Evaluation of Clustering Algorithms for Tandem Mass Spectra".
Griss, Johannes; Perez-Riverol, Yasset; The, Matthew; Käll, Lukas; Vizcaíno, Juan Antonio
2018-05-04
In the recent benchmarking article entitled "Comparison and Evaluation of Clustering Algorithms for Tandem Mass Spectra", Rieder et al. compared several different approaches to cluster MS/MS spectra. While we certainly recognize the value of the manuscript, here, we report some shortcomings detected in the original analyses. First, for most analyses, the authors clustered only single MS/MS runs. In one of the reported analyses, three MS/MS runs were processed together, which already led to computational performance issues in many of the tested approaches. This fact highlights the difficulties of using many of the tested algorithms on the average proteomics data sets produced nowadays. Second, the authors only processed identified spectra when merging MS runs. Thereby, all unidentified spectra that are of lower quality were already removed from the data set and could not influence the clustering results. Next, we found that the authors did not analyze the effect of chimeric spectra on the clustering results. In our analysis, we found that 3% of the spectra in the used data sets were chimeric, and this had marked effects on the behavior of the different clustering algorithms tested. Finally, the authors' choice to evaluate the MS-Cluster and spectra-cluster algorithms using a precursor tolerance of 5 Da for high-resolution Orbitrap data only was, in our opinion, not adequate to assess the performance of MS/MS clustering approaches.
Hendricks, Brian; Mark-Carew, Miguella
2017-02-01
Lyme disease is the most commonly reported vectorborne disease in the United States. The objective of our study was to identify patterns of Lyme disease reporting after multistate inclusion to mitigate potential border effects. County-level human Lyme disease surveillance data were obtained from Kentucky, Maryland, Ohio, Pennsylvania, Virginia, and West Virginia state health departments. Rate smoothing and Local Moran's I were performed to identify clusters of reporting activity and identify spatial outliers. A logistic generalized estimating equation was performed to identify significant associations in disease clustering over time. Resulting analyses identified statistically significant (P=0.05) clusters of high reporting activity and trends over time. High reporting activity aggregated near border counties in high incidence states, while low reporting aggregated near shared county borders in non-high incidence states. Findings highlight the need for exploratory surveillance approaches to describe the extent to which state level reporting affects accurate estimation of Lyme disease progression. Copyright © 2017 Elsevier Ltd. All rights reserved.
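Local Moran's I compares each area's deviation from the overall mean with the average deviation of its neighbours: positive values mark areas that resemble their neighbours (clusters), negative values mark spatial outliers. The following is a minimal sketch with row-standardized neighbour weights; the four-county toy layout and the function name are assumptions for illustration, and the significance testing by permutation that a real analysis needs is omitted.

```python
def local_morans_i(values, neighbors):
    """Local Moran's I per area, using row-standardized binary weights.

    values    -- one rate per area (must not all be identical)
    neighbors -- neighbors[i] lists the indices of the areas adjacent to i
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]           # deviations from the mean
    m2 = sum(d * d for d in z) / n           # second moment (variance)
    result = []
    for i in range(n):
        nbrs = neighbors[i]
        # Spatial lag: the mean deviation among the neighbours of area i.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        result.append(z[i] * lag / m2)
    return result


# Four hypothetical counties in a row: two high-rate, then two low-rate.
rates = [10, 10, 0, 0]
adjacency = [[1], [0, 2], [1, 3], [2]]
print(local_morans_i(rates, adjacency))
```

The two end counties score 1.0 (surrounded by similar values), while the two middle counties score 0.0 because their high and low neighbours cancel out.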
Intertumoral Heterogeneity within Medulloblastoma Subgroups.
Cavalli, Florence M G; Remke, Marc; Rampasek, Ladislav; Peacock, John; Shih, David J H; Luu, Betty; Garzia, Livia; Torchia, Jonathon; Nor, Carolina; Morrissy, A Sorana; Agnihotri, Sameer; Thompson, Yuan Yao; Kuzan-Fischer, Claudia M; Farooq, Hamza; Isaev, Keren; Daniels, Craig; Cho, Byung-Kyu; Kim, Seung-Ki; Wang, Kyu-Chang; Lee, Ji Yeoun; Grajkowska, Wieslawa A; Perek-Polnik, Marta; Vasiljevic, Alexandre; Faure-Conter, Cecile; Jouvet, Anne; Giannini, Caterina; Nageswara Rao, Amulya A; Li, Kay Ka Wai; Ng, Ho-Keung; Eberhart, Charles G; Pollack, Ian F; Hamilton, Ronald L; Gillespie, G Yancey; Olson, James M; Leary, Sarah; Weiss, William A; Lach, Boleslaw; Chambless, Lola B; Thompson, Reid C; Cooper, Michael K; Vibhakar, Rajeev; Hauser, Peter; van Veelen, Marie-Lise C; Kros, Johan M; French, Pim J; Ra, Young Shin; Kumabe, Toshihiro; López-Aguilar, Enrique; Zitterbart, Karel; Sterba, Jaroslav; Finocchiaro, Gaetano; Massimino, Maura; Van Meir, Erwin G; Osuka, Satoru; Shofuda, Tomoko; Klekner, Almos; Zollo, Massimo; Leonard, Jeffrey R; Rubin, Joshua B; Jabado, Nada; Albrecht, Steffen; Mora, Jaume; Van Meter, Timothy E; Jung, Shin; Moore, Andrew S; Hallahan, Andrew R; Chan, Jennifer A; Tirapelli, Daniela P C; Carlotti, Carlos G; Fouladi, Maryam; Pimentel, José; Faria, Claudia C; Saad, Ali G; Massimi, Luca; Liau, Linda M; Wheeler, Helen; Nakamura, Hideo; Elbabaa, Samer K; Perezpeña-Diazconti, Mario; Chico Ponce de León, Fernando; Robinson, Shenandoah; Zapotocky, Michal; Lassaletta, Alvaro; Huang, Annie; Hawkins, Cynthia E; Tabori, Uri; Bouffet, Eric; Bartels, Ute; Dirks, Peter B; Rutka, James T; Bader, Gary D; Reimand, Jüri; Goldenberg, Anna; Ramaswamy, Vijay; Taylor, Michael D
2017-06-12
While molecular subgrouping has revolutionized medulloblastoma classification, the extent of heterogeneity within subgroups is unknown. Similarity network fusion (SNF) applied to genome-wide DNA methylation and gene expression data across 763 primary samples identifies very homogeneous clusters of patients, supporting the presence of medulloblastoma subtypes. After integration of somatic copy-number alterations, and clinical features specific to each cluster, we identify 12 different subtypes of medulloblastoma. Integrative analysis using SNF further delineates group 3 from group 4 medulloblastoma, which is not as readily apparent through analyses of individual data types. Two clear subtypes of infants with Sonic Hedgehog medulloblastoma with disparate outcomes and biology are identified. Medulloblastoma subtypes identified through integrative clustering have important implications for stratification of future clinical trials. Copyright © 2017 Elsevier Inc. All rights reserved.
Krawczyk, Christopher; Gradziel, Pat; Geraghty, Estella M.
2014-01-01
Objectives. We used a geographic information system and cluster analyses to determine locations in need of enhanced Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) Program services. Methods. We linked documented births in the 2010 California Birth Statistical Master File with the 2010 data from the WIC Integrated Statewide Information System. Analyses focused on the density of pregnant women who were eligible for but not receiving WIC services in California’s 7049 census tracts. We used incremental spatial autocorrelation and hot spot analyses to identify clusters of WIC-eligible nonparticipants. Results. We detected clusters of census tracts with higher-than-expected densities, compared with the state mean density of WIC-eligible nonparticipants, in 21 of 58 (36.2%) California counties (P < .05). In subsequent county-level analyses, we located neighborhood-level clusters of higher-than-expected densities of eligible nonparticipants in Sacramento, San Francisco, Fresno, and Los Angeles Counties (P < .05). Conclusions. Hot spot analyses provided a rigorous and objective approach to determine the locations of statistically significant clusters of WIC-eligible nonparticipants. Results helped inform WIC program and funding decisions, including the opening of new WIC centers, and offered a novel approach for targeting public health services. PMID:24354821
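Hot spot analysis in GIS software is commonly based on the Getis-Ord Gi* statistic; assuming that is the statistic meant here, the sketch below shows how a Gi* z-score could be computed for each tract. The binary contiguity weights, the five-tract toy data and the function name are illustrative assumptions, not details taken from the study.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """Gi* z-score per area with binary weights (each area includes itself).

    Assumes the values are not all identical and that no window covers
    every area; either case would make the denominator zero.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        window = set(neighbors[i]) | {i}      # Gi* counts the area itself
        w_sum = len(window)                   # binary weights: sum w = sum w^2
        local_sum = sum(values[j] for j in window)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Five hypothetical tracts in a row; density is high at the left end.
density = [8, 8, 2, 1, 1]
adjacent = [[1], [0, 2], [1, 3], [2, 4], [3]]
for tract, score in enumerate(getis_ord_gi_star(density, adjacent)):
    print(tract, round(score, 2))
```

Large positive scores indicate hot spots and large negative scores cold spots; the scores are usually compared with standard normal critical values, with a correction for multiple testing.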
Gurjav, Ulziijargal; Outhred, Alexander C.; Jelfs, Peter; McCallum, Nadine; Wang, Qinning; Hill-Cawthorne, Grant A.; Marais, Ben J.; Sintchenko, Vitali
2016-01-01
Australia has a low tuberculosis incidence rate with most cases occurring among recent immigrants. Given suboptimal cluster resolution achieved with 24-locus mycobacterium interspersed repetitive unit (MIRU-24) genotyping, the added value of whole genome sequencing was explored. MIRU-24 profiles of all Mycobacterium tuberculosis culture-confirmed tuberculosis cases diagnosed between 2009 and 2013 in New South Wales (NSW), Australia, were examined and clusters identified. The relatedness of cases within the largest MIRU-24 clusters was assessed using whole genome sequencing and phylogenetic analyses. Of 1841 culture-confirmed TB cases, 91.9% (1692/1841) had complete demographic and genotyping data. East-African Indian (474; 28.0%) and Beijing (470; 27.8%) lineage strains predominated. The overall rate of MIRU-24 clustering was 20.1% (340/1692) and was highest among Beijing lineage strains (35.7%; 168/470). One Beijing and three East-African Indian (EAI) clonal complexes were responsible for the majority of observed clusters. Whole genome sequencing of the 4 largest clusters (30 isolates) demonstrated diverse single nucleotide polymorphisms (SNPs) within identified clusters. All sequenced EAI strains and 70% of Beijing lineage strains clustered by MIRU-24 typing demonstrated distinct SNP profiles. The superior resolution provided by whole genome sequencing demonstrated limited M. tuberculosis transmission within NSW, even within identified MIRU-24 clusters. Routine whole genome sequencing could provide valuable public health guidance in low burden settings. PMID:27737005
Geographical Clusters of Rape in the United States: 2000-2012
Amin, Raid; Nabors, Nicole S.; Nelson, Arlene M.; Saqlain, Murshid; Kulldorff, Martin
2016-01-01
Background While rape is a very serious crime and public health problem, no spatial mapping has been attempted for rape on the national scale. This paper addresses the three research questions: (1) Are reported rape cases randomly distributed across the USA, after being adjusted for population density and age, or are there geographical clusters of reported rape cases? (2) Are the geographical clusters of reported rapes still present after adjusting for differences in poverty levels? (3) Are there geographical clusters where the proportion of reported rape cases that lead to an arrest is exceptionally low or exceptionally high? Methods We studied the geographical variation of reported rape events (2003-2012) and rape arrests (2000-2012) in the 48 contiguous states of the USA. The disease Surveillance software SaTScan™ with its spatial scan statistic is used to evaluate the spatial variation in rapes. The spatial scan statistic has been widely used as a geographical surveillance tool for diseases, and we used it to identify geographical areas with clusters of reported rape and clusters of arrest rates for rape. Results The spatial scan statistic was used to identify geographical areas with exceptionally high rates of reported rape. The analyses were adjusted for age, and in secondary analyses, for both age and poverty level. We also identified geographical areas with either a low or a high proportion of reported rapes leading to an arrest. Conclusions We have identified geographical areas with exceptionally high (low) rates of reported rape. 
The geographical problem areas identified are prime candidates for more intensive preventive counseling and criminal prosecution efforts by public health, social service, and law enforcement agencies. Geographical clusters of high rates of reported rape are prime areas in need of expanded implementation of preventive measures, such as changing attitudes in our society toward rape crimes, in addition to having the criminal justice system play an even larger role in preventing rape. PMID:28078318
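The spatial scan statistic evaluates many candidate zones and keeps the one whose observed case count most exceeds its expectation, measured by a likelihood ratio. The following is a simplified sketch of that search under a Poisson model; it is not SaTScan itself, it ignores the age and poverty adjustments described above, and it omits the Monte Carlo step used to obtain p-values. The zone construction (each area plus its nearest neighbours, capped at half the population) and all names are illustrative assumptions.

```python
import math


def poisson_llr(c, e, total):
    """Log-likelihood ratio for a zone with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only zones with an excess of cases are scored
    inside = c * math.log(c / e)
    outside = ((total - c) * math.log((total - c) / (total - e))
               if total > c else 0.0)
    return inside + outside


def most_likely_cluster(coords, cases, population, max_pop_share=0.5):
    """Return (llr, member indices) of the highest-scoring candidate zone."""
    total_cases = sum(cases)
    total_pop = sum(population)
    n = len(coords)
    best_llr, best_zone = 0.0, []
    for center in range(n):
        # Grow a zone around each area by adding its nearest neighbours.
        order = sorted(range(n),
                       key=lambda j: math.dist(coords[center], coords[j]))
        zone, c, pop = [], 0, 0
        for j in order:
            if pop + population[j] > max_pop_share * total_pop:
                break
            zone.append(j)
            c += cases[j]
            pop += population[j]
            expected = total_cases * pop / total_pop
            llr = poisson_llr(c, expected, total_cases)
            if llr > best_llr:
                best_llr, best_zone = llr, sorted(zone)
    return best_llr, best_zone


# Six hypothetical areas on a line with equal populations.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
cases = [1, 1, 10, 12, 1, 1]
population = [100] * 6
print(most_likely_cluster(coords, cases, population))
```

In this toy example the two adjacent high-count areas (indices 2 and 3) form the most likely cluster.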
Geographical Clusters of Rape in the United States: 2000-2012.
Amin, Raid; Nabors, Nicole S; Nelson, Arlene M; Saqlain, Murshid; Kulldorff, Martin
2015-01-01
While rape is a very serious crime and public health problem, no spatial mapping has been attempted for rape on the national scale. This paper addresses the three research questions: (1) Are reported rape cases randomly distributed across the USA, after being adjusted for population density and age, or are there geographical clusters of reported rape cases? (2) Are the geographical clusters of reported rapes still present after adjusting for differences in poverty levels? (3) Are there geographical clusters where the proportion of reported rape cases that lead to an arrest is exceptionally low or exceptionally high? We studied the geographical variation of reported rape events (2003-2012) and rape arrests (2000-2012) in the 48 contiguous states of the USA. The disease Surveillance software SaTScan™ with its spatial scan statistic is used to evaluate the spatial variation in rapes. The spatial scan statistic has been widely used as a geographical surveillance tool for diseases, and we used it to identify geographical areas with clusters of reported rape and clusters of arrest rates for rape. The spatial scan statistic was used to identify geographical areas with exceptionally high rates of reported rape. The analyses were adjusted for age, and in secondary analyses, for both age and poverty level. We also identified geographical areas with either a low or a high proportion of reported rapes leading to an arrest. We have identified geographical areas with exceptionally high (low) rates of reported rape. 
The geographical problem areas identified are prime candidates for more intensive preventive counseling and criminal prosecution efforts by public health, social service, and law enforcement agencies. Geographical clusters of high rates of reported rape are prime areas in need of expanded implementation of preventive measures, such as changing attitudes in our society toward rape crimes, in addition to having the criminal justice system play an even larger role in preventing rape.
Anholt, R M; Berezowski, J; Robertson, C; Stephen, C
2015-09-01
There is interest in the potential of companion animal surveillance to provide data to improve pet health and to provide early warning of environmental hazards to people. We implemented a companion animal surveillance system in Calgary, Alberta and the surrounding communities. Informatics technologies automatically extracted electronic medical records from participating veterinary practices and identified cases of enteric syndrome in the warehoused records. The data were analysed using time-series analyses and a retrospective space-time permutation scan statistic. We identified a seasonal pattern of reports of occurrences of enteric syndromes in companion animals and four statistically significant clusters of enteric syndrome cases. The cases within each cluster were examined and information about the animals involved (species, age, sex), their vaccination history, possible exposure or risk behaviour history, information about disease severity, and the aetiological diagnosis was collected. We then assessed whether the cases within the cluster were unusual and if they represented an animal or public health threat. There was often insufficient information recorded in the medical record to characterize the clusters by aetiology or exposures. Space-time analysis of companion animal enteric syndrome cases found evidence of clustering. Collection of more epidemiologically relevant data would enhance the utility of practice-based companion animal surveillance.
Transformation and model choice for RNA-seq co-expression analysis.
Rau, Andrea; Maugis-Rabusseau, Cathy
2018-05-01
Although a large number of clustering algorithms have been proposed to identify groups of co-expressed genes from microarray data, the question of if and how such methods may be applied to RNA sequencing (RNA-seq) data remains unaddressed. In this work, we investigate the use of data transformations in conjunction with Gaussian mixture models for RNA-seq co-expression analyses, as well as a penalized model selection criterion to select both an appropriate transformation and number of clusters present in the data. This approach has the advantage of accounting for per-cluster correlation structures among samples, which can be strong in RNA-seq data. In addition, it provides a rigorous statistical framework for parameter estimation, an objective assessment of data transformations and number of clusters and the possibility of performing diagnostic checks on the quality and homogeneity of the identified clusters. We analyze four varied RNA-seq data sets to illustrate the use of transformations and model selection in conjunction with Gaussian mixture models. Finally, we propose a Bioconductor package coseq (co-expression of RNA-seq data) to facilitate implementation and visualization of the recommended RNA-seq co-expression analyses.
Knowledge, attitudes towards and acceptability of genetic modification in Germany.
Christoph, Inken B; Bruhn, Maike; Roosen, Jutta
2008-07-01
Genetic modification remains a controversial issue. The aim of this study is to analyse the attitudes towards genetic modification, the knowledge about it and its acceptability in different application areas among German consumers. Results are based on a survey from spring 2005. An exploratory factor analysis is conducted to identify the attitudes towards genetic modification. The identified factors are used in a cluster analysis that identified a cluster of supporters, of opponents and a group of indifferent consumers. Respondents' knowledge of genetics and biotechnology differs among the found clusters without revealing a clear relationship between knowledge and support of genetic modification. The acceptability of genetic modification varies by application area and cluster, and genetically modified non-food products are more widely accepted than food products. The perception of personal health risks has high explanatory power for attitudes and acceptability.
[Applying the clustering technique for characterising maintenance outsourcing].
Cruz, Antonio M; Usaquén-Perilla, Sandra P; Vanegas-Pabón, Nidia N; Lopera, Carolina
2010-06-01
Using clustering techniques for characterising companies providing health institutions with maintenance services. The study analysed seven pilot areas' equipment inventory (264 medical devices). Clustering techniques were applied using 26 variables. Response time (RT), operation duration (OD), availability and turnaround time (TAT) were amongst the most significant ones. Average biomedical equipment obsolescence value was 0.78. Four service provider clusters were identified: clusters 1 and 3 had better performance, lower TAT, RT and DR values (56 % of the providers coded O, L, C, B, I, S, H, F and G, had 1 to 4 day TAT values:
A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.
Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia
2017-10-15
In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.
Ji, N Y; Capone, G T; Kaufmann, W E
2011-11-01
The diagnostic validity of autism spectrum disorder (ASD) based on Diagnostic and Statistical Manual of Mental Disorders (DSM) has been challenged in Down syndrome (DS), because of the high prevalence of cognitive impairments in this population. Therefore, we attempted to validate DSM-based diagnoses via an unbiased categorisation of participants with a DSM-independent behavioural instrument. Based on scores on the Aberrant Behaviour Checklist - Community, we performed sequential factor (four DS-relevant factors: Autism-Like Behaviour, Disruptive Behaviour, Hyperactivity, Self-Injury) and cluster analyses on a 293-participant paediatric DS clinic cohort. The four resulting clusters were compared with DSM-delineated groups: DS + ASD, DS + None (no DSM diagnosis), DS + DBD (disruptive behaviour disorder) and DS + SMD (stereotypic movement disorder), the latter two as comparison groups. Two clusters were identified with DS + ASD: Cluster 1 (35.1%) with higher disruptive behaviour and Cluster 4 (48.2%) with more severe autistic behaviour and higher percentage of late onset ASD. The majority of participants in DS + None (71.9%) and DS + DBD (87.5%) were classified into Cluster 2 and 3, respectively, while participants in DS + SMD were relatively evenly distributed throughout the four clusters. Our unbiased, DSM-independent analyses, using a rating scale specifically designed for individuals with severe intellectual disability, demonstrated that DSM-based criteria of ASD are applicable to DS individuals despite their cognitive impairments. Two DS + ASD clusters were identified and supported the existence of at least two subtypes of ASD in DS, which deserve further characterisation. Despite the prominence of stereotypic behaviour in DS, the SMD diagnosis was not identified by cluster analysis, suggesting that high-level stereotypy is distributed throughout DS. 
Further supporting DSM diagnoses, typically behaving DS participants were easily distinguished as a group from those with maladaptive behaviours. © 2011 The Authors. Journal of Intellectual Disability Research © 2011 Blackwell Publishing Ltd.
Multilocus microsatellite typing shows three different genetic clusters of Leishmania major in Iran.
Mahnaz, Tashakori; Al-Jawabreh, Amer; Kuhls, Katrin; Schönian, Gabriele
2011-10-01
Ten polymorphic microsatellite markers were used to analyse 25 strains of Leishmania major collected from cutaneous leishmaniasis cases in different endemic areas in Iran. Nine of the markers were polymorphic, revealing 21 different genotypes. The data displayed significant microsatellite polymorphism with rare allelic heterozygosity. Bayesian statistic and distance based analyses identified three genetic clusters among the 25 strains analysed. Cluster I represented mainly strains isolated in the west and south-west of Iran, with the exception of four strains originating from central Iran. Cluster II comprised strains from the central part of Iran, and cluster III included only strains from north Iran. The geographical distribution of L. major in Iran was supported by comparing the microsatellite profiles of the 25 Iranian strains to those of 105 strains collected in 19 Asian and African countries. The Iranian clusters I and II were separated from three previously described populations comprising strains from Africa, the Middle East and Central Asia whereas cluster III grouped together with the Central Asian population. The considerable genetic variability of L. major might be related to the existence of different populations of Phlebotomus papatasi and/or to differences in reservoir host abundance in different parts of Iran. Copyright © 2011 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T
2007-08-01
The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57), with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential application of RFMCA in the food or pharmaceutical industry will include development of classification models for the bacteria expected in a given product, and then to build an RFMCA database as a part of the product quality control.
Clusternomics: Integrative context-dependent clustering for heterogeneous datasets
Wernisch, Lorenz
2017-01-01
Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However, in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to the TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190
Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.
Gabasova, Evelina; Reid, John; Wernisch, Lorenz
2017-10-01
Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However, in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to the TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.
Descriptive epidemiology of typhoid fever during an epidemic in Harare, Zimbabwe, 2012.
Polonsky, Jonathan A; Martínez-Pino, Isabel; Nackers, Fabienne; Chonzi, Prosper; Manangazira, Portia; Van Herp, Michel; Maes, Peter; Porten, Klaudia; Luquero, Francisco J
2014-01-01
Typhoid fever remains a significant public health problem in developing countries. In October 2011, a typhoid fever epidemic was declared in Harare, Zimbabwe - the fourth enteric infection epidemic since 2008. To orient control activities, we described the epidemiology and spatiotemporal clustering of the epidemic in Dzivaresekwa and Kuwadzana, the two most affected suburbs of Harare. A typhoid fever case-patient register was analysed to describe the epidemic. To explore clustering, we constructed a dataset comprising GPS coordinates of case-patient residences and randomly sampled residential locations (spatial controls). The scale and significance of clustering were explored with Ripley K functions. Cluster locations were determined by a random labelling technique and confirmed using Kulldorff's spatial scan statistic. We analysed data from 2570 confirmed and suspected case-patients, and found significant spatiotemporal clustering of typhoid fever in two non-overlapping areas, which appeared to be linked to environmental sources. Peak relative risk was more than six times greater than in areas lying outside the cluster ranges. Clusters were identified in similar geographical ranges by both random labelling and Kulldorff's spatial scan statistic. The spatial scale at which typhoid fever clustered was highly localised, with significant clustering at distances up to 4.5 km and peak levels at approximately 3.5 km. The epicentre of infection transmission shifted from one cluster to the other during the course of the epidemic. This study demonstrated highly localised clustering of typhoid fever during an epidemic in an urban African setting, and highlights the importance of spatiotemporal analysis for making timely decisions about targeting prevention and control activities and reinforcing treatment during epidemics. This approach should be integrated into existing surveillance systems to facilitate early detection of epidemics and identify their spatial range.
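Ripley's K function summarises, for a given distance r, how many other points lie within r of a typical point, scaled by the overall point density. Comparing K for case residences with K for control locations indicates at which distances cases cluster beyond the underlying settlement pattern. The following is a minimal sketch without edge correction; the coordinates, study-area size and function names are illustrative assumptions, and the random-labelling significance envelope is not shown.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one distance, without edge correction."""
    n = len(points)
    close_pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= radius
    )
    return area * close_pairs / (n * (n - 1))


def k_difference(cases, controls, radius, area):
    """K(cases) - K(controls); positive values suggest excess clustering."""
    return ripley_k(cases, radius, area) - ripley_k(controls, radius, area)


# Hypothetical unit-square study area (coordinates in arbitrary units).
controls = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # spread out
cases = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]                 # tightly grouped
for r in (0.2, 1.0, 1.5):
    print(r, round(k_difference(cases, controls, r, area=1.0), 3))
```

Evaluating the difference over a range of distances gives the kind of scale profile described in the abstract, where clustering is significant up to one distance and peaks at another.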
Descriptive Epidemiology of Typhoid Fever during an Epidemic in Harare, Zimbabwe, 2012
Polonsky, Jonathan A.; Martínez-Pino, Isabel; Nackers, Fabienne; Chonzi, Prosper; Manangazira, Portia; Van Herp, Michel; Maes, Peter; Porten, Klaudia; Luquero, Francisco J.
2014-01-01
Background Typhoid fever remains a significant public health problem in developing countries. In October 2011, a typhoid fever epidemic was declared in Harare, Zimbabwe - the fourth enteric infection epidemic since 2008. To orient control activities, we described the epidemiology and spatiotemporal clustering of the epidemic in Dzivaresekwa and Kuwadzana, the two most affected suburbs of Harare. Methods A typhoid fever case-patient register was analysed to describe the epidemic. To explore clustering, we constructed a dataset comprising GPS coordinates of case-patient residences and randomly sampled residential locations (spatial controls). The scale and significance of clustering were explored with Ripley K functions. Cluster locations were determined by a random labelling technique and confirmed using Kulldorff's spatial scan statistic. Principal Findings We analysed data from 2570 confirmed and suspected case-patients, and found significant spatiotemporal clustering of typhoid fever in two non-overlapping areas, which appeared to be linked to environmental sources. Peak relative risk was more than six times greater than in areas lying outside the cluster ranges. Clusters were identified in similar geographical ranges by both random labelling and Kulldorff's spatial scan statistic. The spatial scale at which typhoid fever clustered was highly localised, with significant clustering at distances up to 4.5 km and peak levels at approximately 3.5 km. The epicentre of infection transmission shifted from one cluster to the other during the course of the epidemic. Conclusions This study demonstrated highly localised clustering of typhoid fever during an epidemic in an urban African setting, and highlights the importance of spatiotemporal analysis for making timely decisions about targeting prevention and control activities and reinforcing treatment during epidemics.
This approach should be integrated into existing surveillance systems to facilitate early detection of epidemics and identify their spatial range. PMID:25486292
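As a rough illustration of the Ripley K step described in the abstract above, the following is a minimal Python sketch of the naive K estimator and of a case-minus-control difference. It omits edge correction and the random-labelling significance test, and the function names, the planar coordinates, and the study-area value are assumptions made for illustration; none of them is taken from the study.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one radius (no edge correction).

    points: list of (x, y) planar coordinates (hypothetical input format).
    area:   area of the study region, in the same squared units.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs_within = 0
    # Count ordered pairs (i, j), i != j, whose distance is within the radius.
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs_within += 1
    return area * pairs_within / (n * (n - 1))


def k_difference(cases, controls, radius, area):
    """K(cases) - K(controls); positive values suggest that cases cluster
    more strongly than the background population at this distance."""
    return ripley_k(cases, radius, area) - ripley_k(controls, radius, area)
```

Under complete spatial randomness the expected value of K at radius r is pi * r**2, so values well above that (or a positive case-control difference) point toward clustering at that scale.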
Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein
2014-11-01
Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
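The cluster-transition shares reported in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the dictionary labels are hypothetical, while the counts are the ones stated in the abstract.

```python
def transition_shares(counts):
    """Return each category's share of the total, in percent, to one decimal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no participants")
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


# Counts as reported at 6-month follow-up (N = 379).
reported = {
    "same cluster": 251,
    "unhealthier cluster": 45,
    "healthier cluster": 83,
}
shares = transition_shares(reported)
print(shares)
```

The three counts sum to the follow-up sample of 379, and the computed shares agree with the percentages given in the abstract.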
Exploring spatial evolution of economic clusters: A case study of Beijing
NASA Astrophysics Data System (ADS)
Yang, Zhenshan; Sliuzas, Richard; Cai, Jianming; Ottens, Henk F. L.
2012-10-01
Identifying economic clusters and analysing their changing spatial patterns is important for understanding urban economic space dynamics. Previous studies, however, are limited by their use of fixed geographical areas and by not combining functional and spatial dynamics. The paper presents an approach, based on local spatial statistics and the case of Beijing, to understand the spatial clustering of industries that are functionally interconnected by common or complementary patterns of demand or supply relations. Using register data of business establishments, it identifies economic clusters and analyses their patterns based on postcodes at different time slices during the period 1983-2002. The study shows how advanced services occupy the urban centre and key sub-centres. The Information and Communication Technology (ICT) cluster is mainly concentrated in the northern part of the city and circles the urban centre, and the main manufacturing clusters have evolved in the key sub-centres. Outcomes of this type improve understanding of urban-economic dynamics, which can support spatial and economic planning.
Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug
2011-11-01
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L.; Dianes, José A.; del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W.; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-01-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra. PMID:27493588
Griss, Johannes; Perez-Riverol, Yasset; Lewis, Steve; Tabb, David L; Dianes, José A; Del-Toro, Noemi; Rurik, Marc; Walzer, Mathias W; Kohlbacher, Oliver; Hermjakob, Henning; Wang, Rui; Vizcaíno, Juan Antonio
2016-08-01
Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at large scale to shed light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.
Lubelchek, Ronald J.; Hoehnen, Sarah C.; Hotton, Anna L.; Kincaid, Stacey L.; Barker, David E.; French, Audrey L.
2014-01-01
Introduction HIV transmission cluster analyses can inform HIV prevention efforts. We describe the first such assessment for transmission clustering among HIV patients in Chicago. Methods We performed transmission cluster analyses using HIV pol sequences from newly diagnosed patients presenting to Chicago’s largest HIV clinic between 2008 and 2011. We compared sequences via progressive pairwise alignment, using neighbor joining to construct an un-rooted phylogenetic tree. We defined clusters as >2 sequences among which each sequence had at least one partner within a genetic distance of ≤ 1.5%. We used multivariable regression to examine factors associated with clustering and used geospatial analysis to assess geographic proximity of phylogenetically clustered patients. Results We compared sequences from 920 patients; median age 35 years; 75% male; 67% Black, 23% Hispanic; 8% had a Rapid Plasma Reagin (RPR) titer ≥ 1:16 concurrent with their HIV diagnosis. We had HIV transmission risk data for 54%; 43% identified as men who have sex with men (MSM). Phylogenetic analysis demonstrated 123 patients (13%) grouped into 26 clusters, the largest having 20 members. In multivariable regression, age < 25, Black race, MSM status, male gender, higher HIV viral load, and RPR ≥ 1:16 associated with clustering. We did not observe geographic grouping of genetically clustered patients. Discussion Our results demonstrate high rates of HIV transmission clustering, without local geographic foci, among young Black MSM in Chicago. Applied prospectively, phylogenetic analyses could guide prevention efforts and help break the cycle of transmission. PMID:25321182
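The cluster definition in the abstract above (more than two sequences, each with at least one partner within a genetic distance of 1.5% or less) amounts to taking connected components of a thresholded distance graph. The following is a minimal sketch under that reading. The function name and the input format (a dictionary of pairwise distances expressed as fractions) are hypothetical, and the sketch works directly on distances rather than on the neighbor-joining tree the authors built.

```python
from collections import defaultdict


def threshold_clusters(pairwise_distance, max_distance=0.015, min_size=3):
    """Group sequences linked by at least one partner within max_distance.

    pairwise_distance: {(seq_a, seq_b): distance} with distances as fractions
    (0.015 == 1.5%). min_size=3 encodes the abstract's ">2 sequences" rule.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in pairwise_distance.items():
        root_a, root_b = find(a), find(b)
        if dist <= max_distance and root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = defaultdict(set)
    for seq in list(parent):
        groups[find(seq)].add(seq)
    kept = [members for members in groups.values() if len(members) >= min_size]
    return sorted(kept, key=len, reverse=True)
```

Because membership requires only one close partner, this behaves like single-linkage grouping: two sequences can share a cluster even when their own pairwise distance exceeds the threshold.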
Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M
2016-09-01
To study the distinct clinical phenotypes of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow (PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). The subjects' clinical phenotypes were identified by hierarchical cluster analysis and two-step cluster analysis. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma.
Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow (MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar volume (DLCO/VA)% pred, residual volume (RV)% pred, total serum IgE level, smoking history (pack-years), St. George's respiratory questionnaire (SGRQ) score, acute exacerbation in the past year, PEF variability and allergic dermatitis (P<0.05). (2) Four clusters were also identified by two-step cluster analysis, as follows: cluster 1, COPD patients with moderate to severe airflow limitation; cluster 2, asthma and COPD patients with heavy smoking, airflow limitation and increased airway reversibility; cluster 3, patients with less smoking and normal pulmonary function, with wheezing but no chronic cough; cluster 4, chronic bronchitis patients with normal pulmonary function and chronic cough. Significant differences were revealed regarding gender distribution, respiratory symptoms, pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, MMEF% pred, DLCO/VA% pred, RV% pred, PEF variability, total serum IgE level, cumulative tobacco cigarette consumption (pack-years), and SGRQ score (P<0.05). Distinct clinical phenotypes of chronic airway diseases can be identified by different cluster analyses. Thus, doctors may be guided to provide individualized treatments based on different phenotypes.
Identifying children at risk for being bullies in the United States.
Shetgiri, Rashmi; Lin, Hua; Flores, Glenn
2012-01-01
To identify risk factors associated with the greatest and lowest prevalence of bullying perpetration among U.S. children. Using the 2001-2002 Health Behavior in School-Aged Children, a nationally representative survey of U.S. children in 6th-10th grades, bivariate analyses were conducted to identify factors associated with any (once or twice or more), moderate (two to three times/month or more), and frequent (weekly or more) bullying. Stepwise multivariable analyses identified risk factors associated with bullying. Recursive partitioning analysis (RPA) identified risk factors which, in combination, identify students with the highest and lowest bullying prevalence. The prevalence of any bullying in the 13,710 students was 37.3%, moderate bullying was 12.6%, and frequent bullying was 6.6%. Characteristics associated with bullying were similar in the multivariable analyses and RPA clusters. In RPA, the highest prevalence of any bullying (67%) accrued in children with a combination of fighting and weapon-carrying. Students who carry weapons, smoke, and drink alcohol more than 5 to 6 days/week were at greatest risk for moderate bullying (61%). Those who carry weapons, smoke, have more than one alcoholic drink per day, have above-average academic performance, moderate/high family affluence, and feel irritable or bad-tempered daily were at greatest risk for frequent bullying (68%). Risk clusters for any, moderate, and frequent bullying differ. Children who fight and carry weapons are at greatest risk of any bullying. Weapon-carrying, smoking, and alcohol use are included in the greatest risk clusters for moderate and frequent bullying. Risk-group categories may be useful to providers in identifying children at the greatest risk for bullying and in targeting interventions. Copyright © 2012 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Identifying Children At Risk for Being Bullies in the US
Shetgiri, Rashmi; Lin, Hua; Flores, Glenn
2012-01-01
Objective To identify risk factors associated with the highest and lowest prevalence of bullying perpetration among US children. Methods Using the 2001–2002 Health Behavior in School-Aged Children, a nationally-representative survey of US children in 6th–10th grades, bivariate analyses were conducted to identify factors associated with any (≥ once or twice), moderate (≥ two-three times/month), and frequent (≥ weekly) bullying. Stepwise multivariable analyses identified risk factors associated with bullying. Recursive partitioning analysis (RPA) identified risk factors which, in combination, identify students with the highest and lowest bullying prevalence. Results The prevalence of any bullying in the 13,710 students was 37.3%, moderate bullying was 12.6%, and frequent bullying was 6.6%. Characteristics associated with bullying were similar in the multivariable analyses and RPA clusters. In RPA, the highest prevalence of any bullying (67%) accrued in children with a combination of fighting and weapon-carrying. Students who carry weapons, smoke, and drink alcohol more than 5–6 days weekly were at highest risk for moderate bullying (61%). Those who carry weapons, smoke, drink > once daily, have above-average academic performance, moderate/high family affluence, and feel irritable or bad-tempered daily were at highest risk for frequent bullying (68%). Conclusions Risk clusters for any, moderate, and frequent bullying differ. Children who fight and carry weapons are at highest risk of any bullying. Weapon-carrying, smoking, and alcohol use are included in the highest risk clusters for moderate and frequent bullying. Risk-group categories may be useful to providers in identifying children at highest risks for bullying and in targeting interventions. PMID:22989731
Sonora exploratory study for the detection of wheat-leaf rust
NASA Technical Reports Server (NTRS)
Payne, R. W. (Principal Investigator)
1980-01-01
The applicability of LANDSAT remote sensing technology to the detection of a wheat-leaf-rust epidemic in Sonora, Mexico, during 1977 was investigated. LANDSAT data acquired during crop years 1975-76 and 1976-77 were clustered, classified, and analyzed in order to detect agricultural changes. Analysis of 1977 data indicates that a significant proportion of the identified wheat is stressed (potentially rust-infected). Additional analyses show a significant increase in fallowing during the year, as well as a substantial decrease in reservoir levels in the Sonora agricultural region. Ground observations are required to substantiate these analyses. The possibility exists that wheat-rust is not LANDSAT detectable and that the clusters identified as containing stressed signatures represent different varieties of wheat or perhaps nonwheat crops.
Stopka, Thomas J; Brinkley-Rubinstein, Lauren; Johnson, Kendra; Chan, Philip A; Hutcheson, Marga; Crosby, Richard; Burke, Deirdre; Mena, Leandro; Nunn, Amy
2018-04-03
In recent years, more than half of new HIV infections in the United States occur among African Americans in the Southeastern United States. Spatial epidemiological analyses can inform public health responses in the Deep South by identifying HIV hotspots and community-level factors associated with clustering. The goal of this study was to identify and characterize HIV clusters in Mississippi through analysis of state-level HIV surveillance data. We used a combination of spatial epidemiology and statistical modeling to identify and characterize HIV hotspots in Mississippi census tracts (n=658) from 2008 to 2014. We conducted spatial analyses of all HIV infections, infections among men who have sex with men (MSM), and infections among African Americans. Multivariable logistic regression analyses identified community-level sociodemographic factors associated with HIV hotspots considering all cases. There were HIV hotspots for the entire population, MSM, and African American MSM identified in the Mississippi Delta region, Southern Mississippi, and in greater Jackson, including surrounding rural counties (P<.05). In multivariable models for all HIV cases, HIV hotspots were significantly more likely to include urban census tracts (adjusted odds ratio [AOR] 2.01, 95% CI 1.20-3.37) and census tracts that had a higher proportion of African Americans (AOR 3.85, 95% CI 2.23-6.65). The HIV hotspots were less likely to include census tracts with residents who had less than a high school education (AOR 0.95, 95% CI 0.92-0.98), census tracts with residents belonging to two or more racial/ethnic groups (AOR 0.46, 95% CI 0.30-0.70), and census tracts that had a higher percentage of the population living below the poverty level (AOR 0.51, 95% CI 0.28-0.92). We used spatial epidemiology and statistical modeling to identify and characterize HIV hotspots for the general population, MSM, and African Americans. HIV clusters concentrated in Jackson and the Mississippi Delta. 
African American race and urban location were positively associated with clusters, whereas having less than a high school education and having a higher percentage of the population living below the poverty level were negatively associated with clusters. Spatial epidemiological analyses can inform implementation science and public health response strategies, including improved HIV testing, targeted prevention and risk reduction education, and tailored preexposure prophylaxis to address HIV disparities in the South. ©Thomas J Stopka, Lauren Brinkley-Rubinstein, Kendra Johnson, Philip A Chan, Marga Hutcheson, Richard Crosby, Deirdre Burke, Leandro Mena, Amy Nunn. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 03.04.2018.
Comparing population structure as inferred from genealogical versus genetic information.
Colonna, Vincenza; Nutile, Teresa; Ferrucci, Ronald R; Fardella, Giulio; Aversano, Mario; Barbujani, Guido; Ciullo, Marina
2009-12-01
Algorithms for inferring population structure from genetic data (ie, population assignment methods) have been shown to effectively recognize genetic clusters in human populations. However, their performance in identifying groups of genealogically related individuals, especially in scantily differentiated populations, has not been tested empirically thus far. For this study, we had access to both genealogical and genetic data from two closely related, isolated villages in southern Italy. We found that nearly all living individuals were included in a single pedigree, with multiple inbreeding loops. Despite F(st) between villages being a low 0.008, genetic clustering analysis identified two clusters roughly corresponding to the two villages. Average kinship between individuals (estimated from genealogies) increased at increasing values of group membership (estimated from the genetic data), showing that the observed genetic clusters represent individuals who are more closely related to each other than to random members of the population. Further, average kinship within clusters and F(st) between clusters increase with increasingly stringent membership threshold requirements. We conclude that a limited number of genetic markers is sufficient to detect structuring, and that the results of genetic analyses faithfully mirror the structuring inferred from detailed analyses of population genealogies, even when F(st) values are low, as in the case of the two villages. We then estimate the impact of observed levels of population structure on association studies using simulated data.
Comparing population structure as inferred from genealogical versus genetic information
Colonna, Vincenza; Nutile, Teresa; Ferrucci, Ronald R; Fardella, Giulio; Aversano, Mario; Barbujani, Guido; Ciullo, Marina
2009-01-01
Algorithms for inferring population structure from genetic data (ie, population assignment methods) have been shown to effectively recognize genetic clusters in human populations. However, their performance in identifying groups of genealogically related individuals, especially in scantily differentiated populations, has not been tested empirically thus far. For this study, we had access to both genealogical and genetic data from two closely related, isolated villages in southern Italy. We found that nearly all living individuals were included in a single pedigree, with multiple inbreeding loops. Despite Fst between villages being a low 0.008, genetic clustering analysis identified two clusters roughly corresponding to the two villages. Average kinship between individuals (estimated from genealogies) increased at increasing values of group membership (estimated from the genetic data), showing that the observed genetic clusters represent individuals who are more closely related to each other than to random members of the population. Further, average kinship within clusters and Fst between clusters increase with increasingly stringent membership threshold requirements. We conclude that a limited number of genetic markers is sufficient to detect structuring, and that the results of genetic analyses faithfully mirror the structuring inferred from detailed analyses of population genealogies, even when Fst values are low, as in the case of the two villages. We then estimate the impact of observed levels of population structure on association studies using simulated data. PMID:19550436
Crabbe, J Christopher F; Gregorio, David I; Samociuk, Holly; Swede, Helen
2015-07-01
We considered changes in the geographic distribution of early stage breast cancer among White and non-White women while secular trends in lifestyle and health care were under way. We aggregated tumor registry and census data by age, race, place of residence, and year of diagnosis to evaluate rate variation across Connecticut census tracts between 1985 and 2009. Global and local cluster detection tests were completed. Age-adjusted incidence rates increased by 2.71% and 0.44% per year for White and non-White women, respectively. Significant global clustering was identified during surveillance of these populations, but the elements of clustering differed between groups. Among White women, fewer local clusters were detected after 1985 to 1989, whereas clustering increased over time among non-White women. Small-area variation of breast cancer incidence rates across time periods proved to be dynamic and race-specific. Incidence rates might have been affected by secular trends in lifestyle or health care. Single cross-sectional analyses might have confused our understanding of disease occurrence by not accounting for the social context in which patient preferences or provider capacity influence the numbers and locations of diagnosed cases. Serial analyses are recommended to identify "hot spots" where persistent geographic disparities in incidence occur.
Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio
2014-12-01
The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli. Copyright © 2014 Elsevier Inc. All rights reserved.
Patiño-Galindo, Juan Ángel; Torres-Puente, Manoli; Bracho, María Alma; Alastrué, Ignacio; Juan, Amparo; Navarro, David; Galindo, María José; Ocete, Dolores; Ortega, Enrique; Gimeno, Concepción; Belda, Josefina; Domínguez, Victoria; Moreno, Rosario; González-Candelas, Fernando
2017-09-14
HIV infections are still a very serious concern for public health worldwide. We have applied molecular evolution methods to study the HIV-1 epidemics in the Comunidad Valenciana (CV, Spain) from a public health surveillance perspective. For this, we analysed 1804 HIV-1 sequences comprising protease and reverse transcriptase (PR/RT) coding regions, sampled between 2004 and 2014. These sequences were subtyped and subjected to phylogenetic analyses in order to detect transmission clusters. In addition, univariate and multinomial comparisons were performed to detect epidemiological differences between HIV-1 subtypes and risk groups. The HIV epidemic in the CV is dominated by subtype B infections among local men who have sex with men (MSM). 270 transmission clusters were identified (>57% of the dataset), 12 of which included ≥10 patients; 11 of subtype B (9 affecting MSM) and one (n = 21) of CRF14, affecting predominantly intravenous drug users (IDUs). Dated phylogenies revealed these large clusters to have originated from the mid-1980s to the early 2000s. Subtype B is more likely to form transmission clusters than non-B variants, and MSM are more likely to cluster than other risk groups. Multinomial analyses revealed an association between non-B variants, which are not yet established in the local population, and different foreign groups.
Thermodynamically accessible titanium clusters TiN, N = 2-32.
Lazauskas, Tomas; Sokol, Alexey A; Buckeridge, John; Catlow, C Richard A; Escher, Susanne G E T; Farrow, Matthew R; Mora-Fonz, David; Blum, Volker W; Phaahla, Tshegofatso M; Chauke, Hasani R; Ngoepe, Phuti E; Woodley, Scott M
2018-05-10
We have performed a genetic algorithm search on the tight-binding interatomic potential energy surface (PES) for small TiN (N = 2-32) clusters. The low energy candidate clusters were further refined using density functional theory (DFT) calculations with the PBEsol exchange-correlation functional and evaluated with the PBEsol0 hybrid functional. The resulting clusters were analysed in terms of their structural features, growth mechanism and surface area. The results suggest a growth mechanism that is based on forming coordination centres by interpenetrating icosahedra, icositetrahedra and Frank-Kasper polyhedra. We identify centres of coordination, which act as centres of bulk nucleation in medium sized clusters and determine the morphological features of the cluster.
Efficient generation of low-energy folded states of a model protein
NASA Astrophysics Data System (ADS)
Gordon, Heather L.; Kwan, Wai Kei; Gong, Chunhang; Larrass, Stefan; Rothstein, Stuart M.
2003-01-01
A number of short simulated annealing runs are performed on a highly-frustrated 46-"residue" off-lattice model protein. We perform, in an iterative fashion, a principal component analysis of the 946 nonbonded interbead distances, followed by two varieties of cluster analyses: hierarchical and k-means clustering. We identify several distinct sets of conformations with reasonably consistent cluster membership. Nonbonded distance constraints are derived for each cluster and are employed within a distance geometry approach to generate many new conformations, previously unidentified by the simulated annealing experiments. Subsequent analyses suggest that these new conformations are members of the parent clusters from which they were generated. Furthermore, several novel, previously unobserved structures with low energy were uncovered, augmenting the ensemble of simulated annealing results, and providing a complete distribution of low-energy states. The computational cost of this approach to generating low-energy conformations is small when compared to the expense of further Monte Carlo simulated annealing runs.
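As a rough illustration of the feature construction described in this abstract, the sketch below builds interbead distance features and projects them onto principal components. It assumes that "nonbonded" means bead pairs at least three positions apart along the chain (an assumption that happens to reproduce the 946 distances quoted for 46 beads); the random coordinates and the function names are hypothetical placeholders, not the authors' code.

```python
import numpy as np

N_BEADS = 46
MIN_SEP = 3  # assumption: pairs with |i - j| >= 3 are treated as nonbonded


def nonbonded_pairs(n_beads=N_BEADS, min_sep=MIN_SEP):
    """List bead index pairs (i, j) separated by at least min_sep along the chain."""
    return [(i, j) for i in range(n_beads) for j in range(i + min_sep, n_beads)]


def distance_features(coords, pairs):
    """coords has shape (n_conformations, n_beads, 3); returns one distance per pair."""
    i, j = np.array(pairs).T
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=-1)


def pca_scores(features, n_components=2):
    """Project mean-centred features onto the leading principal components via SVD."""
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


pairs = nonbonded_pairs()
rng = np.random.default_rng(0)
coords = rng.normal(size=(20, N_BEADS, 3))  # 20 random placeholder conformations
scores = pca_scores(distance_features(coords, pairs))
```

The resulting score matrix could then be passed to any hierarchical or k-means clustering routine; which implementation the authors used is not stated in the abstract.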
Optimal integrated abundances for chemical tagging of extragalactic globular clusters
NASA Astrophysics Data System (ADS)
Sakari, Charli M.; Venn, Kim; Shetrone, Matthew; Dotter, Aaron; Mackey, Dougal
2014-09-01
High-resolution integrated light (IL) spectroscopy provides detailed abundances of distant globular clusters whose stars cannot be resolved. Abundance comparisons with other systems (e.g. for chemical tagging) require understanding the systematic offsets that can occur between clusters, such as those due to uncertainties in the underlying stellar population. This paper analyses high-resolution IL spectra of the Galactic globular clusters 47 Tuc, M3, M13, NGC 7006, and M15 to (1) quantify potential systematic uncertainties in Fe, Ca, Ti, Ni, Ba, and Eu and (2) identify the most stable abundance ratios that will be useful in future analyses of unresolved targets. When stellar populations are well modelled, uncertainties are ~0.1-0.2 dex based on sensitivities to the atmospheric parameters alone; in the worst-case scenarios, uncertainties can rise to 0.2-0.4 dex. The [Ca I/Fe I] ratio is identified as the optimal integrated [α/Fe] indicator (with offsets ≲ 0.1 dex), while [Ni I/Fe I] is also extremely stable to within ≲ 0.1 dex. The [Ba II/Eu II] ratios are also stable when the underlying populations are well modelled and may also be useful for chemical tagging.
van der Molen, Thys; Fletcher, Monica; Price, David
Asthma is a highly heterogeneous disease that can be classified into different clinical phenotypes, and treatment may be tailored accordingly. However, factors beyond purely clinical traits, such as patient attitudes and behaviors, can also have a marked impact on treatment outcomes. The objective of this study was to further analyze data from the REcognise Asthma and LInk to Symptoms and Experience (REALISE) Europe survey, to identify distinct patient groups sharing common attitudes toward asthma and its management. Factor analysis of respondent data (N = 7,930) from the REALISE Europe survey consolidated the 34 attitudinal variables provided by the study population into a set of 8 summary factors. Cluster analyses were used to identify patient clusters that showed similar attitudes and behaviors toward each of the 8 summary factors. Five distinct patient clusters were identified and named according to the key characteristics comprising that cluster: "Confident and self-managing," "Confident and accepting of their asthma," "Confident but dependent on others," "Concerned but confident in their health care professional (HCP)," and "Not confident in themselves or their HCP." Clusters showed clear variability in attributes such as degree of confidence in managing their asthma, use of reliever and preventer medication, and level of asthma control. The 5 patient clusters identified in this analysis displayed distinctly different personal attitudes that would require different approaches in the consultation room certainly for asthma but probably also for other chronic diseases. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Biased phylodynamic inferences from analysing clusters of viral sequences
Xiang, Fei; Frost, Simon D. W.
2017-01-01
Abstract Phylogenetic methods are being increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, which appear to follow a ‘power law’ behaviour, with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic, and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population, and is not being driven by non-epidemiological effects. We qualify the effect of using a falsely identified ‘transmission cluster’ of sequences to estimate phylodynamic parameters including the effective population size and exponential growth rate under several demographic scenarios. Our simulation studies show that taking the maximum size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources. PMID:28852573
Ivors, K; Garbelotto, M; Vries, I D E; Ruyter-Spira, C; Te Hekkert, B; Rosenzweig, N; Bonants, P
2006-05-01
Analysis of 12 polymorphic simple sequence repeats identified in the genome sequence of Phytophthora ramorum, causal agent of 'sudden oak death', revealed genotypic diversity to be significantly higher in nurseries (91% of total) than in forests (18% of total). Our analysis identified only two closely related genotypes in US forests, while the genetic structure of populations from European nurseries was of intermediate complexity, including multiple, closely related genotypes. Multilocus analysis determined populations in US forests reproduce clonally and are likely descendants of a single introduced individual. The 151 isolates analysed clustered in three clades. US forest and European nursery isolates clustered into two distinct clades, while one isolate from a US nursery belonged to a third novel clade. The combined microsatellite, sequencing and morphological analyses suggest the three clades represent distinct evolutionary lineages. All three clades were identified in some US nurseries, emphasizing the role of commercial plant trade in the movement of this pathogen.
Katayama, K; Sato, T; Arai, T; Amao, H; Ohta, Y; Ozawa, T; Kenyon, P R; Hickson, R E; Tazaki, H
2013-02-01
Simple liquid chromatography-mass spectrometry (LC-MS) was applied to non-targeted metabolic analyses to discover new metabolic markers in animal plasma. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were used to analyse LC-MS multivariate data. PCA clearly generated two separate clusters for artificially induced diabetic mice and healthy control mice. PLS-DA of time-course changes in plasma metabolites of chicks after feeding generated three clusters (pre- and immediately after feeding, 0.5-3 h after feeding and 4 h after feeding). Two separate clusters were also generated for plasma metabolites of pregnant Angus heifers with differing live-weight change profiles (gaining or losing). The accompanying PLS-DA loading plot detailed the metabolites that contribute the most to the cluster separation. In each case, the same highly hydrophilic metabolite was strongly correlated to the group separation. The metabolite was identified as betaine by LC-MS/MS. This result indicates that betaine and its metabolic precursor, choline, may be useful biomarkers to evaluate the nutritional and metabolic status of animals. © 2011 Blackwell Verlag GmbH.
Identifying a gene expression signature of cluster headache in blood
Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.
2017-01-01
Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859
A simple algorithm for the identification of clinical COPD phenotypes.
Burgel, Pierre-Régis; Paillasseur, Jean-Louis; Janssens, Wim; Piquet, Jacques; Ter Riet, Gerben; Garcia-Aymerich, Judith; Cosio, Borja; Bakke, Per; Puhan, Milo A; Langhammer, Arnulf; Alfageme, Inmaculada; Almagro, Pere; Ancochea, Julio; Celli, Bartolome R; Casanova, Ciro; de-Torres, Juan P; Decramer, Marc; Echazarreta, Andrés; Esteban, Cristobal; Gomez Punter, Rosa Mar; Han, MeiLan K; Johannessen, Ane; Kaiser, Bernhard; Lamprecht, Bernd; Lange, Peter; Leivseth, Linda; Marin, Jose M; Martin, Francis; Martinez-Camblor, Pablo; Miravitlles, Marc; Oga, Toru; Sofia Ramírez, Ana; Sin, Don D; Sobradillo, Patricia; Soler-Cataluña, Juan J; Turner, Alice M; Verdu Rivera, Francisco Javier; Soriano, Joan B; Roche, Nicolas
2017-11-01
This study aimed to identify simple rules for allocating chronic obstructive pulmonary disease (COPD) patients to clinical phenotypes identified by cluster analyses. Data from 2409 COPD patients of French/Belgian COPD cohorts were analysed using cluster analysis resulting in the identification of subgroups, for which clinical relevance was determined by comparing 3-year all-cause mortality. Classification and regression trees (CARTs) were used to develop an algorithm for allocating patients to these subgroups. This algorithm was tested in 3651 patients from the COPD Cohorts Collaborative International Assessment (3CIA) initiative. Cluster analysis identified five subgroups of COPD patients with different clinical characteristics (especially regarding severity of respiratory disease and the presence of cardiovascular comorbidities and diabetes). The CART-based algorithm indicated that the variables relevant for patient grouping differed markedly between patients with isolated respiratory disease (FEV1, dyspnoea grade) and those with multi-morbidity (dyspnoea grade, age, FEV1 and body mass index). Application of this algorithm to the 3CIA cohorts confirmed that it identified subgroups of patients with different clinical characteristics, mortality rates (median, from 4% to 27%) and age at death (median, from 68 to 76 years). A simple algorithm, integrating respiratory characteristics and comorbidities, allowed the identification of clinically relevant COPD phenotypes. Copyright ©ERS 2017.
2011-01-01
Background Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. 
Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755
Childhood cancer in small geographical areas and proximity to air-polluting industries.
Ortega-García, Juan A; López-Hernández, Fernando A; Cárceles-Álvarez, Alberto; Fuster-Soler, José L; Sotomayor, Diana I; Ramis, Rebeca
2017-07-01
Pediatric cancer has been associated with exposure to certain environmental carcinogens. The purpose of this work is to analyse the relationship between environmental pollution and pediatric cancer risk. We analysed all incidences of pediatric cancer (<15) diagnosed in a Spanish region during the period 1998-2015. The place of residence of each patient and the exact geographical coordinates of main industrial facilities were codified in order to analyse the spatial distribution of cases of cancer in relation to industrial areas. Focal tests and focused Scan methodology were used for the identification of high-incidence-rate spatial clusters around the main industrial pollution foci. The crude rate for the period was 148.0 cases per 1,000,000 children. The incidence of pediatric cancer increased significantly along the period of study. With respect to spatial distribution, results showed significant high incidence around some industrial pollution foci groups and the Scan methodology identified spatial clustering. We observe a global major incidence of non Hodgkin lymphomas (NHL) considering all foci, and high incidence of Sympathetic Nervous System Tumour (SNST) around Energy and Electric and organic and inorganic chemical industries foci group. In the analysis foci to foci, the focused Scan test identifies several significant spatial clusters. Particularly, three significant clusters were identified: the first of SNST was around energy-generating chemical industries (2 cases versus the expected 0.26), another of NHL was around residue-valorisation plants (5 cases versus the expected 0.91) and finally one cluster of Hodgkin lymphoma around building materials (3 cases versus the expected 2.2). CONCLUSION: Results suggest a possible association between proximity to certain industries and pediatric cancer risk. More evidence is necessary before establishing the relationship between industrial pollution and pediatric cancer incidence. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J
2018-05-01
The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
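To make the idea of single-linkage clustering at SNP thresholds concrete, here is a minimal sketch using a union-find structure: two isolates end up in the same cluster whenever a chain of pairwise distances at or below the threshold connects them. The distance matrix and the function name are hypothetical and chosen purely for illustration; this is not the surveillance pipeline used in the study.

```python
def single_linkage_clusters(dist, threshold):
    """Group items whose pairwise distance chain stays at or below threshold."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical pairwise SNP distances for five isolates
snp = [
    [0, 0, 4, 9, 30],
    [0, 0, 4, 9, 30],
    [4, 4, 0, 7, 30],
    [9, 9, 7, 0, 30],
    [30, 30, 30, 30, 0],
]

for t in (10, 5, 0):
    print(t, single_linkage_clusters(snp, t))
```

Lowering the threshold from 10 to 5 to 0 progressively splits the larger cluster, which is why the choice of threshold affects how many isolates appear epidemiologically linked.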
Replicating cluster subtypes for the prevention of adolescent smoking and alcohol use.
Babbin, Steven F; Velicer, Wayne F; Paiva, Andrea L; Brick, Leslie Ann D; Redding, Colleen A
2015-01-01
Substance abuse interventions tailored to the individual level have produced effective outcomes for a wide variety of behaviors. One approach to enhancing tailoring involves using cluster analysis to identify prevention subtypes that represent different attitudes about substance use. This study applied this approach to better understand tailored interventions for smoking and alcohol prevention. Analyses were performed on a sample of sixth graders from 20 New England middle schools involved in a 36-month tailored intervention study. Most adolescents reported being in the Acquisition Precontemplation (aPC) stage at baseline: not smoking or not drinking and not planning to start in the next six months. For smoking (N=4059) and alcohol (N=3973), each sample was randomly split into five subsamples. Cluster analysis was performed within each subsample based on three variables: Pros and Cons (from Decisional Balance Scales), and Situational Temptations. Across all subsamples for both smoking and alcohol, the following four clusters were identified: (1) Most Protected (MP; low Pros, high Cons, low Temptations); (2) Ambivalent (AM; high Pros, average Cons and Temptations); (3) Risk Denial (RD; average Pros, low Cons, average Temptations); and (4) High Risk (HR; high Pros, low Cons, and very high Temptations). Finding the same four clusters within aPC for both smoking and alcohol, replicating the results across the five subsamples, and demonstrating hypothesized relations among the clusters with additional external validity analyses provide strong evidence of the robustness of these results. These clusters demonstrate evidence of validity and can provide a basis for tailoring interventions. Copyright © 2014. Published by Elsevier Ltd.
Replicating cluster subtypes for the prevention of adolescent smoking and alcohol use
Babbin, Steven F.; Velicer, Wayne F.; Paiva, Andrea L.; Brick, Leslie Ann D.; Redding, Colleen A.
2015-01-01
Introduction Substance abuse interventions tailored to the individual level have produced effective outcomes for a wide variety of behaviors. One approach to enhancing tailoring involves using cluster analysis to identify prevention subtypes that represent different attitudes about substance use. This study applied this approach to better understand tailored interventions for smoking and alcohol prevention. Methods Analyses were performed on a sample of sixth graders from 20 New England middle schools involved in a 36-month tailored intervention study. Most adolescents reported being in the Acquisition Precontemplation (aPC) stage at baseline: not smoking or not drinking and not planning to start in the next six months. For smoking (N= 4059) and alcohol (N= 3973), each sample was randomly split into five subsamples. Cluster analysis was performed within each subsample based on three variables: Pros and Cons (from Decisional Balance Scales), and Situational Temptations. Results Across all subsamples for both smoking and alcohol, the following four clusters were identified: (1) Most Protected (MP; low Pros, high Cons, low Temptations); (2) Ambivalent (AM; high Pros, average Cons and Temptations); (3) Risk Denial (RD; average Pros, low Cons, average Temptations); and (4) High Risk (HR; high Pros, low Cons, and very high Temptations). Conclusions Finding the same four clusters within aPC for both smoking and alcohol, replicating the results across the five subsamples, and demonstrating hypothesized relations among the clusters with additional external validity analyses provide strong evidence of the robustness of these results. These clusters demonstrate evidence of validity and can provide a basis for tailoring interventions. PMID:25222849
Suicide methods in children and adolescents.
Kõlves, Kairi; de Leo, Diego
2017-02-01
There are notable differences in suicide methods between countries. The aim of this paper is to analyse and describe suicide methods in children and adolescents aged 10-19 years in different countries/territories worldwide. Suicide data by ICD-10 X codes were obtained from the WHO Mortality Database and population data from the World Bank. In total, 101 countries or territories have data for at least 5 years in 2000-2009. Cluster analysis by suicide methods was performed for countries/territories with at least 10 suicide cases separately by gender (74 for males and 71 for females) in 2000-2009. The most frequent suicide method was hanging, followed by poisoning by pesticides for females and firearms for males. Cluster analyses of similarities in the country/territory level suicide method patterns by gender identified four clusters for both genders. Hanging and poisoning by pesticides defined the clusters of countries/territories by their suicide patterns in youth for both genders. In addition, a mixed method and a jumping from height cluster were identified for females and two mixed method clusters for males. A number of geographical similarities were observed. Overall, the patterns of suicide methods in children and adolescents reflect lethality, availability and acceptability of suicide means similarly to country specific patterns of all ages. Means restriction has very good potential in preventing youth suicides in different countries. It is also crucial to consider cognitive availability influenced by sensationalised media reporting and/or provision of technical details about specific methods.
Factors influencing the quality of life of haemodialysis patients according to symptom cluster.
Shim, Hye Yeung; Cho, Mi-Kyoung
2018-05-01
To identify the characteristics in each symptom cluster and factors influencing the quality of life of haemodialysis patients in Korea according to cluster. Despite developments in renal replacement therapy, haemodialysis still restricts the activities of daily living due to pain and impairs physical functioning induced by the disease and its complications. Descriptive survey. Two hundred and thirty dialysis patients aged >18 years. They completed self-administered questionnaires of Dialysis Symptom Index and Kidney Disease Quality of Life instrument-Short Form 1.3. To determine the optimal number of clusters, the collected data were analysed using polytomous variable latent class analysis in R software (poLCA) to estimate the latent class models and the latent class regression models for polytomous outcome variables. Differences in characteristics, symptoms and QOL according to the symptom cluster of haemodialysis patients were analysed using the independent t test and chi-square test. The factors influencing the QOL according to symptom cluster were identified using hierarchical multiple regression analysis. Physical and emotional symptoms were significantly more severe, and the QOL was significantly worse in Cluster 1 than in Cluster 2. The factors influencing the QOL were spouse, job, insurance type and physical and emotional symptoms in Cluster 1, with these variables having an explanatory power of 60.9%. Physical and emotional symptoms were the only influencing factors in Cluster 2, and they had an explanatory power of 37.4%. Mitigating the symptoms experienced by haemodialysis patients and improving their QOL require educational and therapeutic symptom management interventions that are tailored according to the characteristics and symptoms in each cluster. 
The findings of this study are expected to lead to practical guidelines for addressing the symptoms experienced by haemodialysis patients, and they provide basic information for developing nursing interventions to manage these symptoms and improve the QOL of these patients. © 2017 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann
2017-07-01
Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.
Cluster Randomised Trials in Cochrane Reviews: Evaluation of Methodological and Reporting Practice.
Richardson, Marty; Garner, Paul; Donegan, Sarah
2016-01-01
Systematic reviews can include cluster-randomised controlled trials (C-RCTs), which require different analysis compared with standard individual-randomised controlled trials. However, it is not known whether review authors follow the methodological and reporting guidance when including these trials. The aim of this study was to assess the methodological and reporting practice of Cochrane reviews that included C-RCTs against criteria developed from existing guidance. Criteria were developed, based on methodological literature and personal experience supervising review production and quality. Criteria were grouped into four themes: identifying, reporting, assessing risk of bias, and analysing C-RCTs. The Cochrane Database of Systematic Reviews was searched (2nd December 2013), and the 50 most recent reviews that included C-RCTs were retrieved. Each review was then assessed using the criteria. The 50 reviews we identified were published by 26 Cochrane Review Groups between June 2013 and November 2013. For identifying C-RCTs, only 56% identified that C-RCTs were eligible for inclusion in the review in the eligibility criteria. For reporting C-RCTs, only eight (24%) of the 33 reviews reported the method of cluster adjustment for their included C-RCTs. For assessing risk of bias, only one review assessed all five C-RCT-specific risk-of-bias criteria. For analysing C-RCTs, of the 27 reviews that presented unadjusted data, only nine (33%) provided a warning that confidence intervals may be artificially narrow. Of the 34 reviews that reported data from unadjusted C-RCTs, only 13 (38%) excluded the unadjusted results from the meta-analyses. The methodological and reporting practices in Cochrane reviews incorporating C-RCTs could be greatly improved, particularly with regard to analyses. Criteria developed as part of the current study could be used by review authors or editors to identify errors and improve the quality of published systematic reviews incorporating C-RCTs.
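The warning about artificially narrow confidence intervals can be illustrated with the standard design-effect approximation, 1 + (m - 1) × ICC, where m is the mean cluster size and ICC is the intracluster correlation coefficient. The sketch below is a generic illustration with made-up numbers, not the adjustment method of any particular review; the function names are hypothetical.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_participants, mean_cluster_size, icc):
    """Number of independent participants the clustered sample is worth."""
    return n_participants / design_effect(mean_cluster_size, icc)


def adjusted_standard_error(naive_se, mean_cluster_size, icc):
    """Inflate a standard error that was computed ignoring clustering."""
    return naive_se * math.sqrt(design_effect(mean_cluster_size, icc))


# Hypothetical trial: 400 participants, mean cluster size 21, ICC 0.05
deff = design_effect(21, 0.05)
n_eff = effective_sample_size(400, 21, 0.05)
se_adj = adjusted_standard_error(0.10, 21, 0.05)
```

With these illustrative inputs the design effect is about 2, so the trial carries roughly the information of half as many independent participants, and an unadjusted standard error understates the uncertainty by a factor of about the square root of 2.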
Jaakkola, Timo; Wang, C K John; Soini, Markus; Liukkonen, Jarmo
2015-09-01
The purpose of this study was to identify student clusters with homogenous profiles in perceptions of task- and ego-involving, autonomy, and social relatedness supporting motivational climate in school physical education. Additionally, we investigated whether different motivational climate groups differed in their enjoyment in PE. Participants of the study were 2 594 girls and 1 803 boys, aged 14-15 years. Students responded to questionnaires assessing their perception of motivational climate and enjoyment in physical education. Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that students in clusters 4 and 5 perceived the highest level of enjoyment whereas students in cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of various motivational climates created differential levels of enjoyment in PE classes. Key points: Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that clusters 4 and 5 perceived the highest level of enjoyment whereas cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of motivational climate create differential levels of enjoyment in PE classes.
Molsberry, Samantha A; Cheng, Yu; Kingsley, Lawrence; Jacobson, Lisa; Levine, Andrew J; Martin, Eileen; Miller, Eric N; Munro, Cynthia A; Ragin, Ann; Sacktor, Ned; Becker, James T
2018-05-11
Mild forms of HIV-associated neurocognitive disorder (HAND) remain prevalent in the combination anti-retroviral therapy (cART) era. This study's objective was to identify neuropsychological subgroups within the Multicenter AIDS Cohort Study (MACS) based on the participant-based latent structure of cognitive function and to identify factors associated with subgroups. The MACS is a four-site longitudinal study of the natural and treated history of HIV disease among gay and bisexual men. Using neuropsychological domain scores, we applied a cluster variable selection algorithm to identify the optimal subset of domains with cluster information. Latent profile analysis was applied using scores from identified domains. Exploratory and post-hoc analyses were conducted to identify factors associated with cluster membership and the drivers of the observed associations. Cluster variable selection identified all domains as containing cluster information except for Working Memory. A three-profile solution produced the best fit for the data. Profile 1 performed below average on all domains; Profile 2 performed average on executive functioning, motor, and speed and below average on learning and memory; Profile 3 performed at or above average across all domains. Several demographic, cognitive, and social factors were associated with profile membership; these associations were driven by differences between Profile 1 and the other profiles. There is an identifiable pattern of neuropsychological performance among MACS members determined by all domains except Working Memory. Neither HIV nor HIV-related biomarkers were related to cluster membership, consistent with other findings that cognitive performance patterns do not map directly onto HIV serostatus.
Sasidharan, Lekshmi; Wu, Kun-Feng; Menendez, Monica
2015-12-01
One of the major challenges in traffic safety analyses is the heterogeneous nature of safety data, due to the sundry factors involved in it. This heterogeneity often leads to difficulties in interpreting results and conclusions due to unrevealed relationships. Understanding the underlying relationship between injury severities and influential factors is critical for the selection of appropriate safety countermeasures. A method commonly employed to address systematic heterogeneity is to focus on a subgroup of data chosen according to the research purpose. However, this does not necessarily ensure homogeneity in the data. In this paper, latent class cluster analysis is applied to identify homogeneous subgroups for a specific crash type: pedestrian crashes. The manuscript employs data from police-reported pedestrian crashes (2009-2012) in Switzerland. The analyses demonstrate that dividing pedestrian severity data into seven clusters helps to reduce the systematic heterogeneity of the data and to understand the hidden relationships between crash severity levels and socio-demographic, environmental, vehicle, temporal, and traffic factors, and the main reason for the crash. The pedestrian crash injury severity models were developed for the whole data set and for individual clusters, and were compared using receiver operating characteristic curves; the results favored clustering. Overall, the study suggests that the latent class clustered regression approach is suitable for reducing heterogeneity and revealing important hidden relationships in traffic safety analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
Leung, Tommy W C; Mak, Darwin; Wong, K H; Wang, Y; Song, Y H; Tsang, D N C; Wong, C; Shao, Y M; Lim, W L
2008-07-01
We conducted a molecular epidemiological study on newly diagnosed human immunodeficiency virus type 1 (HIV-1)-infected patients in Hong Kong to identify the epidemiological linkage of HIV-1 infection in the locality. Reverse transcription polymerase chain reaction (RT-PCR) for HIV-1 was performed on newly diagnosed HIV-1-positive sera collected from January 2002 to December 2006. PCR products corresponding to the env C2V3V4 region and the gag p17/p24 junction of the HIV-1 genome were nucleotide sequenced. Phylogenetic analyses performed on the acquired nucleotide sequences revealed that CRF01_AE and subtype B were the two dominant HIV-1 subtypes. Analyses also demonstrated the presence of three emerging HIV-1 clusters among the subtype B sequences in Hong Kong. Each cluster possesses a unique cluster-specific amino acid signature for identification. Data show that one of the clusters (Cluster I) is rapidly expanding. In addition to the unique cluster-specific amino acid signature, the majority of sequences in Cluster I harbor a 6-amino acid insertion at the gag p17/p24 junction in a region that is thought to be closely associated with HIV-1 infectivity.
Machine-learned cluster identification in high-dimensional data.
Ultsch, Alfred; Lötsch, Jörn
2017-02-01
High-dimensional biomedical data are frequently clustered to identify subgroup structures pointing at distinct disease subtypes. It is crucial that the clustering algorithm used works correctly. However, by imposing a predefined shape on the clusters, classical algorithms occasionally suggest a cluster structure in homogeneously distributed data or assign data points to incorrect clusters. We analyzed whether this can be avoided by using emergent self-organizing feature maps (ESOM). Data sets with different degrees of complexity were submitted to ESOM analysis with large numbers of neurons, using an interactive R-based bioinformatics tool. On top of the trained ESOM, the distance structure in the high-dimensional feature space was visualized in the form of a so-called U-matrix. Clustering results were compared with those provided by classical common cluster algorithms including single linkage, Ward and k-means. Ward clustering imposed cluster structures on cluster-less "golf ball", "cuboid" and "S-shaped" data sets that contained no structure at all (random data). Ward clustering also imposed structures on permuted real-world data sets. By contrast, the ESOM/U-matrix approach correctly found that these data contain no cluster structure. At the same time, ESOM/U-matrix correctly identified clusters in biomedical data that truly contained subgroups. It was always correct in cluster structure identification in further canonical artificial data. Using intentionally simple data sets, it is shown that popular clustering algorithms typically used for biomedical data sets may fail to cluster data correctly, suggesting that they are also likely to perform erroneously on high-dimensional biomedical data. The present analyses emphasized that generally established classical hierarchical clustering algorithms carry a considerable tendency to produce erroneous results. 
By contrast, unsupervised machine-learned analysis of cluster structures, applied using the ESOM/U-matrix method, is a viable, unbiased method to identify true clusters in the high-dimensional space of complex data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Evolution of coding and non-coding genes in HOX clusters of a marsupial.
Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B
2012-06-18
The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and of non-protein coding genes, including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.
Evolution of coding and non-coding genes in HOX clusters of a marsupial
2012-01-01
Background: The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results: Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and of non-protein coding genes, including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions: This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672
Dennis, Ann M; Murillo, Wendy; de Maria Hernandez, Flor; Guardado, Maria Elena; Nieto, Ana Isabel; Lorenzana de Rivera, Ivette; Eron, Joseph J; Paz-Bailey, Gabriela
2013-05-01
HIV in Central America is concentrated among certain groups such as men who have sex with men (MSM) and female sex workers (FSWs). We compared social recruitment chains and HIV transmission clusters from 699 MSM and 787 FSWs to better understand factors contributing to ongoing HIV transmission in El Salvador. Phylogenies were reconstructed using pol sequences from 119 HIV-positive individuals recruited by respondent-driven sampling (RDS) and compared with RDS chains in 3 cities in El Salvador. Transmission clusters with a mean pairwise genetic distance ≤ 0.015 and Bayesian posterior probabilities =1 were identified. Factors associated with cluster membership were evaluated among MSM. Sequences from 34 (43%) MSM and 4 (10%) FSW grouped in 14 transmission clusters. Clusters were defined by risk group (12 MSM clusters) and geographic residence (only 1 spanned separate cities). In 4 MSM clusters (all n = 2), individuals were also members of the same RDS chain, but only 2 had members directly linked through recruitment. All large clusters (n ≥ 3) spanned >1 RDS chain. Among MSM, factors independently associated with cluster membership included recent infection by BED assay (P = 0.02), sex with stable male partners (P = 0.02), and sex with ≥ 3 male partners in the past year (P = 0.04). We found few HIV transmissions corresponding directly with the social recruitment. However, we identified clustering in nearly one-half of MSM suggesting that RDS recruitment was indirectly but successfully uncovering transmission networks, particularly among recent infections. Interrogating RDS chains with phylogenetic analyses may help refine methods for identifying transmission clusters.
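The distance criterion mentioned in the abstract above (mean pairwise genetic distance ≤ 0.015) can be illustrated with a minimal sketch. It makes several assumptions that are not in the source: distances are computed as simple p-distances (proportion of differing sites) on pre-aligned sequences, the sequences are invented toy strings, and the function names are hypothetical. The study itself combined the distance threshold with phylogenetic reconstruction and Bayesian posterior probabilities, which this sketch does not attempt.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def mean_pairwise_distance(sequences):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def meets_distance_criterion(sequences, threshold=0.015):
    """True if the group's mean pairwise distance is within the threshold."""
    return mean_pairwise_distance(sequences) <= threshold


def substitute(seq, positions):
    """Toy helper: return a copy of seq with a different base at each position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for pos in positions:
        chars[pos] = swap[chars[pos]]
    return "".join(chars)


# Invented data: 200 aligned sites.
reference = "ACGT" * 50
close = substitute(reference, [10, 130])             # 2 differences from reference
distant = substitute(reference, range(0, 200, 20))   # 10 differences from reference

print(meets_distance_criterion([reference, close]))            # True
print(meets_distance_criterion([reference, close, distant]))   # False
```

In practice a substitution model other than the raw p-distance would likely be used; the threshold logic stays the same.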
Jackson, Ben; Gucciardi, Daniel F; Dimmock, James A
2011-06-01
Recent studies of coach-athlete interaction have explored the bivariate relationships between each of the tripartite efficacy constructs (self-efficacy; other-efficacy; relation-inferred self-efficacy, or RISE) and various indicators of relationship quality. This investigation adopted an alternative approach by using cluster analyses to identify tripartite efficacy profiles within a sample of 377 individual sport athletes (Mage = 20.25, SD = 2.12), and examined how individuals in each cluster group differed in their perceptions about their relationship with their coach (i.e., commitment, satisfaction, conflict). Four clusters emerged: High (n = 128), Moderate (n = 95), and Low (n = 78) profiles, in which athletes reported relatively high, moderate, or low scores across all tripartite perceptions, respectively, as well as an Unfulfilled profile (n = 76) in which athletes held relatively high self-efficacy, but perceived lower levels of other-efficacy and RISE. Multivariate analyses revealed differences between the clusters on all relationship variables that were in line with theory. These results underscore the utility of considering synergistic issues in the examination of the tripartite efficacy framework.
Brain structure and function correlates of cognitive subtypes in schizophrenia.
Geisler, Daniel; Walton, Esther; Naylor, Melissa; Roessner, Veit; Lim, Kelvin O; Charles Schulz, S; Gollub, Randy L; Calhoun, Vince D; Sponheim, Scott R; Ehrlich, Stefan
2015-10-30
Stable neuropsychological deficits may provide a reliable basis for identifying etiological subtypes of schizophrenia. The aim of this study was to identify clusters of individuals with schizophrenia based on dimensions of neuropsychological performance, and to characterize their neural correlates. We acquired neuropsychological data as well as structural and functional magnetic resonance imaging from 129 patients with schizophrenia and 165 healthy controls. We derived eight cognitive dimensions and subsequently applied a cluster analysis to identify possible schizophrenia subtypes. Analyses suggested the following four cognitive clusters of schizophrenia: (1) Diminished Verbal Fluency, (2) Diminished Verbal Memory and Poor Motor Control, (3) Diminished Face Memory and Slowed Processing, and (4) Diminished Intellectual Function. The clusters were characterized by a specific pattern of structural brain changes in areas such as Wernicke's area, lingual gyrus and occipital face area, and hippocampus as well as differences in working memory-elicited neural activity in several fronto-parietal brain regions. Separable measures of cognitive function appear to provide a method for deriving cognitive subtypes meaningfully related to brain structure and function. Because the present study identified brain-based neural correlates of the cognitive clusters, the proposed groups of individuals with schizophrenia have some external validity. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu
2017-10-01
We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-) colonial scientific reports, PhD theses and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province, with a location recorded for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasting analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis to whole-rock major element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at a distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis on integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. 
This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.
Hu, Valerie W.; Steinberg, Mara E.
2009-01-01
Heterogeneity in phenotypic presentation of ASD has been cited as one explanation for the difficulty in pinpointing specific genes involved in autism. Recent studies have attempted to reduce the “noise” in genetic and other biological data by reducing the phenotypic heterogeneity of the sample population. The current study employs multiple clustering algorithms on 123 item scores from the Autism Diagnostic Interview-Revised (ADI-R) diagnostic instrument of nearly 2000 autistic individuals to identify subgroups of autistic probands with clinically relevant behavioral phenotypes in order to isolate more homogeneous groups of subjects for gene expression analyses. Our combined cluster analyses suggest optimal division of the autistic probands into 4 phenotypic clusters based on similarity of symptom severity across the 123 selected item scores. One cluster is characterized by severe language deficits, while another exhibits milder symptoms across the domains. A third group possesses a higher frequency of savant skills while the fourth group exhibited intermediate severity across all domains. Grouping autistic individuals by multivariate cluster analysis of ADI-R scores reveals meaningful phenotypes of subgroups within the autistic spectrum which we show, in a related (accompanying) study, to be associated with distinct gene expression profiles. PMID:19455643
Clusters of midlife women by physical activity and their racial/ethnic differences.
Im, Eun-Ok; Ko, Young; Chee, Eunice; Chee, Wonshik; Mao, Jun James
2017-04-01
The purpose of this study was to identify clusters of midlife women by physical activity and to determine racial/ethnic differences in physical activities in each cluster. This was a secondary analysis of the data from 542 women (157 non-Hispanic [NH] Whites, 127 Hispanics, 135 NH African Americans, and 123 NH Asian) in a larger Internet study on midlife women's attitudes toward physical activity. The instruments included the Barriers to Health Activities Scale, the Physical Activity Assessment Inventory, the Questions on Attitudes toward Physical Activity, Subjective Norm, Perceived Behavioral Control, and Behavioral Intention, and the Kaiser Physical Activity Survey. The data were analyzed using hierarchical cluster analyses, analysis of variance, and multinomial logistic analyses. A three-cluster solution was adopted: cluster 1 (high active living and sports/exercise activity group; 48%), cluster 2 (high household/caregiving and occupational activity group; 27%), and cluster 3 (low active living and sports/exercise activity group; 26%). There were significant racial/ethnic differences in occupational activities of clusters 1 and 3 (all P < 0.01). Compared with cluster 1, cluster 2 tended to have lower family income, less access to health care, higher unemployment, higher perceived barriers scores, and lower social influences scores (all P < 0.01). Compared with cluster 1, cluster 3 tended to have greater obesity, less access to health care, higher perceived barriers scores, more negative attitudes toward physical activity, and lower self-efficacy scores (all P < 0.01). Midlife women's unique patterns of physical activity and their associated factors need to be considered in future intervention development.
Clusters of Midlife Women by Physical Activity and Their Racial/Ethnic Differences
Im, Eun-Ok; Ko, Young; Chee, Eunice; Chee, Wonshik; Mao, Jun James
2016-01-01
Objective: The purpose of this study was to identify clusters of midlife women by physical activity and to determine racial/ethnic differences in physical activities in each cluster. Methods: This was a secondary analysis of the data from 542 women (157 Non-Hispanic [NH] Whites, 127 Hispanics, 135 NH African Americans, and 123 NH Asian) in a larger Internet study on midlife women’s attitudes toward physical activity. The instruments included the Barriers to Health Activities Scale, the Physical Activity Assessment Inventory, the Questions on Attitudes toward Physical Activity, Subjective Norm, Perceived Behavioral Control, and Behavioral Intention, and the Kaiser Physical Activity Survey. The data were analyzed using hierarchical cluster analyses, ANOVA, and multinomial logistic analyses. Results: A three-cluster solution was adopted: Cluster 1 (high active living and sports/exercise activity group; 48%), Cluster 2 (high household/caregiving and occupational activity group; 27%), and Cluster 3 (low active living and sports/exercise activity group; 26%). There were significant racial/ethnic differences in occupational activities of Clusters 1 and 3 (all p<.01). Compared with Cluster 1, Cluster 2 tended to have lower family income, less access to health care, higher unemployment, higher perceived barriers scores, and lower social influences scores (all p<.01). Compared with Cluster 1, Cluster 3 tended to have greater obesity, less access to health care, higher perceived barriers scores, more negative attitudes toward physical activity, and lower self-efficacy scores (all p<.01). Conclusions: Midlife women’s unique patterns of physical activity and their associated factors need to be considered in future intervention development. PMID:27846052
DOE Office of Scientific and Technical Information (OSTI.GOV)
N Liu; P Yu
2011-12-31
The objective of this study was to use molecular spectral analyses with the diffuse reflectance Fourier transform infrared spectroscopy (DRIFT) bioanalytical technique to study carbohydrate conformation features, molecular clustering and interrelationships in hull and seed among six barley cultivars (AC Metcalfe, CDC Dolly, McLeod, CDC Helgason, CDC Trey, CDC Cowboy), which had different degradation kinetics in rumen. The molecular structure spectral analyses in both hull and seed involved the fingerprint regions of ca. 1536-1484 cm⁻¹ (attributed mainly to aromatic lignin semicircle ring stretch), ca. 1293-1212 cm⁻¹ (attributed mainly to cellulosic compounds in the hull), ca. 1269-1217 cm⁻¹ (attributed mainly to cellulosic compounds in the seeds), and ca. 1180-800 cm⁻¹ (attributed mainly to total CHO C-O stretching vibrations), together with agglomerative hierarchical cluster analysis (AHCA) and principal component analysis (PCA) of the spectra. The results showed that the DRIFT technique plus AHCA and PCA molecular analyses were able to reveal carbohydrate conformation features and identify carbohydrate molecular structure differences in both hull and seeds among the barley varieties. The carbohydrate molecular spectral analyses at the region of ca. 1185-800 cm⁻¹ together with the AHCA and PCA were able to show that the barley seed inherent structures exhibited distinguishable differences among the barley varieties. CDC Helgason differed from AC Metcalfe, McLeod, CDC Cowboy and CDC Dolly in carbohydrate conformation in the seed. Clear molecular cluster classes could be distinguished and identified in AHCA analysis and the separate ellipses could be grouped in PCA analysis. But CDC Helgason showed no distinguishable differences from CDC Trey in carbohydrate conformation. These carbohydrate conformation/structure differences could partially explain why the varieties differed in digestive behaviors in animals. 
The molecular spectroscopy technique used in this study could also be used for other plant-based feed and food structure studies.
Gender differences in psychiatric disorders and clusters of self-esteem among detained adolescents.
Van Damme, Lore; Colins, Olivier F; Vanderplasschen, Wouter
2014-12-30
Detained minors display substantial mental health needs. This study focused on two features (psychopathology and self-esteem) that have received considerable attention in the literature and clinical work, but have rarely been studied simultaneously in detained youths. The aims of this study were to examine gender differences in psychiatric disorders and clusters of self-esteem, and to test the hypothesis that the cluster of adolescents with lower (versus higher) levels of self-esteem have higher rates of psychiatric disorders. The prevalence of psychiatric disorders was assessed in 440 Belgian, detained adolescents using the Diagnostic Interview Schedule for Children-IV. Self-esteem was assessed using the Self-perception Profile for Adolescents. Model-based cluster analyses were performed to identify youths with lower and/or higher levels of self-esteem across several domains. Girls have higher rates for most psychiatric disorders and lower levels of self-esteem than boys. A higher number of clusters was identified in boys (four) than girls (three). Generally, the cluster of adolescents with lower (versus higher) levels of self-esteem had a higher prevalence of psychiatric disorders. These results suggest that the detection of low levels of self-esteem in adolescents, especially girls, might help clinicians to identify a subgroup of detained adolescents with the highest prevalence of psychopathology.
Dumuid, Dorothea; Olds, T; Lewis, L K; Martin-Fernández, J A; Barreira, T; Broyles, S; Chaput, J-P; Fogelholm, M; Hu, G; Kuriyan, R; Kurpad, A; Lambert, E V; Maia, J; Matsudo, V; Onywera, V O; Sarmiento, O L; Standage, M; Tremblay, M S; Tudor-Locke, C; Zhao, P; Katzmarzyk, P; Gillison, F; Maher, C
2018-02-01
The relationship between children's adiposity and lifestyle behaviour patterns is an area of growing interest. The objectives of this study are to identify clusters of children based on lifestyle behaviours and compare children's adiposity among clusters. Cross-sectional data from the International Study of Childhood Obesity, Lifestyle and the Environment were used. The participants were children (9-11 years) from 12 nations (n = 5710). 24-h accelerometry and self-reported diet and screen time were clustering input variables. Objectively measured adiposity indicators were waist-to-height ratio, percent body fat and body mass index z-scores. Sex-stratified analyses were performed on the global sample and repeated on a site-wise basis. Cluster analysis (using isometric log ratios for compositional data) was used to identify common lifestyle behaviour patterns. Site representation and adiposity were compared across clusters using linear models. Four clusters emerged: (1) Junk Food Screenies, (2) Actives, (3) Sitters and (4) All-Rounders. Countries were represented differently among clusters. Chinese children were over-represented in Sitters and Colombian children in Actives. Adiposity varied across clusters, being highest in Sitters and lowest in Actives. Children from different sites clustered into groups of similar lifestyle behaviours. Cluster membership was linked with differing adiposity. Findings support the implementation of activity interventions in all countries, targeting both physical activity and sedentary time. © 2016 World Obesity Federation.
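The abstract above mentions isometric log ratios for compositional data such as 24-h time use. As a hedged illustration only, the sketch below computes one common variant, pivot coordinates, for a single composition. The choice of pivot coordinates, the function name `ilr`, and the example minutes are assumptions; the abstract does not say which ilr basis the authors used.

```python
import math


def ilr(composition):
    """Isometric log-ratio transform (pivot coordinates) of a D-part composition.

    Coordinate i contrasts part i with the geometric mean of the parts after it:
        z_i = sqrt(r / (r + 1)) * ln(x_i / gmean(x_{i+1}, ..., x_D)),
    where r is the number of parts after part i. Returns D - 1 coordinates.
    """
    parts = [float(p) for p in composition]
    if len(parts) < 2:
        raise ValueError("a composition needs at least two parts")
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    d = len(parts)
    coords = []
    for i in range(d - 1):
        remaining = d - i - 1                       # parts after part i
        mean_log_rest = sum(logs[i + 1:]) / remaining
        scale = math.sqrt(remaining / (remaining + 1))
        coords.append(scale * (logs[i] - mean_log_rest))
    return coords


# Hypothetical 24-h day in minutes: sleep, moderate-to-vigorous activity,
# light activity, sedentary time (values invented for illustration).
day_minutes = [540, 60, 300, 540]
day_shares = [m / sum(day_minutes) for m in day_minutes]

# The transform depends only on ratios, so minutes and shares agree.
print(ilr(day_minutes))
print(ilr(day_shares))
```

The resulting D - 1 coordinates are unconstrained real numbers, so standard distance-based clustering can be applied to them, which raw proportions summing to one do not permit cleanly.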
Text-mining analysis of mHealth research.
Ozaydin, Bunyamin; Zengul, Ferhat; Oner, Nurettin; Delen, Dursun
2017-01-01
In recent years, because of the advancements in communication and networking technologies, mobile technologies have been developing at an unprecedented rate. mHealth, the use of mobile technologies in medicine, and the related research have also surged in parallel with these technological advancements. Although there have been several attempts to review mHealth research through manual processes such as systematic reviews, the sheer magnitude of the number of studies published in recent years makes this task very challenging. The most recent developments in machine learning and text mining offer some potential solutions to address this challenge by allowing analyses of large volumes of texts through semi-automated processes. The objective of this study is to analyze the evolution of mHealth research by utilizing text-mining and natural language processing (NLP) analyses. The study sample included abstracts of 5,644 mHealth research articles, which were gathered from five academic search engines by using search terms such as mobile health and mHealth. The analysis used the Text Explorer module of JMP Pro 13 and an iterative semi-automated process involving tokenizing, phrasing, and terming. After developing the document term matrix (DTM), analyses such as singular value decomposition (SVD), topic, and hierarchical document clustering were performed, along with the topic-informed document clustering approach. The results were presented in the form of word-clouds and trend analyses. There were several major findings regarding research clusters and trends. First, our results confirmed the time-dependent nature of terminology use in mHealth research. For example, in earlier versus recent years the use of terminology changed from "mobile phone" to "smartphone" and from "applications" to "apps". 
Second, ten clusters for mHealth research were identified including (I) Clinical Research on Lifestyle Management, (II) Community Health, (III) Literature Review, (IV) Medical Interventions, (V) Research Design, (VI) Infrastructure, (VII) Applications, (VIII) Research and Innovation in Health Technologies, (IX) Sensor-based Devices and Measurement Algorithms, (X) Survey-based Research. Third, the trend analyses indicated the infrastructure cluster as the highest percentage researched area until 2014. The Research and Innovation in Health Technologies cluster experienced the largest increase in numbers of publications in recent years, especially after 2014. This study is unique because it is the only known study utilizing text-mining analyses to reveal the streams and trends for mHealth research. The fast growth in mobile technologies is expected to lead to higher numbers of studies focusing on mHealth and its implications for various healthcare outcomes. Findings of this study can be utilized by researchers in identifying areas for future studies.
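The document term matrix (DTM) step described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the study used the Text Explorer module of JMP Pro 13, whereas the tokenizer, the function names, and the two toy sentences below are hypothetical, and no phrasing, terming, or decomposition step is attempted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts how often
    vocabulary[j] occurs in documents[i]."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[term] for term in vocabulary] for doc_counts in counts]
    return vocabulary, matrix


# Invented toy "abstracts", echoing the terminology shift noted above.
docs = [
    "Mobile phone apps for health",
    "Smartphone apps and mobile health",
]
vocabulary, matrix = document_term_matrix(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row is one document and each column one term; decompositions and document clustering of the kind the abstract lists would then operate on this matrix.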
Text-mining analysis of mHealth research
Zengul, Ferhat; Oner, Nurettin; Delen, Dursun
2017-01-01
In recent years, because of advancements in communication and networking technologies, mobile technologies have been developing at an unprecedented rate. mHealth, the use of mobile technologies in medicine, and the related research have also surged in parallel with these technological advancements. Although there have been several attempts to review mHealth research through manual processes such as systematic reviews, the sheer number of studies published in recent years makes this task very challenging. Recent developments in machine learning and text mining offer potential solutions to this challenge by allowing analyses of large volumes of text through semi-automated processes. The objective of this study is to analyze the evolution of mHealth research by utilizing text-mining and natural language processing (NLP) analyses. The study sample included abstracts of 5,644 mHealth research articles, which were gathered from five academic search engines by using search terms such as mobile health and mHealth. The analysis used the Text Explorer module of JMP Pro 13 and an iterative semi-automated process involving tokenizing, phrasing, and terming. After developing the document term matrix (DTM), analyses such as singular value decomposition (SVD), topic analysis, and hierarchical document clustering were performed, along with the topic-informed document clustering approach. The results were presented in the form of word-clouds and trend analyses. There were several major findings regarding research clusters and trends. First, our results confirmed the time-dependent nature of terminology use in mHealth research. For example, between earlier and recent years the terminology shifted from “mobile phone” to “smartphone” and from “applications” to “apps”. 
Second, ten clusters for mHealth research were identified including (I) Clinical Research on Lifestyle Management, (II) Community Health, (III) Literature Review, (IV) Medical Interventions, (V) Research Design, (VI) Infrastructure, (VII) Applications, (VIII) Research and Innovation in Health Technologies, (IX) Sensor-based Devices and Measurement Algorithms, (X) Survey-based Research. Third, the trend analyses indicated the infrastructure cluster as the highest percentage researched area until 2014. The Research and Innovation in Health Technologies cluster experienced the largest increase in numbers of publications in recent years, especially after 2014. This study is unique because it is the only known study utilizing text-mining analyses to reveal the streams and trends for mHealth research. The fast growth in mobile technologies is expected to lead to higher numbers of studies focusing on mHealth and its implications for various healthcare outcomes. Findings of this study can be utilized by researchers in identifying areas for future studies. PMID:29430456
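The tokenizing, phrasing, and document term matrix steps described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the study used the Text Explorer module of JMP Pro 13, not this code, and the mini-corpus, the `PHRASES` table, and all function names below are hypothetical. SVD and clustering are left out.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: (publication year, abstract text).
docs = [
    (2009, "Mobile phone applications for diabetes self-management"),
    (2010, "A mobile phone survey of community health workers"),
    (2015, "Smartphone apps for lifestyle management"),
    (2016, "Sensor-based smartphone apps and measurement algorithms"),
]

# Multi-word phrases kept as single terms (a stand-in for the "phrasing" step).
PHRASES = {"mobile phone": "mobile_phone"}


def tokenize(text):
    """Lower-case the text, merge known phrases, and split into word tokens."""
    text = text.lower()
    for phrase, merged in PHRASES.items():
        text = text.replace(phrase, merged)
    return re.findall(r"[a-z_]+(?:-[a-z]+)*", text)


def build_dtm(texts):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(t)) for t in texts]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def term_share_by_year(docs, term):
    """Fraction of documents in each year that contain the given term."""
    total, hits = Counter(), Counter()
    for year, text in docs:
        total[year] += 1
        hits[year] += term in tokenize(text)
    return {year: hits[year] / total[year] for year in sorted(total)}


vocab, dtm = build_dtm([text for _, text in docs])
print(len(dtm), "documents x", len(vocab), "terms")
print(term_share_by_year(docs, "smartphone"))
```

On a real corpus the matrix would normally be stored in sparse form and weighted (for example with TF-IDF) before any decomposition; that choice is not specified in the abstract.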
Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar
2010-01-01
Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211
Within-Group Differences in Sexual Orientation and Identity
ERIC Educational Resources Information Center
Worthington, Roger L.; Reynolds, Amy L.
2009-01-01
The purpose of this investigation was to examine within-group differences among self-identified sexual orientation and identity groups. To understand these within-group differences, 2 types of analysis were conducted. First, a sample of 2,732 participants completed the Sexual Orientation and Identity Scale. Cluster analyses were used to identify 3…
Wildfire cluster detection using space-time scan statistics
NASA Astrophysics Data System (ADS)
Tonini, M.; Tuia, D.; Ratle, F.; Kanevski, M.
2009-04-01
The aim of the present study is to identify spatio-temporal clusters of fire sequences using space-time scan statistics. These statistical methods are specifically designed to detect clusters and assess their significance. Scan statistics work by comparing the set of events occurring inside a scanning window (or a space-time cylinder for spatio-temporal data) with those that lie outside. Windows of increasing size scan the zone across space and time. A likelihood ratio is calculated for each window (comparing the ratio "observed cases over expected" inside and outside); the window with the maximum value is taken as the most likely cluster, the next highest as the second most likely, and so on. Under the null hypothesis of spatial and temporal randomness, the events are distributed according to a known discrete-state random process (Poisson or Bernoulli) whose parameters can be estimated. Given this assumption, it is possible to test whether or not the null hypothesis holds in a specific area. To deal with fire data, the space-time permutation scan statistic was applied, since it does not require explicit specification of the population at risk in each cylinder. The case study is daily fire detection in Florida using the Moderate Resolution Imaging Spectroradiometer (MODIS) active fire product during the period 2003-2006. As a result, statistically significant clusters were identified. When the analyses were performed over the entire period, three of the five most likely clusters were identified in the forest areas in the north of the state; the other two clusters cover a large zone in the south, corresponding to agricultural land and the prairies in the Everglades. Furthermore, the analyses were performed separately for each of the four years to determine whether the wildfires recur each year during the same period. 
Clusters of forest fires turn out to be more frequent in the hot seasons (spring and summer), while in the southern areas they are present throughout the year. Analysing the distribution of fires to evaluate whether they are statistically more frequent in some areas and/or in some periods of the year can help support fire management and focus prevention measures.
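The window comparison described above can be sketched as follows. The sketch assumes the usual Poisson-form log likelihood ratio and the space-time permutation expectation (zone total times time total, divided by the grand total). The grid of counts, the restriction to contiguous zones along a line, and all function names are hypothetical simplifications, and the Monte Carlo significance test is omitted.

```python
import math


def poisson_llr(observed, expected, total):
    """Log likelihood ratio for one window; 0 unless observed exceeds expected."""
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    rest = total - observed
    outside = rest * math.log(rest / (total - expected)) if rest > 0 else 0.0
    return inside + outside


def expected_in_cylinder(counts, zones, days):
    """Space-time permutation expectation: zone total * day total / grand total."""
    total = sum(map(sum, counts))
    zone_total = sum(sum(counts[z]) for z in zones)
    day_total = sum(row[d] for row in counts for d in days)
    return zone_total * day_total / total


def most_likely_cluster(counts, max_zones=2, max_days=2):
    """Scan contiguous zone and day ranges; return (llr, zones, days) with max LLR."""
    total = sum(map(sum, counts))
    n_zones, n_days = len(counts), len(counts[0])
    best = (0.0, (), ())
    for z0 in range(n_zones):
        for z1 in range(z0 + 1, min(z0 + max_zones, n_zones) + 1):
            zones = tuple(range(z0, z1))
            for d0 in range(n_days):
                for d1 in range(d0 + 1, min(d0 + max_days, n_days) + 1):
                    days = tuple(range(d0, d1))
                    obs = sum(counts[z][d] for z in zones for d in days)
                    exp = expected_in_cylinder(counts, zones, days)
                    best = max(best, (poisson_llr(obs, exp, total), zones, days))
    return best


# Hypothetical fire counts: rows are zones, columns are days.
counts = [
    [2, 2, 2, 2, 2],
    [2, 2, 2, 2, 2],
    [2, 12, 12, 2, 2],
    [2, 2, 2, 2, 2],
]
print(most_likely_cluster(counts))
```

In practice the significance of the best window is assessed by recomputing the maximum over many random permutations of the event times; dedicated software handles that step and the circular spatial windows.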
Pego-Reigosa, José María; Lois-Iglesias, Ana; Rúa-Figueroa, Íñigo; Galindo, María; Calvo-Alén, Jaime; de Uña-Álvarez, Jacobo; Balboa-Barreiro, Vanessa; Ibáñez Ruan, Jesús; Olivé, Alejandro; Rodríguez-Gómez, Manuel; Fernández Nebro, Antonio; Andrés, Mariano; Erausquin, Celia; Tomero, Eva; Horcada Rubio, Loreto; Uriarte Isacelaya, Esther; Freire, Mercedes; Montilla, Carlos; Sánchez-Atrio, Ana I; Santos-Soler, Gregorio; Zea, Antonio; Díez, Elvira; Narváez, Javier; Blanco-Alonso, Ricardo; Silva-Fernández, Lucía; Ruiz-Lucea, María Esther; Fernández-Castro, Mónica; Hernández-Beriain, José Ángel; Gantes-Mora, Marian; Hernández-Cruz, Blanca; Pérez-Venegas, José; Pecondón-Español, Ángela; Marras Fernández-Cid, Carlos; Ibáñez-Barcelo, Mónica; Bonilla, Gema; Torrente-Segarra, Vicenç; Castellví, Iván; Alegre, Juan José; Calvet, Joan; Marenco de la Fuente, José Luis; Raya, Enrique; Vázquez-Rodríguez, Tomás Ramón; Quevedo-Vila, Víctor; Muñoz-Fernández, Santiago; Otón, Teresa; Rahman, Anisur; López-Longo, Francisco Javier
2016-07-01
To identify patterns (clusters) of damage manifestations within a large cohort of SLE patients and evaluate the potential association of these clusters with a higher risk of mortality. This is a multicentre, descriptive, cross-sectional study of a cohort of 3656 SLE patients from the Spanish Society of Rheumatology Lupus Registry. Organ damage was ascertained using the Systemic Lupus International Collaborating Clinics Damage Index. Using cluster analysis, groups of patients with similar patterns of damage manifestations were identified. The overall clusters were then compared, as were the subgroups of patients within each cluster with disease duration shorter than 5 years. Three damage clusters were identified. Cluster 1 (80.6% of patients) had a lower proportion of patients with damage (23.2% vs 100% in clusters 2 and 3, P < 0.001). Cluster 2 (11.4% of patients) was characterized by musculoskeletal damage in all patients. Cluster 3 (8.0% of patients) was the only group with cardiovascular damage, and this was present in all patients. The overall mortality rate of patients in clusters 2 and 3 was higher than that in cluster 1 (P < 0.001 for both comparisons), and the same held among patients with disease duration shorter than 5 years. In a large cohort of SLE patients, cardiovascular and musculoskeletal damage manifestations were the two dominant forms of damage that sorted patients into clinically meaningful clusters. Both in early and late stages of the disease, there was a significant association of these clusters with an increased risk of mortality. Physicians should pay special attention to the early prevention of damage in these two systems. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved.
Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri
2007-01-01
Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC), was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three clusters, two large and one small gene clusters, similar to their results which used Gaussian mixture clustering. The correlation of each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for the existence of tumor or normal, the existence of distant metastasis and the existence of lymph node metastasis. PMID:18305825
Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri
2007-12-30
Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC), was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three clusters, two large and one small gene clusters, similar to their results which used Gaussian mixture clustering. The correlation of each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for the existence of tumor or normal, the existence of distant metastasis and the existence of lymph node metastasis.
Qin, Qianqian; Guo, Wei; Tang, Weiming; Mahapatra, Tanmay; Wang, Liyan; Zhang, Nanci; Ding, Zhengwei; Cai, Chang; Cui, Yan; Sun, Jiangping
2017-04-01
Studies have shown a recent upsurge in human immunodeficiency virus (HIV) burden among men who have sex with men (MSM) in China, especially in urban areas. For intervention planning and resource allocation, spatial analyses of HIV/AIDS case-clusters were required to identify epidemic foci and trends among MSM in China. Information regarding MSM recorded as HIV/AIDS cases during 2006-2015 was extracted from the National Case Reporting System. Demographic trends were determined through Cochran-Armitage trend tests. Distribution of case-clusters was examined using spatial autocorrelation. Spatial-temporal scan was used to detect disease clustering. Spatial correlations between cases and socioenvironmental factors were determined by spatial regression. Between 2006 and 2015, in China, 120 371 HIV/AIDS cases were identified among MSM. Newly identified HIV/AIDS cases among self-reported MSM increased from 487 cases in 2006 to >30 000 cases in 2015. Among those HIV/AIDS cases recorded during 2006-2015, 47.0% were 20-29 years old and 24.9% were aged 30-39 years. Based on clusters of HIV/AIDS cases identified through spatial analysis, the epidemic was concentrated among MSM in large cities. Spatial-temporal clusters contained municipalities, provincial capitals, and main cities such as Beijing, Shanghai, Chongqing, Chengdu, and Guangzhou. Spatial regression analysis showed that sociodemographic indicators such as population density, per capita gross domestic product, and number of county-level medical institutions had statistically significant positive correlations with HIV/AIDS among MSM. Assorted spatial analyses revealed an increasingly concentrated HIV epidemic among young MSM in Chinese cities, calling for targeted health education and intensive interventions at an early age. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved.
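The Cochran-Armitage trend test named in this abstract has a short closed form, sketched here with the standard z statistic for a linear trend in proportions across ordered groups. The yearly counts are hypothetical and are not taken from the study; the function names and the default scores 0, 1, 2, ... are likewise assumptions.

```python
import math


def cochran_armitage_z(cases, totals, scores=None):
    """z statistic for a linear trend in the proportions cases[i] / totals[i]."""
    if scores is None:
        scores = list(range(len(cases)))  # equally spaced scores (assumption)
    n = sum(totals)
    p = sum(cases) / n  # pooled proportion under the null hypothesis
    t_stat = sum(s * (r - m * p) for s, r, m in zip(scores, cases, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p * (1 - p) * (s2 - s1 * s1 / n)
    return t_stat / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: cases in one age group out of all reported cases, by year.
cases = [10, 20, 30]
totals = [100, 100, 100]
z = cochran_armitage_z(cases, totals)
print(z, two_sided_p(z))
```

A positive z indicates that the proportion rises with the group score, a negative z that it falls.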
Gene duplications in prokaryotes can be associated with environmental adaptation
2010-01-01
Background Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. 
Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment. PMID:20961426
Gene duplications in prokaryotes can be associated with environmental adaptation.
Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn
2010-10-20
Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. 
Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment.
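An overrepresentation analysis of the kind described above is often run as a one-sided hypergeometric test on annotation counts. The abstract does not say which test the authors used, so the test, the function name, and the counts below are assumptions for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, n_drawn, n_annotated, n_total):
    """P(X >= k): chance of at least k annotated genes among n_drawn sampled genes.

    n_total genes in the genome, n_annotated of them carry the annotation,
    and n_drawn genes (e.g. the paralogs) are drawn without replacement.
    """
    top = min(n_drawn, n_annotated)
    denom = comb(n_total, n_drawn)
    numer = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_drawn - i)
        for i in range(k, top + 1)
    )
    return numer / denom


# Hypothetical genome: 4000 genes, 200 annotated with a given term,
# 40 paralog-forming genes of which 8 carry the term.
print(hypergeom_upper_tail(8, 40, 200, 4000))
```

A small tail probability suggests the term is more common among the paralogs than chance would predict; when many terms are tested, a multiple-testing correction is needed.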
Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres
2018-02-19
Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.
On the game of life: population and its diversity
NASA Astrophysics Data System (ADS)
Sales, T. M.; Garcia, J. B. C.; Jyh, T. I.; Ren, T. I.; Gomes, M. A. F.
1993-08-01
One of the most important features of biological life at all levels is its astounding diversity. In this work we study Conway's well-known game “Life”, analysing the statistics of cluster population, N(t), and cluster diversity, D(t). We have performed simulations on “Life” for dimensions d = 1 and 2 starting with an uncorrelated distribution of live and dead sites at t = 0. For d = 2 we study the effect of different neighbourhood relations in identifying and counting clusters. An interesting scaling relation connecting the maxima of N(t) and D(t) is found.
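The effect of the neighbourhood relation on cluster counting can be illustrated with a small sketch for d = 2. It assumes that N(t) is the number of connected clusters of live sites and that D(t) is the number of distinct cluster sizes; the abstract does not define either quantity, so both definitions are assumptions. The unbounded grid and the function names are also illustrative choices.

```python
from collections import Counter

# Neighbourhood relations used when joining live sites into clusters.
MOORE = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
VON_NEUMANN = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def step(live):
    """Advance Conway's Life by one generation; live is a set of (x, y) sites."""
    counts = Counter((x + dx, y + dy) for x, y in live for dx, dy in MOORE)
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def cluster_sizes(live, neighbourhood):
    """Sizes of connected clusters of live sites under the given neighbourhood."""
    unseen, sizes = set(live), []
    while unseen:
        stack, size = [unseen.pop()], 0
        while stack:
            x, y = stack.pop()
            size += 1
            for dx, dy in neighbourhood:
                nb = (x + dx, y + dy)
                if nb in unseen:
                    unseen.remove(nb)
                    stack.append(nb)
        sizes.append(size)
    return sizes


def population_and_diversity(live, neighbourhood):
    """Return (number of clusters, number of distinct cluster sizes)."""
    sizes = cluster_sizes(live, neighbourhood)
    return len(sizes), len(set(sizes))


# A glider plus a distant blinker, followed for a few generations.
state = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2), (10, 10), (11, 10), (12, 10)}
for t in range(4):
    print(t, population_and_diversity(state, MOORE),
          population_and_diversity(state, VON_NEUMANN))
    state = step(state)
```

Two diagonally adjacent sites form one cluster under the Moore relation but two under the von Neumann relation, which is why the counts can differ between the two columns.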
Pressure of the hot gas in simulations of galaxy clusters
NASA Astrophysics Data System (ADS)
Planelles, S.; Fabjan, D.; Borgani, S.; Murante, G.; Rasia, E.; Biffi, V.; Truong, N.; Ragone-Figueroa, C.; Granato, G. L.; Dolag, K.; Pierpaoli, E.; Beck, A. M.; Steinborn, Lisa K.; Gaspari, M.
2017-06-01
We analyse the radial pressure profiles, the intracluster medium (ICM) clumping factor and the Sunyaev-Zel'dovich (SZ) scaling relations of a sample of simulated galaxy clusters and groups identified in a set of hydrodynamical simulations based on an updated version of the treepm-SPH GADGET-3 code. Three different sets of simulations are performed: the first assumes non-radiative physics, the others include, among other processes, active galactic nucleus (AGN) and/or stellar feedback. Our results are analysed as a function of redshift, ICM physics, cluster mass and cluster cool-coreness or dynamical state. In general, the mean pressure profiles obtained for our sample of groups and clusters show a good agreement with X-ray and SZ observations. Simulated cool-core (CC) and non-cool-core (NCC) clusters also show a good match with real data. We obtain in all cases a small (if any) redshift evolution of the pressure profiles of massive clusters, at least back to z = 1. We find that the clumpiness of gas density and pressure increases with the distance from the cluster centre and with the dynamical activity. The inclusion of AGN feedback in our simulations generates values for the gas clumping ($\sqrt{C_\rho} \sim 1.2$ at $R_{200}$) in good agreement with recent observational estimates. The simulated $Y_{\rm SZ}$-$M$ scaling relations are in good accordance with several observed samples, especially for massive clusters. As for the scatter of these relations, we obtain a clear dependence on the cluster dynamical state, whereas this distinction is not so evident when looking at the subsamples of CC and NCC clusters.
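The gas clumping factor quoted above is commonly defined as the ratio of the volume-weighted mean of the squared density to the square of the volume-weighted mean density within a radial shell. The sketch below assumes that definition; the abstract does not spell it out, and the cell densities and volumes are hypothetical.

```python
import math


def clumping_factor(densities, volumes):
    """C = <rho^2> / <rho>^2 with volume-weighted averages over gas elements."""
    total_volume = sum(volumes)
    mean_rho = sum(r * v for r, v in zip(densities, volumes)) / total_volume
    mean_rho_sq = sum(r * r * v for r, v in zip(densities, volumes)) / total_volume
    return mean_rho_sq / mean_rho ** 2


# Hypothetical shell with two equal-volume gas elements of different density.
c_rho = clumping_factor([1.0, 3.0], [1.0, 1.0])
print(c_rho, math.sqrt(c_rho))
```

A perfectly uniform shell gives C = 1, and any density inhomogeneity pushes the value above 1.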
MOCCA-SURVEY Database I: Is NGC 6535 a dark star cluster harbouring an IMBH?
NASA Astrophysics Data System (ADS)
Askar, Abbas; Bianchini, Paolo; de Vita, Ruggero; Giersz, Mirek; Hypki, Arkadiusz; Kamann, Sebastian
2017-01-01
We describe the dynamical evolution of a unique type of dark star cluster model in which the majority of the cluster mass at Hubble time is dominated by an intermediate-mass black hole (IMBH). We analysed results from about 2000 star cluster models (Survey Database I) simulated using the Monte Carlo code MOnte Carlo Cluster simulAtor and identified these dark star cluster models. Taking one of these models, we apply the method of simulating realistic 'mock observations' by utilizing the Cluster simulatiOn Comparison with ObservAtions (COCOA) and Simulating Stellar Cluster Observation (SISCO) codes to obtain the photometric and kinematic observational properties of the dark star cluster model at 12 Gyr. We find that the perplexing Galactic globular cluster NGC 6535 closely matches the observational photometric and kinematic properties of the dark star cluster model presented in this paper. Based on our analysis and currently observed properties of NGC 6535, we suggest that this globular cluster could potentially harbour an IMBH. If it exists, the presence of this IMBH can be detected robustly with proposed kinematic observations of NGC 6535.
Gould, Madelyn S; Kleinman, Marjorie H; Lake, Alison M; Forman, Judith; Midle, Jennifer Bassett
2014-06-01
Public health and clinical efforts to prevent suicide clusters are seriously hampered by the unanswered question of why such outbreaks occur. We aimed to establish whether an environmental factor, newspaper reports of suicide, has a role in the emergence of suicide clusters. In this retrospective, population-based, case-control study, we identified suicide clusters in young people aged 13-20 years in the USA from 1988 to 1996 (preceding the advent of social media) using the time-space scan statistic. For each cluster community, we selected two matched non-cluster control communities in which suicides of similarly aged youth occurred, from non-contiguous counties within the same state as the cluster. We examined newspapers within each cluster community for stories about suicide published in the days between the first and second suicides in the cluster. In non-cluster communities, we examined a matched length of time after the matched control suicide. We used a content-analysis procedure to code the characteristics of each story and compared newspaper stories about suicide published in case and control communities with mixed-effect regression analyses. We identified 53 suicide clusters, of which 48 were included in the media review. For one cluster we could identify only one appropriate control; therefore, 95 matched control communities were included. The mean number of news stories about suicidal individuals published after an index cluster suicide (7·42 [SD 10·02]) was significantly greater than the mean number of suicide stories published after a non-cluster suicide (5·14 [6·00]; p<0·0001). Several story characteristics, including front-page placement, headlines containing the word suicide or a description of the method used, and detailed descriptions of the suicidal individual and act, appeared more often in stories published after the index cluster suicides than after non-cluster suicides. 
Our identification of an association between newspaper reports about suicide (including specific story characteristics) and the initiation of teenage suicide clusters should provide an empirical basis to support efforts by mental health professionals, community officials, and the media to work together to identify and prevent the onset of suicide clusters. US National Institute of Mental Health and American Foundation for Suicide Prevention. Copyright © 2014 Elsevier Ltd. All rights reserved.
Convergence tests on tax burden and economic growth among China, Taiwan and the OECD countries
NASA Astrophysics Data System (ADS)
Wang, David Han-Min
2007-07-01
Unfolding globalization has a profound impact on a wide range of national policies, including tax and economic policies. This study uses time series and cluster analyses to examine the convergence of tax burden and per capita gross domestic product among Taiwan, China and the OECD countries. The empirical results show no significant relationship between the integration process and fiscal convergence among countries. However, the cluster analyses indicate that the group of China, Taiwan, and Korea moved steadily toward one model during the 1970s, 1980s and 1990s. Convergence of tax burden is found within this group, but no pairwise convergence exists.
Proposed shade guide for human facial skin and lip: a pilot study.
Wee, Alvin G; Beatty, Mark W; Gozalo-Diaz, David J; Kim-Pusateri, Seungyee; Marx, David B
2013-08-01
Currently, no commercially available facial shade guide exists in the United States for the fabrication of facial prostheses. The purpose of this study was to measure facial skin and lip color in a human population sample stratified by age, gender, and race. Clustering analysis was used to determine optimal color coordinates for a proposed facial shade guide. Participants (n=119) were recruited from 4 racial/ethnic groups, 5 age groups, and both genders. Reflectance measurements of participants' noses and lower lips were made by using a spectroradiometer and xenon arc lamp with a 45/0 optical configuration. Repeated measures ANOVA (α=.05) was used to identify skin and lip color differences resulting from race, age, gender, and location, and a hierarchical clustering analysis was used to identify clusters of skin colors. Significant contributors to L*a*b* facial color were race and facial location (P<.01). b* was affected by all factors (P<.05). Age affected only b* (P<.001), while gender affected only L* (P<.05) and b* (P<.05). Analyses identified 5 clusters of skin color. The study showed that skin color differences caused by age and gender occurred primarily within the yellow-blue axis. A significant lightness difference between gender groups was also found. Clustering analysis identified 5 distinct skin shade tabs. Copyright © 2013 The Editorial Council of the Journal of Prosthetic Dentistry. Published by Mosby, Inc. All rights reserved.
Spatial event cluster detection using an approximate normal distribution.
Torabi, Mahmoud; Rosychuk, Rhonda J
2008-12-12
In geographic surveillance of disease, areas with large numbers of disease cases are to be identified so that investigations of the causes of high disease rates can be pursued. Areas with high rates are called disease clusters and statistical cluster detection tests are used to identify geographic areas with higher disease rates than expected by chance alone. Typically cluster detection tests are applied to incident or prevalent cases of disease, but surveillance of disease-related events, where an individual may have multiple events, may also be of interest. Previously, a compound Poisson approach that detects clusters of events by testing individual areas that may be combined with their neighbours has been proposed. However, the relevant probabilities from the compound Poisson distribution are obtained from a recursion relation that can be cumbersome if the number of events is large or analyses by strata are performed. We propose a simpler approach that uses an approximate normal distribution. This method is very easy to implement and is applicable to situations where the population sizes are large and the population distribution by important strata may differ by area. We demonstrate the approach on pediatric self-inflicted injury presentations to emergency departments and compare the results for probabilities based on the recursion and the normal approach. We also implement a Monte Carlo simulation to study the performance of the proposed approach. In a self-inflicted injury data example, the normal approach identifies twelve out of thirteen of the same clusters as the compound Poisson approach, noting that the compound Poisson method detects twelve significant clusters in total. Through simulation studies, the normal approach approximates the compound Poisson approach well for a variety of different population sizes and case and event thresholds. 
A drawback of the compound Poisson approach is that the relevant probabilities must be determined through a recursion relation and such calculations can be computationally intensive if the cluster size is relatively large or if analyses are conducted with strata variables. On the other hand, the normal approach is very flexible, easily implemented, and hence, more appealing for users. Moreover, the concepts may be more easily conveyed to non-statisticians interested in understanding the methodology associated with cluster detection test results.
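The trade-off between the recursion and the normal approximation can be illustrated with a short sketch. It assumes that the number of cases in an area is Poisson with mean `lam` and that each case contributes an independent number of events drawn from a known distribution. The exact tail uses the standard Panjer-type recursion for a compound Poisson sum, and the approximation matches its mean and variance with a continuity correction. This is not necessarily the authors' exact formulation, and the function names and parameter values are hypothetical.

```python
import math


def compound_poisson_pmf(lam, severity, s_max):
    """Exact P(S = s) for s = 0..s_max via the recursion for a compound Poisson sum.

    severity maps the number of events per case (non-negative ints) to probabilities.
    """
    pmf = [math.exp(-lam * (1.0 - severity.get(0, 0.0)))]
    for s in range(1, s_max + 1):
        acc = sum(j * f * pmf[s - j] for j, f in severity.items() if 1 <= j <= s)
        pmf.append(lam * acc / s)
    return pmf


def exact_upper_tail(lam, severity, s):
    """P(S >= s) from the recursion."""
    if s <= 0:
        return 1.0
    return 1.0 - sum(compound_poisson_pmf(lam, severity, s - 1))


def normal_upper_tail(lam, severity, s):
    """P(S >= s) from a normal approximation with matched mean and variance."""
    mean = lam * sum(j * f for j, f in severity.items())
    var = lam * sum(j * j * f for j, f in severity.items())
    z = (s - 0.5 - mean) / math.sqrt(var)  # continuity correction
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical area: 50 expected cases; 70% have one event, 30% have two.
lam, severity = 50.0, {1: 0.7, 2: 0.3}
print(exact_upper_tail(lam, severity, 85), normal_upper_tail(lam, severity, 85))
```

The recursion costs a number of operations that grows with the square of the event threshold, whereas the normal tail is a single evaluation, which is the practical appeal described in the abstract.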
Marshman, Z; Broomhead, T; Rodd, H D; Jones, K; Burke, D; Baker, S R
2016-09-28
Emergency departments (EDs) have been identified as key providers of dental care although few studies have examined patterns of attendance or clusters of characteristics. The aim was to identify the reasons for visits to an ED, whether these remained stable over time, and characterize clusters of patients by socio-demographic and attendance variables. Pseudonymized data were obtained for children who attended the ED in 2003-2004, 2004-2005 and 2012-2013. Presenting complaint was categorized as attending for dental or nondental reasons. Other variables analysed included patient (age, sex, ethnicity and deprivation) and attendance characteristics (distance travelled, season, nature of complaint, time elapsed since onset of symptoms, day of week and hours of attendance), together with treatment outcome (advice, antibiotics and referral). To assess trends over time, analyses were conducted on patient, attendance and treatment outcome variables. To examine whether patients could be characterized by socio-demographic and attendance variables, a two-step cluster analysis was undertaken on the 2003-2004 data set and validated on the 2004-2005 and 2012-2013 data sets. In 2003-2004, 550 children attended the ED for dental reasons, rising to 687 in 2012-2013. The most important predictors of dental attendance were as follows: nature of complaint, ethnicity, time elapsed, sex and deprivation of the area in which children lived. The analysis showed two clusters: Cluster 1 comprised children who attended the ED for dental injury, were of White ethnicity and attended within 24 h of onset of symptoms. Children in this cluster were likely to be from the least or less deprived areas (compared to Cluster 2) and were more likely to be males. Cluster 2 comprised children attending the ED for caries, oral mucosal lesions or other complaints, who were likely to be of other (non-White) ethnicities and were likely to attend more than 24 h after symptoms began.
Children in this cluster were more likely to come from the most deprived areas and were both males and females. The clusters varied according to treatment outcome; those patients in Cluster 2 were more likely to be prescribed medication, whilst those children in Cluster 1 were more likely to be referred to another specialty. A significant number of visits to the ED were for dental reasons with two clusters of children. The results have identified groups of patients for whom appropriate dental provision is lacking and where targeted services are needed to improve outcomes for children and reduce the burden on EDs. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Identifying technical aliases in SELDI mass spectra of complex mixtures of proteins
2013-01-01
Background: Biomarker discovery datasets created using mass spectrum protein profiling of complex mixtures of proteins contain many peaks that represent the same protein with different charge states. Correlated variables such as these can confound the statistical analyses of proteomic data. Previously we developed an algorithm that clustered mass spectrum peaks that were biologically or technically correlated. Here we demonstrate an algorithm that clusters correlated technical aliases only. Results: In this paper, we propose a preprocessing algorithm that can be used for grouping technical aliases in mass spectrometry protein profiling data. The stringency of the variance allowed for clustering is customizable, thereby affecting the number of peaks that are clustered. Subsequent analysis of the clusters, instead of individual peaks, helps reduce difficulties associated with technically-correlated data, and can aid more efficient biomarker identification. Conclusions: This software can be used to pre-process and thereby decrease the complexity of protein profiling proteomics data, thus simplifying the subsequent analysis of biomarkers by decreasing the number of tests. The software is also a practical tool for identifying which features to investigate further by purification, identification and confirmation. PMID:24010718
Farhan, Sali M K; Wang, Jian; Robinson, John F; Lahiry, Piya; Siu, Victoria M; Prasad, Chitra; Kronick, Jonathan B; Ramsay, David A; Rupar, C Anthony; Hegele, Robert A
2014-01-01
Iron-sulfur (Fe-S) clusters are a class of highly conserved and ubiquitous prosthetic groups with unique chemical properties that allow the proteins that contain them, Fe-S proteins, to assist in various key biochemical pathways. Mutations in Fe-S proteins often disrupt Fe-S cluster assembly leading to a spectrum of severe disorders such as Friedreich's ataxia or iron-sulfur cluster assembly enzyme (ISCU) myopathy. Herein, we describe infantile mitochondrial complex II/III deficiency, a novel autosomal recessive mitochondrial disease characterized by lactic acidemia, hypotonia, respiratory chain complex II and III deficiency, multisystem organ failure and abnormal mitochondria. Through autozygosity mapping, exome sequencing, in silico analyses, population studies and functional tests, we identified c.215G>A, p.Arg72Gln in NFS1 as the likely causative mutation. We describe the first disease in man likely caused by deficiency in NFS1, a cysteine desulfurase that is implicated in respiratory chain function and iron maintenance by initiating Fe-S cluster biosynthesis. Our results further demonstrate the importance of sufficient NFS1 expression in human physiology.
Spatial and space-time clustering of tuberculosis in Gurage Zone, Southern Ethiopia.
Tadesse, Sebsibe; Enqueselassie, Fikre; Hagos, Seifu
2018-01-01
Spatial targeting is advocated as an effective method that contributes to achieving tuberculosis control in high-burden countries. However, there is a paucity of studies clarifying the spatial nature of the disease in these countries. This study aims to identify the location, size and risk of purely spatial and space-time clusters for high occurrence of tuberculosis in Gurage Zone, Southern Ethiopia during 2007 to 2016. Data for a total of 15,805 patients, retrieved from unit TB registers, were included in the final analyses. The spatial and space-time cluster analyses were performed using the global Moran's I, Getis-Ord [Formula: see text] and Kulldorff's scan statistics. Eleven purely spatial and three space-time clusters were detected (P < 0.001). The clusters were concentrated in border areas of the Gurage Zone. There were considerable spatial variations in the risk of tuberculosis by year during the study period. This study showed that tuberculosis clusters were mainly concentrated at border areas of the Gurage Zone during the study period, suggesting that there has been sustained transmission of the disease within these locations. The findings may help intensify the implementation of tuberculosis control activities in these locations. Further study is warranted to explore the roles of various ecological factors on the observed spatial distribution of tuberculosis.
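As an illustration of the first of these statistics, the following is a minimal sketch of the standard global Moran's I formula for area-level rates. The binary adjacency weights and the toy values are assumptions for illustration only; they are not the study's data, and the study's weighting scheme is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a square spatial weights matrix.

    weights[i][j] > 0 when areas i and j are treated as neighbours
    (a binary adjacency matrix is assumed in the example below).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / w_sum) * (cross / denom)

if __name__ == "__main__":
    # Four hypothetical areas in a row; each area neighbours the next one.
    chain = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 1, 5, 5], chain))  # similar neighbours: positive
    print(morans_i([1, 5, 1, 5], chain))  # alternating values: negative
```

Positive values indicate that neighbouring areas tend to have similar rates and negative values indicate alternation. Significance testing (for example by permutation) is omitted from this sketch.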
Strain-Level Diversity of Secondary Metabolism in Streptomyces albus
Seipke, Ryan F.
2015-01-01
Streptomyces spp. are robust producers of medicinally-, industrially- and agriculturally-important small molecules. Increased resistance to antibacterial agents and the lack of new antibiotics in the pipeline have led to a renaissance in natural product discovery. This endeavor has benefited from inexpensive high quality DNA sequencing technology, which has generated more than 140 genome sequences for taxonomic type strains and environmental Streptomyces spp. isolates. Many of the sequenced streptomycetes belong to the same species. For instance, Streptomyces albus has been isolated from diverse environmental niches and seven strains have been sequenced, consequently this species has been sequenced more than any other streptomycete, allowing valuable analyses of strain-level diversity in secondary metabolism. Bioinformatics analyses identified a total of 48 unique biosynthetic gene clusters harboured by Streptomyces albus strains. Eighteen of these gene clusters specify the core secondary metabolome of the species. Fourteen of the gene clusters are contained by one or more strain and are considered auxiliary, while 16 of the gene clusters encode the production of putative strain-specific secondary metabolites. Analysis of Streptomyces albus strains suggests that each strain of a Streptomyces species likely harbours at least one strain-specific biosynthetic gene cluster. Importantly, this implies that deep sequencing of a species will not exhaust gene cluster diversity and will continue to yield novelty. PMID:25635820
Comparative analysis of prophages in Streptococcus mutans genomes
Fu, Tiwei; Fan, Xiangyu; Long, Quanxin; Deng, Wanyan; Song, Jinlin
2017-01-01
Prophages have been considered genetic units that have an intimate association with novel phenotypic properties of bacterial hosts, such as pathogenicity and genomic variation. Little is known about the genetic information of prophages in the genome of Streptococcus mutans, a major pathogen of human dental caries. In this study, we identified 35 prophage-like elements in S. mutans genomes and performed a comparative genomic analysis. Comparative genomic and phylogenetic analyses of prophage sequences revealed that the prophages could be classified into three main large clusters: Cluster A, Cluster B, and Cluster C. The S. mutans prophages in each cluster were compared. The genomic sequences of phismuN66-1, phismuNLML9-1, and phismu24-1 all shared similarities with the previously reported S. mutans phages M102, M102AD, and ϕAPCM01. The genomes were organized into seven major gene clusters according to the putative functions of the predicted open reading frames: packaging and structural modules, integrase, host lysis modules, DNA replication/recombination modules, transcriptional regulatory modules, other protein modules, and hypothetical protein modules. Moreover, an integrase gene was only identified in phismuNLML9-1 prophages. PMID:29158986
A HIV-1 heterosexual transmission chain in Guangzhou, China: a molecular epidemiological study.
Han, Zhigang; Leung, Tommy W C; Zhao, Jinkou; Wang, Ming; Fan, Lirui; Li, Kai; Pang, Xinli; Liang, Zhenbo; Lim, Wilina W L; Xu, Huifang
2009-09-25
We conducted molecular analyses to confirm four clustering HIV-1 infections (Patient A, B, C & D) in Guangzhou, China. These cases were identified by epidemiological investigation and suspected to have acquired the infection through a common heterosexual transmission chain. The env C2V3V4 region, gag p17/p24 junction and partial pol gene of the HIV-1 genome from serum specimens of these infected cases were amplified by reverse transcription polymerase chain reaction (RT-PCR) and nucleotide sequenced. Phylogenetic analyses indicated that their viral nucleotide sequences were significantly clustered together (bootstrap values of 99%, 98% and 100% in the env, gag and pol trees, respectively). Evolutionary distance analysis indicated that the genetic diversities of their env, gag and pol genes were significantly lower than those of non-clustered controls, as measured by unpaired t-test (env gene comparison: p < 0.005; gag gene comparison: p < 0.005; pol gene comparison: p < 0.005). Epidemiological results and molecular analyses consistently indicated that these four cases represented a transmission chain which dispersed in the locality through heterosexual contact involving a commercial sex worker.
Saeed, Mohammad
2017-05-01
Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.
Liu, Chan; Feng, Juan; Zhang, Defeng; Xie, Yundan; Li, Anxing; Wang, Jiangyong; Su, Youlu
2018-05-11
In view of the changing antibiotic-resistance profiles of Streptococcus agalactiae from tilapia in China, antimicrobial susceptibilities of 75 S. agalactiae strains were determined by the disc diffusion method, and cluster analyses of the antibiograms and antibiogram types were performed. All strains displayed multidrug resistance (MDR). The antimicrobial-resistance rates were highest (>90%) to aminoglycosides, sulfonamides, pipemidic acid, and norfloxacin, followed by penicillin, ampicillin, and ciprofloxacin (26.7-38.7%); those to furadantin, lincomycin, erythromycin, ofloxacin, tetracycline, and florfenicol were low (<10%), and no resistance to vancomycin, cefalexin, cefoxitin, amoxicillin, medemycin, doxitard, oxytetracycline, rifampin, chloramphenicol, or thiamphenicol was detected. Statistical analysis showed that the resistance rate to ciprofloxacin increased significantly in 2016 (p = 0.009), whereas that to trimethoprim/sulfamethoxazole decreased (p = 0.017). Cluster analyses identified that the strains had 23 antibiogram types (A-W) and clustered in five groups (Groups I-V). The strains with higher antimicrobial resistance mainly clustered in Groups I and II. Our results show that the antibiograms varied with time and by location and that antibiogram types are constantly updating and expanding. Effective measures must be taken to reduce the antimicrobial resistance and spread of MDR strains.
Cárceles-Álvarez, Alberto; Ortega-García, Juan A; López-Hernández, Fernando A; Orozco-Llamas, Mayra; Espinosa-López, Blanca; Tobarra-Sánchez, Esther; Alvarez, Lizbeth
2017-07-01
Leukaemia remains the most common type of paediatric cancer and its aetiology remains unknown, but is considered to be multifactorial. It is suggested that initiation in utero by relevant exposures and/or inherited genetic variants, and other promotional postnatal exposures, are probably required to develop leukaemia. This study aimed to map the incidence and analyse possible clusters in the geographical distribution of childhood acute leukaemia during the critical periods and to evaluate the factors that may be involved in the aetiology by conducting community and individual risk assessments. We analysed all incident cases of acute childhood leukaemia (<15 years) diagnosed in a Spanish region during the period 1998-2013. At diagnosis, the addresses during pregnancy, early childhood and diagnosis were collected and codified to analyse the spatial distribution of acute leukaemia. Scan statistical test methodology was used for the identification of high-incidence spatial clusters. Once identified, individual and community risk assessments were conducted using the Paediatric Environmental History. A total of 158 cases of acute leukaemia were analysed. The crude rate for the period was 42.7 cases per million children. Among subtypes, acute lymphoblastic leukaemia had the highest incidence (31.9 per million children). A spatial cluster of acute lymphoblastic leukaemia was detected using the pregnancy address (p<0.05). The most common environmental risk factors related to the aetiology of acute lymphoblastic leukaemia, identified by the Paediatric Environmental History, were: prenatal exposure to tobacco (75%) and alcohol (50%); residential and community exposure to pesticides (62.5%); prenatal or neonatal ionizing radiation (42.8%); and parental workplace exposure (37.5%). Conclusions: Our study suggests that environmental exposures in utero may be important in the development of childhood leukaemia.
Due to the presence of high-incidence clusters using pregnancy address, it is necessary to introduce this address into the childhood cancer registers. The Paediatric Environmental History which includes pregnancy address and a careful and comprehensive evaluation of the environmental exposures will allow us to build the knowledge of the causes of childhood leukaemia. Copyright © 2017 Elsevier Inc. All rights reserved.
Keita, Akilah Dulin; Whittaker, Shannon; Wynter, Jamila; Kidanu, Tamnnet Woldemichael; Chhay, Channavy; Cardel, Michelle; Gans, Kim M
2016-01-01
This study identifies Southeast Asian refugee parents' and grandparents' perceptions of the risk and protective factors for childhood obesity. We used a mixed methods approach (concept mapping) for data collection and analyses. Fifty-nine participants engaged in modified nominal group meetings where they generated statements about children's weight status and structuring meetings where they sorted statements into piles based on similarity and rated statements on relative importance. Concept Systems® software generated clusters of ideas, cluster ratings, and pattern matches. Eleven clusters emerged. Participants rated "Healthy Food Changes Made within the School" and "Parent-related Physical Activity Factors" as most important, whereas "Neighborhood Built Features" was rated as the least important. Cambodian and Hmong participants agreed the most on cluster ratings of relative importance (r = 0.62). The study findings may be used to inform the development of culturally appropriate obesity prevention interventions for Southeast Asian refugee communities.
Fast gene ontology based clustering for microarray experiments.
Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa
2008-11-21
Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Morgan, Ethan; Nyaku, Amesika N; DʼAquila, Richard T; Schneider, John A
2017-07-01
Phylogenetic analysis determines similarities among HIV genetic sequences from persons infected with HIV, identifying clusters of transmission. We determined characteristics associated with both membership in an HIV transmission cluster and the number of clustered sequences among a cohort of young black men who have sex with men (YBMSM) in Chicago. Pairwise genetic distances of HIV-1 pol sequences were collected during 2013-2016. Potential transmission ties were identified among HIV-infected persons whose sequences were ≤1.5% genetically distant. Putative transmission pairs were defined as ≥1 tie to another sequence. We then determined demographic and risk attributes associated with both membership in an HIV transmission cluster and the number of ties to the sequences from other persons in the cluster. Of 86 available sequences, 31 (36.0%) were tied to ≥1 other sequence. Through multivariable analyses, we determined that those who reported symptoms of depression and those who had a higher number of confidants in their network had significantly decreased odds of membership in transmission clusters. We found that those who had unstable housing and who reported heavy marijuana use had significantly more ties to other individuals within transmission clusters, whereas those identifying as bisexual, those participating in group sex, and those with higher numbers of sexual partners had significantly fewer ties. This study demonstrates the potential for combining phylogenetic and individual and network attributes to target HIV control efforts to persons with potentially higher transmission risk, as well as suggesting some unappreciated specific predictors of transmission risk among YBMSM in Chicago for future study.
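The tie and cluster definitions in this abstract can be sketched in a few lines. This is a minimal illustration, assuming pairwise distances are already available as proportions (so the 1.5% cutoff is 0.015) and that clusters are the connected components of the tie graph; the sequence labels, the distance values and the function names are hypothetical and do not come from the study.

```python
from collections import defaultdict

def transmission_ties(distances, threshold=0.015):
    """Map each sequence to the set of sequences within the distance threshold."""
    ties = defaultdict(set)
    for (a, b), d in distances.items():
        if d <= threshold:
            ties[a].add(b)
            ties[b].add(a)
    return dict(ties)

def clusters_from_ties(ties):
    """Group tied sequences into clusters (connected components of the tie graph)."""
    seen, clusters = set(), []
    for start in ties:
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(ties[node] - members)
        seen |= members
        clusters.append(members)
    return clusters

if __name__ == "__main__":
    # Hypothetical pairwise distances between five sequences.
    distances = {
        ("a", "b"): 0.010,
        ("b", "c"): 0.014,
        ("a", "c"): 0.030,
        ("d", "e"): 0.020,
    }
    ties = transmission_ties(distances)
    print({seq: len(partners) for seq, partners in ties.items()})
    print(clusters_from_ties(ties))
```

In this sketch the number of ties per person is the size of that person's neighbour set, and a sequence with no tie at or below the threshold belongs to no cluster.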
Young star clusters in circumnuclear starburst rings
NASA Astrophysics Data System (ADS)
de Grijs, Richard; Ma, Chao; Jia, Siyao; Ho, Luis C.; Anders, Peter
2017-03-01
We analyse the cluster luminosity functions (CLFs) of the youngest star clusters in two galaxies exhibiting prominent circumnuclear starburst rings. We focus specifically on NGC 1512 and NGC 6951, for which we have access to Hα data that allow us to unambiguously identify the youngest sample clusters. To place our results on a firm statistical footing, we first explore in detail a number of important technical issues affecting the process of converting the observational data into the spectral energy distributions of the objects in our final catalogues. The CLFs of the young clusters in both galaxies exhibit approximate power-law behaviour down to the 90 per cent observational completeness limits, thus showing that star cluster formation in the violent environments of starburst rings appears to proceed similarly to that elsewhere in the local Universe. We discuss this result in the context of the density of the interstellar medium in our starburst-ring galaxies.
ADHD latent class clusters: DSM-IV subtypes and comorbidity
Elia, Josephine; Arcos-Burgos, Mauricio; Bolton, Kelly L.; Ambrosini, Paul J.; Berrettini, Wade; Muenke, Maximilian
2014-01-01
ADHD (Attention Deficit Hyperactivity Disorder) has a complex, heterogeneous phenotype only partially captured by Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. In this report, latent class analyses (LCA) are used to identify ADHD phenotypes using K-SADS-IVR (Schedule for Affective Disorders & Schizophrenia for School Age Children-IV-Revised) symptoms and symptom severity data from a clinical sample of 500 ADHD subjects, ages 6–18, participating in an ADHD genetic study. Results show that LCA identified six separate ADHD clusters, some corresponding to specific DSM-IV subtypes while others included several subtypes. DSM-IV comorbid anxiety and mood disorders were generally similar across all clusters, and subjects without comorbidity did not aggregate within any one cluster. Age and gender composition also varied. These results support findings from population-based LCA studies. The six clusters provide additional homogeneous groups that can be used to define ADHD phenotypes in genetic association studies. The limited age ranges aggregating in the different clusters may prove to be a particular advantage in genetic studies where candidate gene expression may vary during developmental phases. DSM-IV comorbid mood and anxiety disorders also do not appear to increase cluster heterogeneity; however, longitudinal studies that cover the period of risk are needed to support this finding. PMID:19900717
DMINDA: an integrated web server for DNA motif identification and analyses
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-01-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
Review of methods for handling confounding by cluster and informative cluster size in clustered data
Seaman, Shaun; Pavlou, Menelaos; Copas, Andrew
2014-01-01
Clustered data are common in medical research. Typically, one is interested in a regression model for the association between an outcome and covariates. Two complications that can arise when analysing clustered data are informative cluster size (ICS) and confounding by cluster (CBC). ICS and CBC mean that the outcome of a member given its covariates is associated with, respectively, the number of members in the cluster and the covariate values of other members in the cluster. Standard generalised linear mixed models for cluster-specific inference and standard generalised estimating equations for population-average inference assume, in general, the absence of ICS and CBC. Modifications of these approaches have been proposed to account for CBC or ICS. This article is a review of these methods. We express their assumptions in a common format, thus providing greater clarity about the assumptions that methods proposed for handling CBC make about ICS and vice versa, and about when different methods can be used in practice. We report relative efficiencies of methods where available, describe how methods are related, identify a previously unreported equivalence between two key methods, and propose some simple additional methods. Unnecessarily using a method that allows for ICS/CBC has an efficiency cost when ICS and CBC are absent. We review tools for identifying ICS/CBC. A strategy for analysis when CBC and ICS are suspected is demonstrated by examining the association between socio-economic deprivation and preterm neonatal death in Scotland. PMID:25087978
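A small numerical sketch can make informative cluster size concrete. It contrasts a mean taken over all members with a mean taken over clusters, which is the idea behind inverse cluster-size weighting. The data are hypothetical, and the sketch is not one of the specific estimators reviewed in the article, whose details the abstract does not give.

```python
def member_weighted_mean(clusters):
    """Mean outcome over all members, so large clusters dominate."""
    members = [y for cluster in clusters for y in cluster]
    return sum(members) / len(members)

def cluster_weighted_mean(clusters):
    """Mean of cluster means, so each cluster counts once.

    Equivalent to weighting each member by 1 / (size of its cluster).
    """
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)

if __name__ == "__main__":
    # Hypothetical binary outcomes: the large cluster has a higher event rate,
    # so the outcome is associated with cluster size.
    clusters = [[1, 1, 1, 1], [0]]
    print(member_weighted_mean(clusters))
    print(cluster_weighted_mean(clusters))
```

When the outcome is unrelated to cluster size the two quantities estimate the same thing. When they diverge, as in this toy example, the choice between them determines whether inference targets a typical member or a typical cluster.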
Patterns of gender equality at workplaces and psychological distress.
Elwér, Sofia; Harryson, Lisa; Bolin, Malin; Hammarström, Anne
2013-01-01
Research in the field of occupational health often uses a risk factor approach which has been criticized by feminist researchers for not considering the combination of many different variables that are at play simultaneously. To overcome this shortcoming this study aims to identify patterns of gender equality at workplaces and to investigate how these patterns are associated with psychological distress. Questionnaire data from the Northern Swedish Cohort (n = 715) have been analysed and supplemented with register data about the participants' workplaces. The register data were used to create gender equality indicators of women/men ratios of number of employees, educational level, salary and parental leave. Cluster analysis was used to identify patterns of gender equality at the workplaces. Differences in psychological distress between the clusters were analysed by chi-square test and logistic regression analyses, adjusting for individual socio-demographics and previous psychological distress. The cluster analysis resulted in six distinctive clusters with different patterns of gender equality at the workplaces that were associated with psychological distress for women but not for men. For women the highest odds of psychological distress were found at traditionally gender unequal workplaces. The lowest overall occurrence of psychological distress, as well as the same occurrence for women and men, was found at the most gender equal workplaces. The results from this study support the convergence hypothesis, as gender equality at the workplace relates not only to better mental health for women but also to a more similar occurrence of mental ill-health between women and men. This study highlights the importance of utilizing a multidimensional view of gender equality to understand its association with health outcomes.
Health policies need to consider gender equality at the workplace level as a social determinant of health that is of importance for reducing differences in health outcomes for women and men.
Dennis, Ann M; Hué, Stephane; Learner, Emily; Sebastian, Joseph; Miller, William C; Eron, Joseph J
2017-01-01
HIV-1 diversity is increasing in North American and European cohorts which may have public health implications. However, little is known about non-B subtype diversity in the southern United States, despite the region being the epicenter of the nation's epidemic. We characterized HIV-1 diversity and transmission clusters to identify the extent to which non-B strains are transmitted locally. We conducted cross-sectional analyses of HIV-1 partial pol sequences collected from 1997 to 2014 from adults accessing routine clinical care in North Carolina (NC). Subtypes were evaluated using COMET and phylogenetic analysis. Putative transmission clusters were identified using maximum-likelihood trees. Clusters involving non-B strains were confirmed and their dates of origin were estimated using Bayesian phylogenetics. Data were combined with demographic information collected at the time of sample collection and country of origin for a subset of patients. Among 24,972 sequences from 15,246 persons, the non-B subtype prevalence increased from 0% to 3.46% over the study period. Of 325 persons with non-B subtypes, diversity was high with over 15 pure subtypes and recombinants; subtype C (28.9%) and CRF02_AG (24.0%) were most common. While identification of transmission clusters was lower for persons with non-B versus B subtypes, several local transmission clusters (≥3 persons) involving non-B subtypes were identified and all were presumably due to heterosexual transmission. Prevalence of non-B subtype diversity remains low in NC but a statistically significant rise was identified over time which likely reflects multiple importation. However, the combined phylogenetic clustering analysis reveals evidence for local onward transmission. Detection of these non-B clusters suggests heterosexual transmission and may guide diagnostic and prevention interventions.
NASA Astrophysics Data System (ADS)
Leckebusch, G. C.; Kirchner-Bossi, N. O.; Befort, D. J.; Ulbrich, U.
2015-12-01
Time-clustered mid-latitude winter storms are responsible for a large portion of the overall windstorm-related damage in Europe. Their study is thus of high meteorological interest, and its outcome can be of crucial utility for the (re)insurance industry. In addition to existing cyclone-based studies, here we use an event identification approach based on near-surface wind speeds only, to investigate windstorm clustering and compare it to cyclone clustering. Specifically, cyclone and windstorm tracks are identified for winter 1979-2013 (Oct-Mar), to perform two sensitivity analyses on event-clustering in the North Atlantic using ERA-Interim Reanalysis. First, the link between clustering and cyclone intensity is analysed and compared to windstorms. Secondly, the sensitivity of clustering on intra-seasonal time scales is investigated, for both cyclones and windstorms. The wind-based approach reveals additional regions of clustering over Western Europe, which could be related to extreme damages, showing the added value of investigating wind field derived tracks in addition to that of cyclone tracks. Previous studies indicate a higher degree of clustering for stronger cyclones. However, our results show that this assumption is not always met. Although a positive relationship is confirmed for the clustering centre located over Iceland, clustering off the coast of the Iberian Peninsula behaves in the opposite way. Even though this region shows the highest clustering, most of its signal is due to cyclones with intensities below the 70th percentile of the Laplacian of MSLP. Results on the sensitivity of clustering to the time of the winter season (Oct-Mar) show a temporal evolution of the clustering patterns, for both windstorms and cyclones. Compared to all cyclones, clustering of windstorms and of the strongest cyclones culminates around February, while clustering of all cyclones peaks in December to January.
Update of membership and mean proper motion of open clusters from UCAC5 catalog
NASA Astrophysics Data System (ADS)
Dias, W. S.; Monteiro, H.; Assafin, M.
2018-06-01
We present mean proper motions and membership probabilities of individual stars for optically visible open clusters, which have been determined using data from the UCAC5 catalog. This follows our previous studies with the UCAC2 and UCAC4 catalogs, but now using improved proper motions in the GAIA reference frame. In the present study results were obtained for a sample of 1108 open clusters. For five clusters, this is the first determination of mean proper motion, and for the whole sample, we present results with a much larger number of identified astrometric member stars than in previous studies. It is the last update of our Open cluster Catalog based on proper motion data only. Future updates will rely on astrometric, photometric and spectroscopic GAIA data as input for analyses.
Ecological tolerances of Miocene larger benthic foraminifera from Indonesia
NASA Astrophysics Data System (ADS)
Novak, Vibor; Renema, Willem
2018-01-01
To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis which includes environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples; clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 with dominance of BW samples, and cluster 5 showing a mixed composition with both MCS and BW samples. The results of cluster analysis were afterwards subjected to indicator species analysis resulting in the interpretation that generated three groups among LBF taxa: typical assemblage indicators, regularly occurring taxa and rare taxa. By interpreting the results of detrended correspondence analysis, cluster analysis and indicator species analysis, along with environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.
Morphology delimits more species than molecular genetic clusters of invasive Pilosella.
Moffat, Chandra E; Ensing, David J; Gaskin, John F; De Clerck-Floate, Rosemarie A; Pither, Jason
2015-07-01
• Accurate assessments of biodiversity are paramount for understanding ecosystem processes and adaptation to change. Invasive species often contribute substantially to local biodiversity; correctly identifying and distinguishing invaders is thus necessary to assess their potential impacts. We compared the reliability of morphology and molecular sequences to discriminate six putative species of invasive Pilosella hawkweeds (syn. Hieracium, Asteraceae), known for unreliable identifications and historical introgression. We asked (1) which morphological traits dependably discriminate putative species, (2) if genetic clusters supported morphological species, and (3) if novel hybridizations occur in the invaded range. • We assessed 33 morphometric characters for their discriminatory power using the randomForest classifier and, using AFLPs, evaluated genetic clustering with the program structure and subsequently with an AMOVA. The strength of the association between morphological and genotypic dissimilarity was assessed with a Mantel test. • Morphometric analyses delimited six species while genetic analyses defined only four clusters. Specifically, we found (1) eight morphological traits could reliably distinguish species, (2) structure suggested strong genetic differentiation but for only four putative species clusters, and (3) genetic data suggest both novel hybridizations and multiple introductions have occurred. • (1) Traditional floristic techniques may resolve more species than molecular analyses in taxonomic groups subject to introgression. (2) Even within complexes of closely related species, relatively few but highly discerning morphological characters can reliably discriminate species. (3) By clarifying patterns of morphological and genotypic variation of invasive Pilosella, we lay foundations for further ecological study and mitigation. © 2015 Botanical Society of America, Inc.
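The Mantel test mentioned in this abstract can be sketched as a permutation test on the correlation between two distance matrices. This is a minimal illustration, not the authors' analysis: the function names, the one-sided p-value convention, the number of permutations, and the toy matrices are all assumptions made for the example.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering its items."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = _upper(d1, identity)
    r_obs = _pearson(x, _upper(d2, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the items of the second matrix
        if _pearson(x, _upper(d2, order)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five specimens whose "genetic" distances are a
# linear function of their "morphological" distances.
positions = [0, 1, 3, 7, 15]
morph = [[abs(a - b) for b in positions] for a in positions]
genetic = [[2 * abs(a - b) + (1 if a != b else 0) for b in positions] for a in positions]
r, p = mantel(morph, genetic)
```

Whole items are permuted, rather than individual matrix entries, because the entries of a distance matrix are not independent of one another.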
Hebels, Dennie G A J; Rasche, Axel; Herwig, Ralf; van Westen, Gerard J P; Jennen, Danyel G J; Kleinjans, Jos C S
2016-01-01
When evaluating compound similarity, addressing multiple sources of information to reach conclusions about common pharmaceutical and/or toxicological mechanisms of action is a crucial strategy. In this chapter, we describe a systems biology approach that incorporates analyses of hepatotoxicant data for 33 compounds from three different sources: a chemical structure similarity analysis based on the 3D Tanimoto coefficient, a chemical structure-based protein target prediction analysis, and a cross-study/cross-platform meta-analysis of in vitro and in vivo human and rat transcriptomics data derived from public resources (i.e., the diXa data warehouse). Hierarchical clustering of the outcome scores of the separate analyses did not result in a satisfactory grouping of compounds considering their known toxic mechanism as described in literature. However, a combined analysis of multiple data types may hypothetically compensate for missing or unreliable information in any of the single data types. We therefore performed an integrated clustering analysis of all three data sets using the R-based tool iClusterPlus. This indeed improved the grouping results. The compound clusters that were formed by means of iClusterPlus represent groups that show similar gene expression while simultaneously integrating a similarity in structure and protein targets, which corresponds much better with the known mechanism of action of these toxicants. Using an integrative systems biology approach may thus overcome the limitations of the separate analyses when grouping liver toxicants sharing a similar mechanism of toxicity.
Schramm, Catherine; Vial, Céline; Bachoud-Lévi, Anne-Catherine; Katsahian, Sandrine
2018-01-01
Heterogeneity in treatment efficacy is a major concern in clinical trials. Clustering may help to identify the treatment responders and the non-responders. In the context of longitudinal cluster analyses, sample size and variability of the times of measurements are the main issues with the current methods. Here, we propose a new two-step method for the Clustering of Longitudinal data by using an Extended Baseline. The first step relies on a piecewise linear mixed model for repeated measurements with a treatment-time interaction. The second step clusters the random predictions and considers several parametric (model-based) and non-parametric (partitioning, ascendant hierarchical clustering) algorithms. A simulation study compares all options of the clustering of longitudinal data by using an extended baseline method with the latent-class mixed model. The clustering of longitudinal data by using an extended baseline method with the two model-based algorithms was the more robust model. The clustering of longitudinal data by using an extended baseline method with all the non-parametric algorithms failed when there were unequal variances of treatment effect between clusters or when the subgroups had unbalanced sample sizes. The latent-class mixed model failed when the between-patients slope variability is high. Two real data sets on neurodegenerative disease and on obesity illustrate the clustering of longitudinal data by using an extended baseline method and show how clustering may help to identify the marker(s) of the treatment response. The application of the clustering of longitudinal data by using an extended baseline method in exploratory analysis as the first stage before setting up stratified designs can provide a better estimation of treatment effect in future clinical trials.
Nghia, Nguyen Anh; Kadir, Jugah; Sunderasan, E; Puad Abdullah, Mohd; Malik, Adam; Napis, Suhaimi
2008-10-01
Morphological features and Inter Simple Sequence Repeat (ISSR) polymorphism were employed to analyse 21 Corynespora cassiicola isolates obtained from a number of Hevea clones grown in rubber plantations in Malaysia. The C. cassiicola isolates used in this study were collected from several states in Malaysia from 1998 to 2005. The morphology of the isolates was characteristic of that previously described for C. cassiicola. Variations in colony and conidial morphology were observed not only among isolates but also within a single isolate, with no relation to either the clonal or the geographical origin of the isolates. ISSR analysis delineated the isolates into two distinct clusters. The dendrogram created from UPGMA analysis based on Nei and Li's coefficient (calculated from the binary matrix data of 106 amplified DNA bands generated from 8 ISSR primers) showed that cluster 1 encompasses 12 isolates from the states of Johor and Selangor, while cluster 2 comprises 9 isolates that were obtained from the other states. Cluster 1 was further split into 2 sub-clusters (1A and 1B); sub-cluster 1B consists of a single unique isolate, CKT05D. Detached leaf assay performed on selected Hevea clones showed that the pathogenicity of representative isolates from cluster 1 (with the exception of CKT05D) resembled that of race 1, and isolates in cluster 2 showed pathogenicity similar to race 2 of the fungus that was previously identified in Malaysia. The isolate CKT05D from sub-cluster 1B showed pathogenicity dissimilar to either race 1 or race 2.
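As a rough illustration of the similarity measure named in this abstract, Nei and Li's coefficient for two isolates scored on presence/absence of bands is twice the number of shared bands divided by the total number of bands in both isolates. The sketch below assumes binary band profiles; the function name and the two six-band profiles are hypothetical and are not data from the study.

```python
def nei_li_similarity(a, b):
    """Nei and Li's coefficient: 2 * shared bands / (bands in a + bands in b)."""
    if len(a) != len(b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("neither profile contains any band")
    return 2 * shared / total


# Hypothetical presence (1) / absence (0) scores for six ISSR bands.
isolate_a = [1, 1, 0, 1, 0, 1]
isolate_b = [1, 0, 0, 1, 1, 1]
similarity = nei_li_similarity(isolate_a, isolate_b)  # 2 * 3 / (4 + 4) = 0.75
```

A matrix of `1 - similarity` values for all isolate pairs would be the usual input to an average-linkage (UPGMA) clustering step.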
A Network-Based Algorithm for Clustering Multivariate Repeated Measures Data
NASA Technical Reports Server (NTRS)
Koslovsky, Matthew; Arellano, John; Schaefer, Caroline; Feiveson, Alan; Young, Millennia; Lee, Stuart
2017-01-01
The National Aeronautics and Space Administration (NASA) Astronaut Corps is a unique occupational cohort for which vast amounts of measures data have been collected repeatedly in research or operational studies pre-, in-, and post-flight, as well as during multiple clinical care visits. In exploratory analyses aimed at generating hypotheses regarding physiological changes associated with spaceflight exposure, such as impaired vision, it is of interest to identify anomalies and trends across these expansive datasets. Multivariate clustering algorithms for repeated measures data may help parse the data to identify homogeneous groups of astronauts that have higher risks for a particular physiological change. However, available clustering methods may not be able to accommodate the complex data structures found in NASA data, since the methods often rely on strict model assumptions, require equally-spaced and balanced assessment times, cannot accommodate missing data or differing time scales across variables, and cannot process continuous and discrete data simultaneously. To fill this gap, we propose a network-based, multivariate clustering algorithm for repeated measures data that can be tailored to fit various research settings. Using simulated data, we demonstrate how our method can be used to identify patterns in complex data structures found in practice.
NASA Astrophysics Data System (ADS)
Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.
2014-06-01
Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods which have been recently employed to analyse PNSD data, however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K-means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and silhouette width validation values and the K-means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectra to its source and the GAM was suitable to parameterise the PNSD data. 
These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
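The silhouette width used above as a validation value can be sketched in a few lines. This is a simplified illustration for one-dimensional points with absolute-difference distances, not the study's PNSD workflow; the function name, the toy points, and both labelings are hypothetical.

```python
def silhouette_width(points, labels):
    """Mean silhouette width for 1-D points under a given cluster labeling."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(p - points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated groups.
points = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
good = silhouette_width(points, [0, 0, 0, 1, 1, 1])
bad = silhouette_width(points, [0, 1, 0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters. Comparing the mean width across candidate numbers of clusters is one common way to choose among solutions.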
NASA Astrophysics Data System (ADS)
Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.
2014-11-01
Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods that have been recently employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and Silhouette width validation values and the K means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source and the GAM was suitable to parameterise the PNSD data. 
These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
Characteristics of airflow and particle deposition in COPD current smokers
NASA Astrophysics Data System (ADS)
Zou, Chunrui; Choi, Jiwoong; Haghighi, Babak; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long
2017-11-01
A recent imaging-based cluster analysis of computed tomography (CT) lung images in a chronic obstructive pulmonary disease (COPD) cohort identified four clusters, viz. disease sub-populations. Cluster 1 had relatively normal airway structures; Cluster 2 had wall thickening; Cluster 3 exhibited decreased wall thickness and luminal narrowing; Cluster 4 had a significant decrease of luminal diameter and a significant reduction of lung deformation, thus having relatively low pulmonary functions. To better understand the characteristics of airflow and particle deposition in these clusters, we performed computational fluid and particle dynamics analyses on representative cluster patients and healthy controls using CT-based airway models and subject-specific 3D-1D coupled boundary conditions. The results show that particle deposition in central airways of cluster 4 patients was noticeably increased especially with increasing particle size despite reduced vital capacity as compared to other clusters and healthy controls. This may be attributable in part to significant airway constriction in cluster 4. This study demonstrates the potential application of cluster-guided CFD analysis in disease populations. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837.
Spatial cluster analysis of human cases of Crimean Congo hemorrhagic fever reported in Pakistan.
Abbas, Tariq; Younus, Muhammad; Muhammad, Sayyad Aun
2015-01-01
Crimean Congo hemorrhagic fever (CCHF) is a tick-borne viral zoonotic disease that has been reported in almost all geographic regions of Pakistan. The aim of this study was to identify spatial clusters of human cases of CCHF reported in the country. Kulldorff's spatial scan statistic, Anselin's Local Moran's I, and Getis-Ord Gi* tests were applied to the data (i.e. the number of laboratory-confirmed cases reported from each district during the year 2013). The analyses revealed a large multi-district cluster of high CCHF incidence in the uplands of Balochistan province near its border with Afghanistan. The cluster comprised the following districts: Qilla Abdullah, Qilla Saifullah, Loralai, Quetta, Sibi, Chagai, and Mastung. Another cluster was detected in Punjab and included Rawalpindi district and a part of Islamabad. We provide empirical evidence of spatial clustering of human CCHF cases in the country. The districts in the clusters should be given priority in surveillance, control programs, and further research.
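One of the tests named in this abstract, the Getis-Ord Gi* statistic, can be sketched as a z-score comparing the weighted sum of values around each location with what the global mean would predict. The code below is a minimal illustration of the standard Gi* formula, not the authors' workflow; the case counts, the chain-of-neighbours weights, and the function name are hypothetical.

```python
from math import sqrt


def getis_ord_gi_star(values, weights):
    """Gi* z-score for each location; weights[i][j] includes the location itself."""
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean * mean)
    if s == 0:
        raise ValueError("all values are identical; Gi* is undefined")
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        if denominator == 0:
            raise ValueError("weights cover every location equally; Gi* is undefined")
        scores.append(numerator / denominator)
    return scores


# Hypothetical case counts for five districts arranged in a chain.
cases = [10, 9, 1, 1, 1]
# Binary weights: each district neighbours itself and its adjacent districts.
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(5)] for i in range(5)]
scores = getis_ord_gi_star(cases, weights)
```

Large positive scores mark candidate hot spots and large negative scores mark cold spots. In practice the scores are compared against a normal reference distribution, usually with a correction for multiple testing.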
Jang, Yuri; Lee, Beom S.; Ko, Jung Eun; Haley, William E.; Chiriboga, David A.
2015-01-01
Objectives. In the context of social convoy theory, the purposes of the study were (a) to identify an empirical typology of the social networks evident in older Korean immigrants and (b) to examine its association with self-rated health and depressive symptoms. Method. The sample consisted of 1,092 community-dwelling older Korean immigrants in Florida and New York. Latent class analyses were conducted to identify the optimal social network typology based on 8 indicators of interpersonal relationships and activities. Bivariate and multivariate analyses were conducted to examine how the identified social network typology was associated with self-rating of health and depressive symptoms. Results. Results from the latent class analysis identified 6 clusters as being most optimal, and they were named diverse, unmarried/diverse, married/coresidence, family focused, unmarried/restricted, and restricted. Memberships in the clusters of diverse and married/coresidence were significantly associated with more favorable ratings of health and lower levels of depressive symptoms. Discussion. Notably, no distinct network solely composed of friends was identified in the present sample of older immigrants; this may reflect the disruptions in social convoys caused by immigration. The findings of this study promote our understanding of the unique patterns of social connectedness in older immigrants. PMID:23887929
Naushad, Sohail; Adeolu, Mobolaji; Goel, Nisha; Khadka, Bijendra; Al-Dahwi, Aqeel; Gupta, Radhey S.
2015-01-01
The genera Actinobacillus, Haemophilus, and Pasteurella exhibit extensive polyphyletic branching in phylogenetic trees and do not represent coherent clusters of species. In this study, we have utilized molecular signatures identified through comparative genomic analyses in conjunction with genome based and multilocus sequence based phylogenetic analyses to clarify the phylogenetic and taxonomic boundary of these genera. We have identified large clusters of Actinobacillus, Haemophilus, and Pasteurella species which represent the “sensu stricto” members of these genera. We have identified 3, 7, and 6 conserved signature indels (CSIs), which are specifically shared by sensu stricto members of Actinobacillus, Haemophilus, and Pasteurella, respectively. We have also identified two different sets of CSIs that are unique characteristics of the pathogen containing genera Aggregatibacter and Mannheimia, respectively. It is now possible to demarcate the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of discrete molecular signatures. The other members of the genera Actinobacillus, Haemophilus, and Pasteurella that do not fall within the “sensu stricto” clades and do not contain these molecular signatures should be reclassified as other genera. The CSIs identified here also provide useful diagnostic targets for the identification of current and novel members of the indicated genera. PMID:25821780
Robustness of serial clustering of extra-tropical cyclones to the choice of tracking method
NASA Astrophysics Data System (ADS)
Pinto, Joaquim G.; Ulbrich, Sven; Karremann, Melanie K.; Stephenson, David B.; Economou, Theodoros; Shaffrey, Len C.
2016-04-01
Cyclone families are a frequent synoptic weather feature in the Euro-Atlantic area in winter. Given appropriate large-scale conditions, the occurrence of such series (clusters) of storms may lead to large socio-economic impacts and cumulative losses. Recent studies analyzing Reanalysis data using single cyclone tracking methods have shown that serial clustering of cyclones occurs on both flanks and downstream regions of the North Atlantic storm track. This study explores the sensitivity of serial clustering to the choice of tracking method. With this aim, the IMILAST cyclone track database based on ERA-interim data is analysed. Clustering is estimated by the dispersion (ratio of variance to mean) of winter (DJF) cyclones passages near each grid point over the Euro-Atlantic area. Results indicate that while the general pattern of clustering is identified for all methods, there are considerable differences in detail. This can primarily be attributed to the differences in the variance of cyclone counts between the methods, which range up to one order of magnitude. Nevertheless, clustering over the Eastern North Atlantic and Western Europe can be identified for all methods and can thus be generally considered as a robust feature. The statistical links between large-scale patterns like the NAO and clustering are obtained for all methods, though with different magnitudes. We conclude that the occurrence of cyclone clustering over the Eastern North Atlantic and Western Europe is largely independent from the choice of tracking method and hence from the definition of a cyclone.
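The dispersion measure described above, the ratio of variance to mean of cyclone counts, can be sketched as follows. For counts from a homogeneous Poisson process the ratio is 1, so values above 1 indicate serial clustering and values below 1 indicate more regular occurrence. The function name, the use of the population variance, and the per-winter counts are assumptions for illustration only.

```python
from statistics import mean, pvariance


def dispersion(counts):
    """Variance-to-mean ratio of per-winter cyclone counts at one grid point."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; dispersion is undefined")
    return pvariance(counts) / m


# Hypothetical per-winter counts at two grid points with the same mean (4.5).
regular = [4, 5, 4, 5, 4, 5, 4, 5]       # evenly spread passages
clustered = [0, 12, 1, 0, 11, 2, 0, 10]  # passages bunched into a few winters

d_regular = dispersion(regular)
d_clustered = dispersion(clustered)
```

Because the statistic depends on the variance of the counts, tracking methods that detect different numbers of cyclones can yield quite different magnitudes even where the spatial pattern agrees.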
Breland, Jessica Y; Hundt, Natalie E; Barrera, Terri L; Mignogna, Joseph; Petersen, Nancy J; Stanley, Melinda A; Cully, Jeffery A
2015-10-01
Treatment of chronic obstructive pulmonary disease (COPD) is palliative, and quality of life is important. Increased understanding of correlates of quality of life and its domains could help clinicians and researchers better tailor COPD treatments and better support patients engaging in those treatments or other important self-management behaviors. Anxiety is common in those with COPD; however, overlap of physical and emotional symptoms complicates its assessment. The current study aimed to identify anxiety symptom clusters and to assess the association of these symptom clusters with COPD-related quality of life. Participants (N = 162) with COPD completed the Beck Anxiety Inventory (BAI), Chronic Respiratory Disease Questionnaire, Patient Health Questionnaire-9, and Medical Research Council dyspnea scale. Anxiety clusters were identified, using principal component analysis (PCA) on the BAI's 21 items. Anxiety clusters, along with factors previously associated with quality of life, were entered into a multiple regression designed to predict COPD-related quality of life. PCA identified four symptom clusters related to (1) general somatic distress, (2) fear, (3) nervousness, and (4) respiration-related distress. Multiple regression analyses indicated that greater fear was associated with less perceived mastery over COPD (β = -0.19, t(149) = -2.69, p < 0.01). Anxiety symptoms associated with fear appear to be an important indicator of anxiety in patients with COPD. In particular, fear was associated with perceptions of mastery, an important psychological construct linked to disease self-management. Assessing the BAI symptom cluster associated with fear (five items) may be a valuable rapid assessment tool to improve COPD treatment and physical health outcomes.
Evolution of specifier proteins in glucosinolate-containing plants
2012-01-01
Background The glucosinolate-myrosinase system is an activated chemical defense system found in plants of the Brassicales order. Glucosinolates are stored separately from their hydrolytic enzymes, the myrosinases, in plant tissues. Upon tissue damage, e.g. by herbivory, glucosinolates and myrosinases get mixed and glucosinolates are broken down to an array of biologically active compounds of which isothiocyanates are toxic to a wide range of organisms. Specifier proteins occur in some, but not all glucosinolate-containing plants and promote the formation of biologically active non-isothiocyanate products upon myrosinase-catalyzed glucosinolate breakdown. Results Based on a phytochemical screening among representatives of the Brassicales order, we selected candidate species for identification of specifier protein cDNAs. We identified ten specifier proteins from a range of species of the Brassicaceae and assigned each of them to one of the three specifier protein types (NSP, nitrile-specifier protein, ESP, epithiospecifier protein, TFP, thiocyanate-forming protein) after heterologous expression in Escherichia coli. Together with nine known specifier proteins and three putative specifier proteins found in databases, we subjected the newly identified specifier proteins to phylogenetic analyses. Specifier proteins formed three major clusters, named AtNSP5-cluster, AtNSP1-cluster, and ESP/TFP cluster. Within the ESP/TFP cluster, specifier proteins grouped according to the Brassicaceae lineage they were identified from. Non-synonymous vs. synonymous substitution rate ratios suggested purifying selection to act on specifier protein genes. Conclusions Among specifier proteins, NSPs represent the ancestral activity. The data support a monophyletic origin of ESPs from NSPs. The split between NSPs and ESPs/TFPs happened before the radiation of the core Brassicaceae. 
Future analyses have to show if TFP activity evolved from ESPs at least twice independently in different Brassicaceae lineages as suggested by the phylogeny. The ability to form non-isothiocyanate products by specifier protein activity may provide plants with a selective advantage. The evolution of specifier proteins in the Brassicaceae demonstrates the plasticity of secondary metabolism within an activated plant defense system. PMID:22839361
Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.
Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio
Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
GALAXY CLUSTERS DISCOVERED VIA THE SUNYAEV-ZEL'DOVICH EFFECT IN THE 2500-SQUARE-DEGREE SPT-SZ SURVEY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bleem, L. E.; Stalder, B.; de Haan, T.
2015-01-29
We present a catalog of galaxy clusters selected via their Sunyaev-Zel'dovich (SZ) effect signature from 2500 deg² of South Pole Telescope (SPT) data. This work represents the complete sample of clusters detected at high significance in the 2500 deg² SPT-SZ survey, which was completed in 2011. A total of 677 (409) cluster candidates are identified above a signal-to-noise threshold of ξ = 4.5 (5.0). Ground- and space-based optical and near-infrared (NIR) imaging confirms overdensities of similarly colored galaxies in the direction of 516 (or 76%) of the ξ > 4.5 candidates and 387 (or 95%) of the ξ > 5 candidates; the measured purity is consistent with expectations from simulations. Of these confirmed clusters, 415 were first identified in SPT data, including 251 new discoveries reported in this work. We estimate photometric redshifts for all candidates with identified optical and/or NIR counterparts; we additionally report redshifts derived from spectroscopic observations for 141 of these systems. The mass threshold of the catalog is roughly independent of redshift above z ~ 0.25, leading to a sample of massive clusters that extends to high redshift. The median mass of the sample is $M_{500c}(\rho_{\mathrm{crit}}) \sim 3.5\times 10^{14}\,M_\odot\,h_{70}^{-1}$, the median redshift is $z_{\mathrm{med}} = 0.55$, and the highest-redshift systems are at z > 1.4. The combination of large redshift extent, clean selection, and high typical mass makes this cluster sample of particular interest for cosmological analyses and studies of cluster formation and evolution.
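The confirmation percentages quoted in this abstract follow directly from the candidate and confirmed counts it reports. The short check below reproduces them; the function name is a hypothetical helper, and only numbers stated in the abstract are used.

```python
def confirmed_fraction(confirmed, candidates):
    """Fraction of SZ-selected candidates with an optical/NIR confirmation."""
    return confirmed / candidates


# Counts as stated in the abstract for the two signal-to-noise thresholds.
frac_45 = confirmed_fraction(516, 677)  # candidates with xi > 4.5
frac_50 = confirmed_fraction(387, 409)  # candidates with xi > 5

print(f"xi > 4.5: {frac_45:.0%}, xi > 5: {frac_50:.0%}")
```

The higher fraction at the stricter threshold is what one would expect if false detections are concentrated at low signal-to-noise.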
GALAXY CLUSTERS DISCOVERED VIA THE SUNYAEV-ZEL'DOVICH EFFECT IN THE 2500-SQUARE-DEGREE SPT-SZ SURVEY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bleem, L. E.; Carlstrom, J. E.; Chang, C. L.
2015-02-01
We present a catalog of galaxy clusters selected via their Sunyaev-Zel'dovich (SZ) effect signature from 2500 deg² of South Pole Telescope (SPT) data. This work represents the complete sample of clusters detected at high significance in the 2500 deg² SPT-SZ survey, which was completed in 2011. A total of 677 (409) cluster candidates are identified above a signal-to-noise threshold of ξ = 4.5 (5.0). Ground- and space-based optical and near-infrared (NIR) imaging confirms overdensities of similarly colored galaxies in the direction of 516 (or 76%) of the ξ > 4.5 candidates and 387 (or 95%) of the ξ > 5 candidates; the measured purity is consistent with expectations from simulations. Of these confirmed clusters, 415 were first identified in SPT data, including 251 new discoveries reported in this work. We estimate photometric redshifts for all candidates with identified optical and/or NIR counterparts; we additionally report redshifts derived from spectroscopic observations for 141 of these systems. The mass threshold of the catalog is roughly independent of redshift above z ∼ 0.25, leading to a sample of massive clusters that extends to high redshift. The median mass of the sample is $M_{500c}(\rho_{\rm crit}) \sim 3.5\times 10^{14}\,M_\odot\,h_{70}^{-1}$, the median redshift is $z_{\rm med} = 0.55$, and the highest-redshift systems are at z > 1.4. The combination of large redshift extent, clean selection, and high typical mass makes this cluster sample of particular interest for cosmological analyses and studies of cluster formation and evolution.
Carrol, N V; Gagon, J P
1983-01-01
Because of increasing competition, it is becoming more important that health care providers pursue consumer-based market segmentation strategies. This paper presents a methodology for identifying and describing consumer segments in health service markets, and demonstrates the use of the methodology by presenting a study of consumer segments in the ambulatory care pharmacy market.
Catchment classification by runoff behaviour with self-organizing maps (SOM)
NASA Astrophysics Data System (ADS)
Ley, R.; Casper, M. C.; Hellebrand, H.; Merz, R.
2011-09-01
Catchments show a wide range of response behaviour, even if they are adjacent. For many purposes it is necessary to characterise and classify them, e.g. for regionalisation, prediction in ungauged catchments, model parameterisation. In this study, we investigate hydrological similarity of catchments with respect to their response behaviour. We analyse more than 8200 event runoff coefficients (ERCs) and flow duration curves of 53 gauged catchments in Rhineland-Palatinate, Germany, for the period from 1993 to 2008, covering a huge variability of weather and runoff conditions. The spatio-temporal variability of event-runoff coefficients and flow duration curves are assumed to represent how different catchments "transform" rainfall into runoff. From the runoff coefficients and flow duration curves we derive 12 signature indices describing various aspects of catchment response behaviour to characterise each catchment. Hydrological similarity of catchments is defined by high similarities of their indices. We identify, analyse and describe hydrologically similar catchments by cluster analysis using Self-Organizing Maps (SOM). As a result of the cluster analysis we get five clusters of similarly behaving catchments where each cluster represents one differentiated class of catchments. As catchment response behaviour is supposed to be dependent on its physiographic and climatic characteristics, we compare groups of catchments clustered by response behaviour with clusters of catchments based on catchment properties. Results show an overlap of 67% between these two pools of clustered catchments which can be improved using the topologic correctness of SOMs.
KinFin: Software for Taxon-Aware Analysis of Clustered Protein Sequences.
Laetsch, Dominik R; Blaxter, Mark L
2017-10-05
The field of comparative genomics is concerned with the study of similarities and differences between the information encoded in the genomes of organisms. A common approach is to define gene families by clustering protein sequences based on sequence similarity, and analyze protein cluster presence and absence in different species groups as a guide to biology. Due to the high dimensionality of these data, downstream analysis of protein clusters inferred from large numbers of species, or species with many genes, is nontrivial, and few solutions exist for transparent, reproducible, and customizable analyses. We present KinFin, a streamlined software solution capable of integrating data from common file formats and delivering aggregative annotation of protein clusters. KinFin delivers analyses based on systematic taxonomy of the species analyzed, or on user-defined groupings of taxa, for example, sets based on attributes such as life history traits, organismal phenotypes, or competing phylogenetic hypotheses. Results are reported through graphical and detailed text output files. We illustrate the utility of the KinFin pipeline by addressing questions regarding the biology of filarial nematodes, which include parasites of veterinary and medical importance. We resolve the phylogenetic relationships between the species and explore functional annotation of proteins in clusters in key lineages and between custom taxon sets, identifying gene families of interest. KinFin can easily be integrated into existing comparative genomic workflows, and promotes transparent and reproducible analysis of clustered protein data. Copyright © 2017 Laetsch and Blaxter.
Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study.
Meghani, Salimah H; Knafl, George J
2017-02-10
To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of the cluster membership. This was a 3-month prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out-of-pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. The analyses found 4 unique clusters: Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For a subset, the main underlying concern was type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%) and type of analgesic side effects (cluster 4, 21%), respectively. About one in four made trade-offs based on multiple concerns simultaneously including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis, to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only the belief that pain medications can mask changes in health or keep you from knowing what is going on in your body was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)].
Most patients appear to be driven by a single salient concern in using analgesia for cancer pain. Addressing these concerns, perhaps through real time clinical assessments, may improve patients' analgesic adherence patterns and cancer pain outcomes.
Koren, Omry; Knights, Dan; Gonzalez, Antonio; Waldron, Levi; Segata, Nicola; Knight, Rob; Huttenhower, Curtis; Ley, Ruth E
2013-01-01
Recent analyses of human-associated bacterial diversity have categorized individuals into ‘enterotypes’ or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs. 16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes. PMID:23326225
Deep spectroscopy of nearby galaxy clusters - II. The Hercules cluster
NASA Astrophysics Data System (ADS)
Agulli, I.; Aguerri, J. A. L.; Diaferio, A.; Dominguez Palmero, L.; Sánchez-Janssen, R.
2017-06-01
We carried out deep spectroscopic observations of the nearby cluster A 2151 with AF2/WYFFOS@WHT. The caustic technique enables us to identify 360 members brighter than M_r = -16 and within 1.3 R_200. We separated the members into subsamples according to photometric and dynamical properties such as colour, local environment and infall time. The completeness of the catalogue and our large sample allow us to analyse the velocity dispersion and the luminosity functions (LFs) of the identified populations. We found evidence of a cluster still in its collapsing phase. The LF of the red population of A 2151 shows a deficit of dwarf red galaxies. Moreover, the normalized LFs of the red and blue populations of A 2151 are comparable to the red and blue LFs of the field, even if the blue galaxies start dominating 1 mag fainter and the red LF is well represented by a single Schechter function rather than a double Schechter function. We discuss how the evolution of cluster galaxies depends on their mass: bright and intermediate galaxies are mainly affected by dynamical friction and internal/mass quenching, while the evolution of dwarfs is driven by environmental processes that need time and a hostile cluster environment to remove the gas reservoirs and halt the star formation.
Health and disease phenotyping in old age using a cluster network analysis.
Valenzuela, Jesus Felix; Monterola, Christopher; Tong, Victor Joo Chuan; Ng, Tze Pin; Larbi, Anis
2017-11-15
Human ageing is a complex trait that involves the synergistic action of numerous biological processes that interact to form a complex network. Here we performed a network analysis to examine the interrelationships between physiological and psychological functions, disease, disability, quality of life, lifestyle and behavioural risk factors for ageing in a cohort of 3,270 subjects aged ≥55 years. We considered associations between numerical and categorical descriptors using effect-size measures for each variable pair and identified clusters of variables from the resulting pairwise effect-size network and minimum spanning tree. We show, by way of a correspondence analysis between the two sets of clusters, that they correspond to coarse-grained and fine-grained structure of the network relationships. The clusters obtained from the minimum spanning tree mapped to various conceptual domains and corresponded to physiological and syndromic states. Hierarchical ordering of these clusters identified six common themes based on interactions with physiological systems and common underlying substrates of age-associated morbidity and disease chronicity, functional disability, and quality of life. These findings provide a starting point for in-depth analyses of ageing that incorporate immunologic, metabolomic and proteomic biomarkers, and ultimately offer low-level-based typologies of healthy and unhealthy ageing.
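The minimum-spanning-tree step described above can be sketched as follows. This is only an illustration under stated assumptions: the variable names and effect sizes are hypothetical, and converting an effect size to a distance as 1 - |effect| is an assumption made here, not a detail given in the abstract.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def effect_size_mst(effect, names):
    """Return the edges of the minimum spanning tree of an effect-size network.

    effect: symmetric matrix of pairwise effect sizes in [0, 1).
    Strong associations become short distances (assumed: 1 - |effect|).
    """
    dist = 1.0 - np.abs(np.asarray(effect, dtype=float))
    np.fill_diagonal(dist, 0.0)  # csgraph treats zero entries as "no edge"
    tree = minimum_spanning_tree(dist).toarray()
    rows, cols = np.nonzero(tree)
    pairs = sorted((min(i, j), max(i, j)) for i, j in zip(rows, cols))
    return [(names[i], names[j]) for i, j in pairs]

# Hypothetical variables and effect sizes, for illustration only
names = ["gait", "grip", "mood", "sleep"]
effect = np.array([
    [0.0, 0.8, 0.2, 0.1],
    [0.8, 0.0, 0.3, 0.2],
    [0.2, 0.3, 0.0, 0.7],
    [0.1, 0.2, 0.7, 0.0],
])
mst_edges = effect_size_mst(effect, names)
```

The tree keeps the strongest link for each variable while still connecting all of them, so clusters can then be read off by removing its weakest edges.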
Nucleation of Hydrogen Deficient Carbon Clusters in Circumstellar Envelopes of Carbon Stars
NASA Astrophysics Data System (ADS)
Chiong, C. C.; Asvany, O.; Balucani, N.; Lee, Y. T.; Kaiser, R. I.
2001-04-01
Hydrogen-deficient carbon clusters HCn and H2Cn are thought to represent the crucial link between naked carbon clusters such as C2/C3, polycyclic aromatic hydrocarbons, and carbon-rich interstellar/circumstellar grains. To fully understand the astrophysical significance of these grain nuclei condensation processes, it is essential first to elucidate the detailed mechanisms by which these simple precursors form in the outflows of carbon-rich stars. For this reason, we initiated in our laboratory a systematic research program to investigate reactions of C2 and C3 clusters in their singlet X1Σ
Interpersonal Subtypes Within Social Anxiety: The Identification of Distinct Social Features.
Cooper, Danielle; Anderson, Timothy
2017-10-05
Although social anxiety disorder is defined by anxiety-related symptoms, little research has focused on the interpersonal features of social anxiety. Prior studies (Cain, Pincus, & Grosse Holtforth, 2010; Kachin, Newman, & Pincus, 2001) identified distinct subgroups of socially anxious individuals' interpersonal circumplex problems that were blends of agency and communion, and yet inconsistencies remain. We predicted 2 distinct interpersonal subtypes would exist for individuals with high social anxiety, and that these social anxiety subtypes would differ on empathetic concern, paranoia, received peer victimization, perspective taking, and emotional suppression. From a sample of 175 undergraduate participants, 51 participants with high social anxiety were selected as above a clinical cutoff on the social phobia scale. Cluster analyses identified 2 interpersonal subtypes of socially anxious individuals: low hostility-high submissiveness (Cluster 1) and high hostility-high submissiveness (Cluster 2). Cluster 1 reported higher levels of empathetic concern, lower paranoia, less peer victimization, and lower emotional suppression compared to Cluster 2. There were no differences between subtypes on perspective taking or cognitive reappraisal. Findings are consistent with an interpersonal conceptualization of social anxiety, and provide evidence of distinct social features between these subtypes. Findings have implications for the etiology, classification, and treatment of social anxiety.
The use of hierarchical clustering for the design of optimized monitoring networks
NASA Astrophysics Data System (ADS)
Soares, Joana; Makar, Paul Andrew; Aklilu, Yayne; Akingunola, Ayodeji
2018-05-01
Associativity analysis is a powerful tool to deal with large-scale datasets by clustering the data on the basis of (dis)similarity and can be used to assess the efficacy and design of air quality monitoring networks. We describe here our use of Kolmogorov-Zurbenko filtering and hierarchical clustering of NO2 and SO2 passive and continuous monitoring data to analyse and optimize air quality networks for these species in the province of Alberta, Canada. The methodology applied in this study assesses dissimilarity between monitoring station time series based on two metrics: 1 - R, R being the Pearson correlation coefficient, and the Euclidean distance; we find that both should be used in evaluating monitoring site similarity. We have combined the analytic power of hierarchical clustering with the spatial information provided by deterministic air quality model results, using the gridded time series of model output as potential station locations, as a proxy for assessing monitoring network design and for network optimization. We demonstrate that clustering results depend on the air contaminant analysed, reflecting the difference in the respective emission sources of SO2 and NO2 in the region under study. Our work shows that much of the signal identifying the sources of NO2 and SO2 emissions resides in shorter timescales (hourly to daily) due to short-term variation of concentrations and that longer-term averages in data collection may lose the information needed to identify local sources. However, the methodology identifies stations mainly influenced by seasonality, if larger timescales (weekly to monthly) are considered. We have performed the first dissimilarity analysis based on gridded air quality model output and have shown that the methodology is capable of generating maps of subregions within which a single station will represent the entire subregion, to a given level of dissimilarity. 
We have also shown that our approach is capable of identifying different sampling methodologies as well as outliers (stations' time series which are markedly different from all others in a given dataset).
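A minimal sketch of the dissimilarity-based hierarchical clustering described above, using synthetic series rather than the Alberta data. The average-linkage choice, the cut threshold, and all names are assumptions made for illustration; the Kolmogorov-Zurbenko filtering step is not included.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def cluster_stations(series, threshold, metric="correlation"):
    """Group station time series by hierarchical clustering.

    series: array of shape (n_stations, n_times).
    metric: "correlation" gives 1 - R (Pearson); "euclidean" gives Euclidean distance.
    threshold: dissimilarity level at which the dendrogram is cut.
    """
    dist = pdist(series, metric=metric)       # condensed pairwise dissimilarities
    tree = linkage(dist, method="average")    # linkage method is an assumption
    return fcluster(tree, t=threshold, criterion="distance")

# Synthetic stations: two share a daily cycle (different amplitude), one does not
t = np.arange(240)
rng = np.random.default_rng(0)
daily = np.sin(2 * np.pi * t / 24)
other = np.cos(2 * np.pi * t / 37)
stations = np.vstack([
    daily + 0.05 * rng.standard_normal(t.size),
    2.0 * daily + 0.05 * rng.standard_normal(t.size),
    other + 0.05 * rng.standard_normal(t.size),
])
labels = cluster_stations(stations, threshold=0.5)
```

With 1 - R the first two stations group together despite their different amplitudes, whereas the Euclidean metric is also sensitive to magnitude, which is one reason to inspect both.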
Network Analysis to Risk Stratify Patients With Exercise Intolerance.
Oldham, William M; Oliveira, Rudolf K F; Wang, Rui-Sheng; Opotowsky, Alexander R; Rubins, David M; Hainer, Jon; Wertheim, Bradley M; Alba, George A; Choudhary, Gaurav; Tornyos, Adrienn; MacRae, Calum A; Loscalzo, Joseph; Leopold, Jane A; Waxman, Aaron B; Olschewski, Horst; Kovacs, Gabor; Systrom, David M; Maron, Bradley A
2018-03-16
Current methods assessing clinical risk because of exercise intolerance in patients with cardiopulmonary disease rely on a small subset of traditional variables. Alternative strategies incorporating the spectrum of factors underlying prognosis in at-risk patients may be useful clinically, but are lacking. Use unbiased analyses to identify variables that correspond to clinical risk in patients with exercise intolerance. Data from 738 consecutive patients referred for invasive cardiopulmonary exercise testing at a single center (2011-2015) were analyzed retrospectively (derivation cohort). A correlation network of invasive cardiopulmonary exercise testing parameters was assembled using |r|>0.5. From an exercise network of 39 variables (ie, nodes) and 98 correlations (ie, edges) corresponding to P<9.5e-46 for each correlation, we focused on a subnetwork containing peak volume of oxygen consumption (pVO2) and 9 linked nodes. K-means clustering based on these 10 variables identified 4 novel patient clusters characterized by significant differences in 44 of 45 exercise measurements (P<0.01). Compared with a probabilistic model, including 23 independent predictors of pVO2 and pVO2 itself, the network model was less redundant and identified clusters that were more distinct. Cluster assignment from the network model was predictive of subsequent clinical events. For example, a 4.3-fold (P<0.0001; 95% CI, 2.2-8.1) and 2.8-fold (P=0.0018; 95% CI, 1.5-5.2) increase in hazard for age- and pVO2-adjusted all-cause 3-year hospitalization, respectively, were observed between the highest versus lowest risk clusters. Using these data, we developed the first risk-stratification calculator for patients with exercise intolerance. When applying the risk calculator to patients in 2 independent invasive cardiopulmonary exercise testing cohorts (Boston and Graz, Austria), we observed a clinical risk profile that paralleled the derivation cohort.
Network analyses were used to identify novel exercise groups and develop a point-of-care risk calculator. These data expand the range of useful clinical variables beyond pVO2 that predict hospitalization in patients with exercise intolerance. © 2018 American Heart Association, Inc.
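The network-assembly step above (keep correlations with |r| > 0.5, then focus on the nodes linked to one target variable) might look roughly like this. The data are synthetic and the variable names A, B and C are hypothetical; the subsequent k-means step is not shown.

```python
import numpy as np

def correlation_subnetwork(data, names, target, r_min=0.5):
    """Return edges with |r| > r_min and the nodes directly linked to `target`.

    data: array of shape (n_patients, n_variables); names: variable labels.
    """
    r = np.corrcoef(data, rowvar=False)   # variable-by-variable Pearson r
    n = len(names)
    edges = [(names[i], names[j], r[i, j])
             for i in range(n) for j in range(i + 1, n)
             if abs(r[i, j]) > r_min]
    linked = sorted({a if b == target else b
                     for a, b, _ in edges if target in (a, b)})
    return edges, linked

# Synthetic cohort: A and B track pVO2 (B inversely), C is unrelated noise
rng = np.random.default_rng(1)
x = rng.standard_normal(500)
data = np.column_stack([
    x,
    x + 0.3 * rng.standard_normal(500),
    -x + 0.3 * rng.standard_normal(500),
    rng.standard_normal(500),
])
names = ["pVO2", "A", "B", "C"]
edges, linked = correlation_subnetwork(data, names, target="pVO2")
```

Using the absolute value of r means inversely related variables still form an edge, consistent with the |r| threshold quoted in the abstract.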
Marques, Elisa A; Pizarro, Andreia N; Figueiredo, Pedro; Mota, Jorge; Santos, Maria P
2013-06-01
To analyze how modifiable health-related variables are clustered and associated with children's participation in play, active travel and structured exercise and sport among boys and girls. Data were collected from 9 middle-schools in Porto (Portugal) area. A total of 636 children in the 6th grade (340 girls and 296 boys) with a mean age of 11.64 years old participated in the study. Cluster analyses were used to identify patterns of lifestyle and healthy/unhealthy behaviors. Multinomial logistic regression analysis was used to estimate associations between cluster allocation, sedentary time and participation in three different physical activity (PA) contexts: play, active travel, and structured exercise/sport. Four distinct clusters were identified based on four lifestyle risk factors. The most disadvantaged cluster was characterized by high body mass index, low high-density lipoprotein cholesterol and cardiorespiratory fitness and a moderate level of moderate to vigorous PA. Everyday outdoor play (OR=1.85, 95%CI 0.318-0.915) and structured exercise/sport (OR=1.85, 95%CI 0.291-0.990) were associated with healthier lifestyle patterns. There were no significant associations between health patterns and sedentary time or travel mode. Outdoor play and sport/exercise participation seem more important than active travel from school in influencing children's healthy cluster profiles. Copyright © 2013 Elsevier Inc. All rights reserved.
The spatial clustering of obesity: does the built environment matter?
Huang, R; Moudon, A V; Cook, A J; Drewnowski, A
2015-12-01
Obesity rates in the USA show distinct geographical patterns. The present study used spatial cluster detection methods and individual-level data to locate obesity clusters and to analyse them in relation to the neighbourhood built environment. The 2008-2009 Seattle Obesity Study provided data on the self-reported height, weight, and sociodemographic characteristics of 1602 King County adults. Home addresses were geocoded. Clusters of high or low body mass index were identified using Anselin's Local Moran's I and a spatial scan statistic with regression models that searched for unmeasured neighbourhood-level factors from residuals, adjusting for measured individual-level covariates. Spatially continuous values of objectively measured features of the local neighbourhood built environment (SmartMaps) were constructed for seven variables obtained from tax rolls and commercial databases. Both the Local Moran's I and a spatial scan statistic identified similar spatial concentrations of obesity. High and low obesity clusters were attenuated after adjusting for age, gender, race, education and income, and they disappeared once neighbourhood residential property values and residential density were included in the model. Using individual-level data to detect obesity clusters with two cluster detection methods, the present study showed that the spatial concentration of obesity was wholly explained by neighbourhood composition and socioeconomic characteristics. These characteristics may serve to more precisely locate obesity prevention and intervention programmes. © 2014 The British Dietetic Association Ltd.
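For readers unfamiliar with the Local Moran's I statistic named above, one common formulation can be sketched as below. The body mass index values and the simple chain of neighbours are hypothetical, the weights are row-standardised by assumption, and no significance testing (for example, by permutation) is included.

```python
import numpy as np

def local_morans_i(values, weights):
    """Local Moran's I for each site: I_i = (z_i / m2) * sum_j w_ij * z_j.

    values: observations per site; weights: spatial weights matrix.
    Assumes every site has at least one neighbour.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum(axis=1, keepdims=True)   # row-standardise the weights
    z = x - x.mean()                       # deviations from the overall mean
    m2 = (z ** 2).sum() / x.size           # second moment of the deviations
    return z * (w @ z) / m2

# Hypothetical BMI values at six sites along a line; neighbours are adjacent sites
bmi = np.array([32.0, 31.0, 30.0, 20.0, 21.0, 19.0])
adjacency = np.zeros((6, 6))
for i in range(5):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
local_i = local_morans_i(bmi, adjacency)
```

Positive values mark sites surrounded by similar values (high-high or low-low), which is how candidate high and low clusters are flagged before any test of significance.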
Cognitive Clusters in Specific Learning Disorder.
Poletti, Michele; Carretta, Elisa; Bonvicini, Laura; Giorgi-Rossi, Paolo
The heterogeneity among children with learning disabilities still represents a barrier and a challenge in their conceptualization. Although a dimensional approach has been gaining support, the categorical approach is still the most adopted, as in the recent fifth edition of the Diagnostic and Statistical Manual of Mental Disorders. The introduction of the single overarching diagnostic category of specific learning disorder (SLD) could underemphasize interindividual clinical differences regarding intracategory cognitive functioning and learning proficiency, according to current models of multiple cognitive deficits at the basis of neurodevelopmental disorders. The characterization of specific cognitive profiles associated with an already manifest SLD could help identify possible early cognitive markers of SLD risk and distinct trajectories of atypical cognitive development leading to SLD. In this perspective, we applied a cluster analysis to identify groups of children with a Diagnostic and Statistical Manual-based diagnosis of SLD with similar cognitive profiles and to describe the association between clusters and SLD subtypes. A sample of 205 children with a diagnosis of SLD were enrolled. Cluster analyses (agglomerative hierarchical and nonhierarchical iterative clustering technique) were used successively on 10 core subtests of the Wechsler Intelligence Scale for Children-Fourth Edition. The 4-cluster solution was adopted, and external validation found differences in terms of SLD subtype frequencies and learning proficiency among clusters. Clinical implications of these findings are discussed, tracing directions for further studies.
A new physical performance classification system for elite handball players: cluster analysis
Chirosa, Ignacio J.; Robinson, Joseph E.; van der Tillaar, Roland; Chirosa, Luis J.; Martín, Isidoro Martínez
2016-01-01
The aim of the present study was to identify different cluster groups of handball players according to their physical performance level assessed in a series of physical assessments, which could then be used to design a training program based on individual strengths and weaknesses, and to determine which of these variables best identified elite performance in a group of under-19 [U19] national level handball players. Players of the U19 National Handball team (n=16) performed a set of tests to determine: 10 m (ST10) and 20 m (ST20) sprint time, ball release velocity (BRv), countermovement jump (CMJ) height and squat jump (SJ) height. All players also performed an incremental-load bench press test to determine the 1 repetition maximum (1RMest), the load corresponding to maximum mean power (LoadMP), the mean propulsive phase power at LoadMP (PMPPMP) and the peak power at LoadMP (PPEAKMP). Cluster analyses of the test results generated four groupings of players. The variables best able to discriminate physical performance were BRv, ST20, 1RMest, PPEAKMP and PMPPMP. These variables could help coaches identify talent or monitor the physical performance of athletes in their team. Each cluster of players has a particular weakness related to physical performance and therefore, the cluster results can be applied to a specific training programme based on individual needs. PMID:28149376
NASA Astrophysics Data System (ADS)
Barrett, Samuel; Tjallingii, Rik; Bloemsma, Menno; Brauer, Achim; Starnberger, Reinhard; Spötl, Christoph; Dulski, Peter
2015-04-01
The outcrop at Baumkirchen (Austria) encloses part of a unique sequence of laminated lacustrine sediments deposited during the last glacial cycle. A ~250m long composite sediment record recovered at this location now continuously covers the periods ~33 to ~45 ka BP (MIS 3) and ~59 to ~73 ka BP (MIS 4), which are separated by a hiatus. The well-laminated (mm-cm scale) and almost entirely clastic sediments reveal alternations of clayey silt and medium silt to very-fine sand layers. Although radiocarbon and optically stimulated luminescence (OSL) dating provide a robust chronology, accurate dating of the sediment laminations appears to be problematic due to very high sedimentation rates (3-8 cm/yr). X-ray fluorescence (XRF) core scanning provided a detailed ~150m long record of compositional changes of the sediments at Baumkirchen. Changes in the sediments are subtle and classification into different facies based on individual elements is therefore subjective. We applied a statistically robust clustering analysis to provide an objective compositional classification without prior knowledge, based on XRF measurements for 15 analysed elements (all those with an acceptable signal-noise ratio: Zr, Sr, Ca, Mn, Cu, Zn, Rb, Ni, Fe, K, Cr, V, Si, Ba, Ti). The clustering analysis indicates a distinct compositional change between sediments deposited below and above the stratigraphic hiatus, but also differentiates between individual laminae. Preliminary results suggest variations in the sequence are largely controlled by the relative occurrence of different kinds of sediment represented by different clusters. Three clusters identify well-laminated sediments, visually similar in appearance, each dominated by an anti-correlation between Ca and one or more of the detrital elements K, Zr, Ti, Si and Fe.
Two of these clusters occur throughout the entire sequence, one frequently and the other restricted to short sections, while the third occurs almost exclusively below the hiatus, indicating a geochemically distinct component that possibly represents a specific sediment source. In a similar manner, three other clusters identify event layers with different compositions of which two occur exclusively above the hiatus and one exclusively below. The variations in the occurrence of these clusters revealing distinct event layers suggest variations in dominant sediment source both above and below the hiatus and within the section above it. More detailed comparisons between compositional variations of the individual clusters obtained from biplots and microscopic observations on thin sections, grain-size analyses, and mineralogical analyses are needed to further differentiate between sediment sources and transport mechanisms.
DMINDA: an integrated web server for DNA motif identification and analyses.
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-07-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rosychuk, Rhonda J; Mariathas, Hensley H; Graham, Michelle M; Holroyd, Brian R; Rowe, Brian H
2015-08-01
Atrial fibrillation and flutter (AFF) are the most common arrhythmias seen in the outpatient setting, and they affect more than 300,000 adult Canadians. The aims of this study were to examine temporal and geographic trends in emergency department (ED) presentations made by adults (age ≥ 35 years) for AFF in Alberta, Canada, from 1999 to 2011. Statistical disease cluster detection techniques were used to identify geographic areas with higher numbers of individuals presenting with AFF and higher numbers of ED presentations for AFF than expected by chance alone. Geographic clusters of individuals with stroke or heart failure follow-up within 365 days of ED presentations for AFF were also identified. All ED presentations for AFF made by individuals aged ≥35 years were extracted from Alberta's Ambulatory Care Classification System. The Alberta Health Care Insurance Plan provided population counts and demographics for the patients presenting (age, sex, year, geographic unit). The Physician Claims File provided non-ED physician claims data after a patient's ED presentation. Statistical analyses included numerical and graphical summaries, directly standardized rates, and statistical disease cluster detection tests. During 12 years, there were 63,395 ED presentations for AFF made by 32,101 individuals. Standardized rates remained relatively stable over time, at about two per 1,000 for individuals presenting to the ED for AFF and about three per 1,000 for ED presentations for AFF. The northern and southeastern parts of the province were identified as clusters of individuals presenting for AFF, and ED presentations for AFF, and several of the areas demonstrated clusters in multiple years. Further, several of the geographic clusters were also identified as potential clusters for stroke or heart failure within 365 days after the ED presentations for AFF. 
This population-based study spanned 12 fiscal years and showed variations in the number of people presenting to EDs for AFF and the number of ED presentations for AFF over geography. The potential clusters identified may represent geographic areas with higher disease severity or a lower availability of non-ED health services. The clusters are not all likely to have occurred by chance, and further investigation and intervention could occur to reduce ED presentations for AFF. © 2015 by the Society for Academic Emergency Medicine.
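The directly standardized rates mentioned in the abstract above can be illustrated with a minimal sketch. The function name, the three age strata, and all counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the Alberta data.

```python
def directly_standardized_rate(cases, population, standard_population, per=1000):
    """Direct standardization: weight each stratum-specific rate by the
    share of that stratum in a chosen standard population."""
    total_standard = sum(standard_population)
    weighted_sum = sum(
        (c / p) * s
        for c, p, s in zip(cases, population, standard_population)
    )
    return per * weighted_sum / total_standard


# Hypothetical age strata (e.g. 35-54, 55-74, 75+); all numbers are made up.
cases = [10, 30, 60]
population = [10000, 10000, 5000]
standard_population = [5000, 3000, 2000]

rate = directly_standardized_rate(cases, population, standard_population)
print(round(rate, 2))  # standardized rate per 1,000 population
```

Because the same standard population is applied to every year or region, rates computed this way can be compared without differences in age structure driving the result.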
Ecological characteristics of Simulium breeding sites in West Africa.
Cheke, Robert A; Young, Stephen; Garms, Rolf
2017-03-01
Twenty-nine taxa of Simulium were identified amongst 527 collections of larvae and pupae from untreated rivers and streams in Liberia (362 collections in 1967-71 & 1989), Togo (125 in 1979-81), Benin (35 in 1979-81) and Ghana (5 in 1980-81). Presence or absence of associations between different taxa were used to group them into six clusters using Ward agglomerative hierarchical cluster analysis. Environmental data associated with the pre-imaginal habitats were then analysed in relation to the six clusters by one way ANOVA. The results revealed significant effects in determining the clusters of maximum river width (all P<0.001 unless stated otherwise), water temperature, dry bulb air temperature, relative humidity, altitude, type of water (on a range from trickle to large river), water level, slope, current, vegetation, light conditions, discharge, length of breeding area, environs, terrain, river bed type (P<0.01), and the supports to which the insects were attached (P<0.01). When four non-significant contributors (wet bulb temperature, river features, height of waterfall and depth) were excluded and the reduced data-set analysed by principal components analysis (PCA), the first two principal components (PCs) accounted for 87% of the variance, with geographical features dominant in PC1 and hydrological characteristics in PC2. The analyses also revealed the ecological characteristics of each taxon's pre-imaginal habitats, which are discussed with particular reference to members of the Simulium damnosum species complex, whose breeding site distributions were further analysed by canonical correspondence analysis (CCA), a method also applied to the data on non-vector species. Copyright © 2016 Elsevier B.V. All rights reserved.
Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method.
Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels
2014-07-01
The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous studies. We also compared pairwise distances (between geographically separated samples) with those obtained using the AMOVA method and found good agreement. Further analyses that are impossible with AMOVA were made using the discrete Laplace method: analysis of the homogeneity in two different ways and calculating marginal STR distributions. We found that the Y-STR haplotypes from e.g. Finland were relatively homogeneous as opposed to the relatively heterogeneous Y-STR haplotypes from e.g. Lublin, Eastern Poland and Berlin, Germany. We demonstrated that the observed distributions of alleles at each locus were similar to the expected ones. We also compared pairwise distances between geographically separated samples from Africa with those obtained using the AMOVA method and found good agreement. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Crowe, Michael L; LoPilato, Alexander C; Campbell, W Keith; Miller, Joshua D
2016-12-01
The present study hypothesized that there exist two distinct groups of entitled individuals: grandiose-entitled, and vulnerable-entitled. Self-report scores of entitlement were collected for 916 individuals using an online platform. Model-based cluster analyses were conducted on the individuals with scores one standard deviation above mean (n = 159) using the five-factor model dimensions as clustering variables. The results support the existence of two groups of entitled individuals categorized as emotionally stable and emotionally vulnerable. The emotionally stable cluster reported emotional stability, high self-esteem, more positive affect, and antisocial behavior. The emotionally vulnerable cluster reported low self-esteem and high levels of neuroticism, disinhibition, conventionality, psychopathy, negative affect, childhood abuse, intrusive parenting, and attachment difficulties. Compared to the control group, both clusters reported being more antagonistic, extraverted, Machiavellian, and narcissistic. These results suggest important differences are missed when simply examining the linear relationships between entitlement and various aspects of its nomological network.
Akar, Servet; Solmaz, Dilek; Kasifoglu, Timucin; Bilge, Sule Yasar; Sari, Ismail; Gumus, Zeynep Zehra; Tunca, Mehmet
2016-02-01
The aim of this study was to evaluate whether there are clinical subgroups that may have different prognoses among FMF patients. The cumulative clinical features of a large group of FMF patients [1168 patients, 593 (50.8%) male, mean age 35.3 years (s.d. 12.4)] were studied. To analyse our data and identify groups of FMF patients with similar clinical characteristics, a two-step cluster analysis using log-likelihood distance measures was performed. For clustering the FMF patients, we evaluated the following variables: gender, current age, age at symptom onset, age at diagnosis, presence of major clinical features, variables related with therapy and family history for FMF, renal failure and carriage of M694V. Three distinct groups of FMF patients were identified. Cluster 1 was characterized by a high prevalence of arthritis, pleuritis, erysipelas-like erythema (ELE) and febrile myalgia. The dosage of colchicine and the frequency of amyloidosis were lower in cluster 1. Patients in cluster 2 had an earlier age of disease onset and diagnosis. M694V carriage and amyloidosis prevalence were the highest in cluster 2. This group of patients was using the highest dose of colchicine. Patients in cluster 3 had the lowest prevalence of arthritis, ELE and febrile myalgia. The frequencies of M694V carriage and amyloidosis were lower in cluster 3 than the overall FMF patients. Non-response to colchicine was also slightly lower in cluster 3. Patients with FMF can be clustered into distinct patterns of clinical and genetic manifestations and these patterns may have different prognostic significance. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Proteogenomics connects somatic mutations to signalling in breast cancer.
Mertins, Philipp; Mani, D R; Ruggles, Kelly V; Gillette, Michael A; Clauser, Karl R; Wang, Pei; Wang, Xianlong; Qiao, Jana W; Cao, Song; Petralia, Francesca; Kawaler, Emily; Mundt, Filip; Krug, Karsten; Tu, Zhidong; Lei, Jonathan T; Gatza, Michael L; Wilkerson, Matthew; Perou, Charles M; Yellapantula, Venkata; Huang, Kuan-lin; Lin, Chenwei; McLellan, Michael D; Yan, Ping; Davies, Sherri R; Townsend, R Reid; Skates, Steven J; Wang, Jing; Zhang, Bing; Kinsinger, Christopher R; Mesri, Mehdi; Rodriguez, Henry; Ding, Li; Paulovich, Amanda G; Fenyö, David; Ellis, Matthew J; Carr, Steven A
2016-06-02
Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures, connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Genetic Relatedness of North American Populations of Tomicus piniperda (Coleoptera: Scolytidae)
M. Carol Alosi Carter; Jacqueline L. Robertson; Robert A. Haack; Robert K. Lawrence; Jane L. Hayes
1996-01-01
We used DNA fingerprinting by random amplified polymorphic DNA (RAPD) and electrophoretic characterization of esterase isozymes to investigate the genetic relatedness of North American populations of the exotic bark beetle Tomicus piniperda (L.). Cluster analyses of genetic distances among populations identified the Illinois population as an outlier population with mean...
Scott, Barry; Young, Carolyn A.; Saikia, Sanjay; McMillan, Lisa K.; Monahan, Brendon J.; Koulman, Albert; Astin, Jonathan; Eaton, Carla J.; Bryant, Andrea; Wrenn, Ruth E.; Finch, Sarah C.; Tapper, Brian A.; Parker, Emily J.; Jameson, Geoffrey B.
2013-01-01
The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses were used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The relationship of indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B, can be found as two disparate clades, not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis. PMID:23949005
Scarpassa, Vera Margarete; Cunha-Machado, Antonio Saulo; Saraiva, José Ferreira
2016-04-12
Anopheles nuneztovari sensu lato comprises cryptic species in northern South America, and the Brazilian populations encompass distinct genetic lineages within the Brazilian Amazon region. This study investigated, based on two molecular markers, whether these lineages might actually deserve species status. Specimens were collected in five localities of the Brazilian Amazon, including Manaus, Careiro Castanho and Autazes, in the State of Amazonas; Tucuruí, in the State of Pará; and Abacate da Pedreira, in the State of Amapá, and analysed for the COI gene (Barcode region) and 12 microsatellite loci. Phylogenetic analyses were performed using the maximum likelihood (ML) approach. Intra and inter samples genetic diversity were estimated using population genetics analyses, and the genetic groups were identified by means of the ML, Bayesian and factorial correspondence analyses and the Bayesian analysis of population structure. The Barcode region dataset (N = 103) generated 27 haplotypes. The haplotype network suggested three lineages. The ML tree retrieved five monophyletic groups. Group I clustered all specimens from Manaus and Careiro Castanho, the majority of Autazes and a few from Abacate da Pedreira. Group II clustered most of the specimens from Abacate da Pedreira and a few from Autazes and Tucuruí. Group III clustered only specimens from Tucuruí (lineage III), strongly supported (97 %). Groups IV and V clustered specimens of A. nuneztovari s.s. and A. dunhami, strongly (98 %) and weakly (70 %) supported, respectively. In the second phylogenetic analysis, the sequences from GenBank, identified as A. goeldii, clustered to groups I and II, but not to group III. Genetic distances (Kimura-2 parameters) among the groups ranged from 1.60 % (between I and II) to 2.32 % (between I and III). Microsatellite data revealed very high intra-population genetic variability. 
Genetic distances showed the highest and significant values (P = 0.005) between Tucuruí and all the other samples, and between Abacate da Pedreira and all the other samples. Genetic distances, Bayesian (Structure and BAPS) analyses and FCA suggested three distinct biological groups, supporting the barcode region results. The two markers revealed three genetic lineages for A. nuneztovari s.l. in the Brazilian Amazon region. Lineages I and II may represent genetically distinct groups or species within A. goeldii. Lineage III may represent a new species, distinct from the A. goeldii group, and may be the most ancestral in the Brazilian Amazon. They may have differences in Plasmodium susceptibility and should therefore be investigated further.
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
2003-12-31
Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Wotton, Karl R; Shimeld, Sebastian M
2011-12-01
In the human genome, members of the FoxC, FoxF, FoxL1, and FoxQ1 gene families are found in two paralogous clusters. One cluster contains the genes FOXQ1, FOXF2, FOXC1 and the second consists of FOXF1, FOXC2, and FOXL1. In jawed vertebrates these genes are known to be expressed in different pharyngeal tissues and all, except FoxQ1, are involved in patterning the early embryonic mesoderm. We have previously traced the evolution of this cluster in the bony vertebrates, and the gene content is identical in the dogfish, a member of the most basally branching lineage of the jawed vertebrates. Here we extend these analyses to jawless vertebrates. Using genomic searches and molecular approaches we have identified homologues of these genes from lampreys. We identify two FoxC genes, two FoxF genes, two FoxQ1 genes and a single FoxL1 gene. We examine the embryonic expression of one predominantly mesodermally expressed gene family, FoxC, and the endodermally expressed member of the cluster, FoxQ1. We identified FoxQ1 transcripts in the pharyngeal endoderm, while the two FoxC genes are differentially expressed in the pharyngeal mesenchyme and ectoderm. Furthermore we identify conserved expression of lamprey FoxC genes in the paraxial and intermediate mesoderms. We interpret our results through a chordate-wide comparison of expression patterns and discuss gene content in the context of theories on the evolution of the vertebrate genome. © 2011 Elsevier B.V. All rights reserved.
Geospatial Characterization of Fluvial Wood Arrangement in a Semi-confined Alluvial River
NASA Astrophysics Data System (ADS)
Martin, D. J.; Harden, C. P.; Pavlowsky, R. T.
2014-12-01
Large woody debris (LWD) has become universally recognized as an integral component of fluvial systems, and as a result, has become increasingly common as a river restoration tool. However, "natural" processes of wood recruitment and the subsequent arrangement of LWD within the river network are poorly understood. This research used a suite of spatial statistics to investigate longitudinal arrangement patterns of LWD in a low-gradient, Midwestern river. First, a large-scale GPS inventory of LWD, performed on the Big River in the eastern Missouri Ozarks, resulted in over 4,000 logged positions of LWD along seven river segments that covered nearly 100 km of the 237 km river system. A global Moran's I analysis indicates that LWD density is spatially autocorrelated and displays a clustering tendency within all seven river segments (P-value range = 0.000 to 0.054). A local Moran's I analysis identified specific locations along the segments where clustering occurs and revealed that, on average, clusters of LWD density (high or low) spanned 400 m. Spectral analyses revealed that, in some segments, LWD density is spatially periodic. Two segments displayed strong periodicity, while the remaining segments displayed varying degrees of noisiness. Periodicity showed a positive association with gravel bar spacing and meander wavelength, although there were insufficient data to statistically confirm the relationship. A wavelet analysis was then performed to investigate periodicity relative to location along the segment. The wavelet analysis identified significant (α = 0.05) periodicity at discrete locations along each of the segments. Those reaches yielding strong periodicity showed stronger relationships between LWD density and the geomorphic/riparian independent variables tested. Analyses consistently identified valley width and sinuosity as being associated with LWD density. 
The results of these analyses contribute a new perspective on the longitudinal distribution of LWD in a river system, which should help identify physical and/or riparian control mechanisms of LWD arrangement and support the development of models of LWD arrangement. Additionally, the spatial statistical tools presented here have proven valuable for identifying longitudinal patterns in river system components.
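As a rough illustration of the global Moran's I statistic used in the abstract above, the following sketch computes it for a one-dimensional sequence of reaches along a river segment. It assumes binary weights in which each reach neighbours only the reaches immediately upstream and downstream; that weighting scheme, the function name, and the sample densities are assumptions for illustration, not the authors' configuration.

```python
def morans_i_1d(values):
    """Global Moran's I for values ordered along a line, using binary
    adjacency weights (w_ij = 1 when |i - j| == 1, otherwise 0)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    # Each adjacent pair appears twice in the double sum: (i, j) and (j, i).
    cross_products = 2.0 * sum(z[i] * z[i + 1] for i in range(n - 1))
    weight_total = 2.0 * (n - 1)
    variance_sum = sum(d * d for d in z)
    return (n / weight_total) * cross_products / variance_sum


# Hypothetical wood densities per reach: runs of high and low values.
clustered = [9, 8, 9, 1, 0, 1, 8, 9, 9, 0, 1, 0]
# Hypothetical densities that alternate perfectly between high and low.
alternating = [9, 0, 9, 0, 9, 0, 9, 0, 9, 0, 9, 0]

print(morans_i_1d(clustered) > 0)     # similar values sit next to each other
print(morans_i_1d(alternating) < 0)   # neighbours are systematically unlike
```

Positive values indicate that neighbouring reaches tend to carry similar densities (clustering), while negative values indicate that neighbours tend to differ. Judging significance would additionally require a permutation or analytical test, which this sketch omits.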
Sequence variation and phylogenetic analysis of envelope glycoprotein of hepatitis G virus.
Lim, M Y; Fry, K; Yun, A; Chong, S; Linnen, J; Fung, K; Kim, J P
1997-11-01
A transfusion-transmissible agent provisionally designated hepatitis G virus (HGV) was recently identified. In this study, we examined the variability of the HGV genome by analysing sequences in the putative envelope region from 72 isolates obtained from diverse geographical sources. The 1561 nucleotide sequence of the E1/E2/NS2a region of HGV was determined from 12 isolates, and compared with three published sequences. The most variability was observed in 400 nucleotides at the N terminus of E2. We next analysed this 400 nucleotide envelope variable region (EV) from an additional 60 HGV isolates. This sequence varied considerably among the 75 isolates, with overall identity ranging from 79.3% to 99.5% at the nucleotide level, and from 83.5% to 100% at the amino acid level. However, hypervariable regions were not identified. Phylogenetic analyses indicated that the 75 HGV isolates belong to a single genotype. A single-tier distribution of evolutionary distances was observed among the 15 E1/E2/NS2a sequences and the 75 EV sequences. In contrast, 11 isolates of HCV were analysed and showed a three-tiered distribution, representing genotypes, subtypes, and isolates. The 75 isolates of HGV fell into four clusters on the phylogenetic tree. Tight geographical clustering was observed among the HGV isolates from Japan and Korea.
Interpersonal Pathoplasticity in Individuals with Generalized Anxiety Disorder
Przeworski, Amy; Newman, Michelle G.; Pincus, Aaron L.; Kasoff, Michele B.; Yamasaki, Alissa S.; Castonguay, Louis G.; Berlin, Kristoffer S.
2011-01-01
Recent theories of Generalized Anxiety Disorder (GAD) have emphasized interpersonal and personality functioning as important aspects of the disorder. The current paper examines heterogeneity in interpersonal problems in two studies of individuals with GAD (n = 47 and n = 83). Interpersonal subtypes were assessed using the Inventory of Interpersonal Problems (IIP-C; Alden, Wiggins, & Pincus, 1990). Across both studies, individuals with GAD exhibited heterogeneous interpersonal problems, and cluster analyses of these patients' interpersonal characteristics yielded four replicable clusters identified as intrusive, exploitable, cold, and nonassertive subtypes. Consistent with our pathoplasticity hypotheses, clusters did not differ in GAD severity, anxiety severity, depression severity. Clusters in study two differed on rates of personality disorders, including avoidant personality disorder, further providing support for the validity of interpersonal subtypes. The presence of interpersonal subtypes in GAD may have important implications for treatment planning and efficacy. PMID:21553942
2011-01-01
Background: Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. Methods: We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity) we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. Results: More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7). The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified.
Conclusions: The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients should be investigated by future studies. Our findings have particular relevance for falls prevention strategies, clinical practice and planning of follow-up services for these patients. PMID:21851627
Cluster Analysis on Longitudinal Data of Patients with Adult-Onset Asthma.
Ilmarinen, Pinja; Tuomisto, Leena E; Niemelä, Onni; Tommola, Minna; Haanpää, Jussi; Kankaanranta, Hannu
Previous cluster analyses on asthma are based on cross-sectional data. The objective was to identify phenotypes of adult-onset asthma by using data from baseline (diagnostic) and 12-year follow-up visits. The Seinäjoki Adult Asthma Study is a 12-year follow-up study of patients with new-onset adult asthma. K-means cluster analysis was performed by using variables from baseline and follow-up visits on 171 patients to identify phenotypes. Five clusters were identified. Patients in cluster 1 (n = 38) were predominantly nonatopic males with moderate smoking history at baseline. At follow-up, 40% of these patients had developed persistent obstruction but the number of patients with uncontrolled asthma (5%) and rhinitis (10%) was the lowest. Cluster 2 (n = 19) was characterized by older men with heavy smoking history, poor lung function, and persistent obstruction at baseline. At follow-up, these patients were mostly uncontrolled (84%) despite daily use of inhaled corticosteroid (ICS) with add-on therapy. Cluster 3 (n = 50) consisted mostly of nonsmoking females with good lung function at diagnosis/follow-up and well-controlled/partially controlled asthma at follow-up. Cluster 4 (n = 25) had obese and symptomatic patients at baseline/follow-up. At follow-up, these patients had several comorbidities (40% psychiatric disease) and were treated daily with ICS and add-on therapy. Patients in cluster 5 (n = 39) were mostly atopic and had the earliest onset of asthma, the highest blood eosinophils, and FEV1 reversibility at diagnosis. At follow-up, these patients used the lowest ICS dose but 56% were well controlled. Results can be used to predict outcomes of patients with adult-onset asthma and to aid in development of personalized therapy (NCT02733016 at ClinicalTrials.gov). Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
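The K-means step named in the abstract above can be sketched in a few lines. This is a generic version of Lloyd's algorithm, not the study's actual pipeline: the two features, the six data points, and the deterministic initialisation from the first k points are all assumptions made so that the example is reproducible.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm. Initial centroids are the first k points,
    which is an illustrative choice, not a recommended default."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical standardized features per patient (two made-up variables).
patients = [(0.1, 0.2), (5.0, 5.1), (0.0, 0.3), (5.2, 4.9), (0.2, 0.1), (4.8, 5.0)]
labels, centroids = kmeans(patients, k=2)
print(labels)
```

In practice the variables would usually be put on a common scale first, and the number of clusters and the starting centroids would be chosen with more care (for example by repeated random restarts), since K-means results depend on both.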
Vu, Trang; Finch, Caroline F; Day, Lesley
2011-08-18
Community-dwelling older people aged 65+ years sustain falls frequently; these can result in physical injuries necessitating medical attention including emergency department care and hospitalisation. Certain health conditions and impairments have been shown to contribute independently to the risk of falling or experiencing a fall injury, suggesting that individuals with these conditions or impairments should be the focus of falls prevention. Since older people commonly have multiple conditions/impairments, knowledge about which conditions/impairments coexist in at-risk individuals would be valuable in the implementation of a targeted prevention approach. The objective of this study was therefore to examine the prevalence and patterns of comorbidity in this population group. We analysed hospitalisation data from Victoria, Australia's second most populous state, to estimate the prevalence of comorbidity in patients hospitalised at least once between 2005-6 and 2007-8 for treatment of acute fall-related injuries. In patients with two or more comorbid conditions (multicomorbidity) we used an agglomerative hierarchical clustering method to cluster comorbidity variables and identify constellations of conditions. More than one in four patients had at least one comorbid condition and among patients with comorbidity one in three had multicomorbidity (range 2-7). The prevalence of comorbidity varied by gender, age group, ethnicity and injury type; it was also associated with a significant increase in the average cumulative length of stay per patient. The cluster analysis identified five distinct, biologically plausible clusters of comorbidity: cardiopulmonary/metabolic, neurological, sensory, stroke and cancer. The cardiopulmonary/metabolic cluster was the largest cluster among the clusters identified. The consequences of comorbidity clustering in terms of falls and/or injury outcomes of hospitalised patients should be investigated by future studies. 
Our findings have particular relevance for falls prevention strategies, clinical practice and planning of follow-up services for these patients.
Psychiatrist-patient verbal and nonverbal communications during split-treatment appointments.
Cruz, Mario; Roter, Debra; Cruz, Robyn Flaum; Wieland, Melissa; Cooper, Lisa A; Larson, Susan; Pincus, Harold Alan
2011-11-01
This study characterized psychiatrist and patient communication behaviors and affective voice tones during pharmacotherapy appointments with depressed patients at four community-based mental health clinics where psychiatrists provided medication management and other mental health professionals provided therapy ("split treatment"). Audiorecordings of 84 unique pairs of psychiatrists and patients with a depressive disorder were analyzed with the Roter Interaction Analysis System, which identifies 41 discrete speech categories that can be grouped into composites representing broad conceptual communication domains. Cluster analysis identified psychiatrist communication patterns. T test and chi square analyses compared the clusters for verbal dominance, affective voice tone, and characteristics of psychiatrist and patients. On average, 53% of psychiatrist talk was devoted to partnering and relationship building, and 67% of patient talk was about biomedical subjects, such as depression symptoms, and psychosocial information giving. Psychiatrist communication patterns were characterized by two clusters, a biomedical-centered cluster that emphasized biomedical questions (η²=.22, df=82, p<.001) and education or counseling (η²=.20, df=82, p<.001) and a patient-centered cluster focused on psychosocial and lifestyle questions (η²=.24, df=82, p<.001) and information giving (η²=.17, df=82, p<.001). The patient-centered cluster was associated with patients' expression of distress, anger, or other negative affects (t=3.22, df= 82, p=.002). Psychiatrists devoted much of their talk to partnering and relationship building while maintaining a focus on symptoms or psychosocial issues. However, patient behaviors did not reflect a similar level of partnering. Future studies should identify psychiatrist communication behaviors that activate collaborative patient communications or improve treatment outcomes.
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sayak; Saha, Rohini; Palanisamy, Anbarasi; Ghosh, Madhurima; Biswas, Anupriya; Roy, Saheli; Pal, Arijit; Sarkar, Kathakali; Bagh, Sangram
2016-05-01
Microgravity is a prominent health hazard for astronauts, yet we understand little about its effect at the molecular systems level. In this study, we have integrated a set of systems-biology tools and databases and have analysed more than 8000 molecular pathways on published global gene expression datasets of human cells in microgravity. Hundreds of new pathways have been identified with statistical confidence for each dataset and, despite the difference in cell types and experiments, around 100 of the new pathways appeared in common across the datasets. They are related to reduced inflammation, autoimmunity, diabetes and asthma. We have identified downregulation of the NfκB pathway via Notch1 signalling as a new pathway for reduced immunity in microgravity. Induction of a few cancer types, including liver cancer and leukaemia, and increased drug response to cancer in microgravity are also found. An increase in olfactory signal transduction is also identified. Genes are clustered based on their expression pattern, and mathematically stable clusters are identified. The network mapping of genes within a cluster indicates the plausible functional connections in microgravity. This pipeline gives a new systems-level picture of human cells under microgravity, generates testable hypotheses and may help in estimating risk and developing medicine for space missions.
Fens, Niki; van Rossum, Annelot G J; Zanen, Pieter; van Ginneken, Bram; van Klaveren, Rob J; Zwinderman, Aeilko H; Sterk, Peter J
2013-06-01
Classification of COPD is currently based on the presence and severity of airways obstruction. However, this may not fully reflect the phenotypic heterogeneity of COPD in the (ex-) smoking community. We hypothesized that factor analysis followed by cluster analysis of functional, clinical, radiological and exhaled breath metabolomic features identifies subphenotypes of COPD in a community-based population of heavy (ex-) smokers. Adults between 50-75 years with a smoking history of at least 15 pack-years derived from a random population-based survey as part of the NELSON study underwent detailed assessment of pulmonary function, chest CT scanning, questionnaires and exhaled breath molecular profiling using an electronic nose. Factor and cluster analyses were performed on the subgroup of subjects fulfilling the GOLD criteria for COPD (post-BD FEV1/FVC < 0.70). Three hundred subjects were recruited, of which 157 fulfilled the criteria for COPD and were included in the factor and cluster analysis. Four clusters were identified: cluster 1 (n = 35; 22%): mild COPD, limited symptoms and good quality of life. Cluster 2 (n = 48; 31%): low lung function, combined emphysema and chronic bronchitis and a distinct breath molecular profile. Cluster 3 (n = 60; 38%): emphysema predominant COPD with preserved lung function. Cluster 4 (n = 14; 9%): highly symptomatic COPD with mildly impaired lung function. In a leave-one-out validation analysis an accuracy of 97.4% was reached. This unbiased taxonomy for mild to moderate COPD reinforces clusters found in previous studies and thereby allows better phenotyping of COPD in the general (ex-) smoking population.
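The cluster sizes and percentages reported in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable names are illustrative.

```python
# Cluster sizes as reported in the abstract (COPD subgroup, n = 157).
cluster_sizes = {1: 35, 2: 48, 3: 60, 4: 14}

total = sum(cluster_sizes.values())

# Share of each cluster, rounded to the nearest whole percent.
shares = {k: round(100 * n / total) for k, n in cluster_sizes.items()}

print(total)   # number of subjects across the four clusters
print(shares)  # percentage share per cluster
```

The sizes add up to the 157 subjects who met the COPD criteria, and the rounded shares match the reported 22%, 31%, 38% and 9%.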
Schulz, Marcus; Neumann, Daniel; Fleet, David M; Matthies, Michael
2013-12-01
During the last decades, marine pollution with anthropogenic litter has become a worldwide major environmental concern. Standardized monitoring of litter since 2001 on 78 beaches selected within the framework of the Convention for the Protection of the Marine Environment of the North-East Atlantic (OSPAR) has been used to identify temporal trends of marine litter. Based on statistical analyses of this dataset a two-part multi-criteria evaluation system for beach litter pollution of the North-East Atlantic and the North Sea is proposed. Canonical correlation analyses, linear regression analyses, and non-parametric analyses of variance were used to identify different temporal trends. A classification of beaches was derived from cluster analyses and served to define different states of beach quality according to abundances of 17 input variables. The evaluation system is easily applicable and relies on the above-mentioned classification and on significant temporal trends implied by significant rank correlations. Copyright © 2013 Elsevier Ltd. All rights reserved.
McGuire, Anthony W; Eastwood, Jo-Ann; Hays, Ron D; Macabasco-O'Connell, Aurelia; Doering, Lynn V
2014-03-01
Assessing depression in patients hospitalized with coronary heart disease is clinically challenging because depressive symptoms are often confounded by poor somatic health. To identify symptom clusters associated with clinical depression in patients hospitalized with coronary heart disease. Secondary analyses of 3 similar data sets for hospitalized patients with coronary heart disease who had diagnostic screening for depression (99 depressed, 224 not depressed) were done. Depressive symptoms were assessed by using the Hamilton Depression Rating Scale or the Beck Depression Inventory. Hierarchical cluster analysis was performed on 11 symptom variables: anhedonia, dysphoria, loss of appetite, sleep disturbance, fatigue, guilt, suicidal symptoms, hypochondriasis, loss of libido, psychomotor impairment, and nervous irritability. Associations between symptom clusters and presence or absence of clinical depression were estimated by using logistic regression. Fatigue (69%) and sleep disturbance (55%) were the most prevalent symptoms. Guilt (25%) and suicidal symptoms (9%) were the least common. Three symptom clusters (cognitive/affective, somatic/affective, and somatic) were identified. Compared with patients without cognitive/affective symptoms, patients with the cognitive/affective symptom cluster (anhedonia, dysphoria, guilt, suicidal symptoms, nervous irritability) had an odds ratio of 1.41 (P<.001; 95% CI, 1.223-1.631) for clinical depression. Clinicians should be alert for clinical depression in hospitalized patients with coronary heart disease who have the cognitive/affective symptom cluster.
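Confidence intervals for odds ratios from logistic regression are typically symmetric on the log scale, so the reported odds ratio should sit near the geometric mean of its limits. The sketch below checks this for the figures quoted in the abstract above. It assumes a Wald-type 95% interval (multiplier 1.96), which the abstract does not state, and the function name `log_scale_summary` is illustrative.

```python
import math


def log_scale_summary(odds_ratio, lower, upper, z_crit=1.96):
    """Return (geometric midpoint, SE of log OR, z statistic).

    Assumes a Wald-type interval: log(OR) +/- z_crit * SE.
    """
    midpoint = math.sqrt(lower * upper)
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z_stat = math.log(odds_ratio) / se_log_or
    return midpoint, se_log_or, z_stat


# Values quoted in the abstract: OR 1.41, 95% CI 1.223-1.631.
midpoint, se_log_or, z_stat = log_scale_summary(1.41, 1.223, 1.631)
print(round(midpoint, 2), round(se_log_or, 3), round(z_stat, 1))
```

Under that assumption the geometric midpoint agrees with the reported odds ratio to two decimals, and the implied z statistic is large enough to be consistent with the reported P<.001.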
Jaimes-Bautista, A G; Rodríguez-Camacho, M; Martínez-Juárez, I E; Rodríguez-Agudelo, Y
2017-08-29
Patients with temporal lobe epilepsy (TLE) perform poorly on semantic verbal fluency (SVF) tasks. Completing these tasks successfully involves multiple cognitive processes simultaneously. Therefore, quantitative analysis of SVF (number of correct words in one minute), conducted in most studies, has been found to be insufficient to identify cognitive dysfunction underlying SVF difficulties in TLE. To determine whether a sample of patients with TLE had SVF difficulties compared with a control group (CG), and to identify the cognitive components associated with SVF difficulties using quantitative and qualitative analysis. SVF was evaluated in 25 patients with TLE and 24 healthy controls; the semantic verbal fluency test included 5 semantic categories: animals, fruits, occupations, countries, and verbs. All 5 categories were analysed quantitatively (number of correct words per minute and interval of execution: 0-15, 16-30, 31-45, and 46-60 seconds); the categories animals and fruits were also analysed qualitatively (clusters, cluster size, switches, perseverations, and intrusions). Patients generated fewer words for all categories and intervals and fewer clusters and switches for animals and fruits than the CG (P<.01). Differences between groups were not significant in terms of cluster size and number of intrusions and perseverations (P>.05). Our results suggest an association between SVF difficulties in TLE and difficulty activating semantic networks, impaired strategic search, and poor cognitive flexibility. Attention, inhibition, and working memory are preserved in these patients. Copyright © 2017 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses.
Roux, Simon; Brum, Jennifer R; Dutilh, Bas E; Sunagawa, Shinichi; Duhaime, Melissa B; Loy, Alexander; Poulos, Bonnie T; Solonenko, Natalie; Lara, Elena; Poulain, Julie; Pesant, Stéphane; Kandels-Lewis, Stefanie; Dimier, Céline; Picheral, Marc; Searson, Sarah; Cruaud, Corinne; Alberti, Adriana; Duarte, Carlos M; Gasol, Josep M; Vaqué, Dolors; Bork, Peer; Acinas, Silvia G; Wincker, Patrick; Sullivan, Matthew B
2016-09-29
Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Owing to challenges with the sampling and cultivation of viruses, genome-level viral diversity remains poorly described and grossly understudied, with less than 1% of observed surface-ocean viruses known. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting 'global ocean virome' dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts. A total of 15,222 epipelagic and mesopelagic viral populations were identified, comprising 867 viral clusters (defined as approximately genus-level groups). This roughly triples the number of known ocean viral populations and doubles the number of candidate bacterial and archaeal virus genera, providing a near-complete sampling of epipelagic communities at both the population and viral-cluster level. We found that 38 of the 867 viral clusters were locally or globally abundant, together accounting for nearly half of the viral populations in any global ocean virome sample. While two-thirds of these clusters represent newly described viruses lacking any cultivated representative, most could be computationally linked to dominant, ecologically relevant microbial hosts. Moreover, we identified 243 viral-encoded auxiliary metabolic genes, of which only 95 were previously known. Deeper analyses of four of these auxiliary metabolic genes (dsrC, soxYZ, P-II (also known as glnB) and amoC) revealed that abundant viruses may directly manipulate sulfur and nitrogen cycling throughout the epipelagic ocean. 
This viral catalog and functional analyses provide a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.
Patterns of Gender Equality at Workplaces and Psychological Distress
Bolin, Malin; Hammarström, Anne
2013-01-01
Research in the field of occupational health often uses a risk factor approach which has been criticized by feminist researchers for not considering the combination of many different variables that are at play simultaneously. To overcome this shortcoming this study aims to identify patterns of gender equality at workplaces and to investigate how these patterns are associated with psychological distress. Questionnaire data from the Northern Swedish Cohort (n = 715) have been analysed and supplemented with register data about the participants' workplaces. The register data were used to create gender equality indicators of women/men ratios of number of employees, educational level, salary and parental leave. Cluster analysis was used to identify patterns of gender equality at the workplaces. Differences in psychological distress between the clusters were analysed by chi-square test and logistic regression analyses, adjusting for individual socio-demographics and previous psychological distress. The cluster analysis resulted in six distinctive clusters with different patterns of gender equality at the workplaces that were associated with psychological distress for women but not for men. For women the highest odds of psychological distress were found at traditionally gender-unequal workplaces. The lowest overall occurrence of psychological distress, as well as the same occurrence for women and men, was found at the most gender-equal workplaces. The results from this study support the convergence hypothesis, as gender equality at the workplace relates not only to better mental health for women but also to a more similar occurrence of mental ill-health between women and men. This study highlights the importance of utilizing a multidimensional view of gender equality to understand its association with health outcomes.
Health policies need to consider gender equality at the workplace level as a social determinant of health that is of importance for reducing differences in health outcomes for women and men. PMID:23326404
Spatial Analysis of HIV Positive Injection Drug Users in San Francisco, 1987 to 2005
Martinez, Alexis N.; Mobley, Lee R.; Lorvick, Jennifer; Novak, Scott P.; Lopez, Andrea M.; Kral, Alex H.
2014-01-01
Spatial analyses of HIV/AIDS related outcomes are growing in popularity as a tool to understand geographic changes in the epidemic and inform the effectiveness of community-based prevention and treatment programs. The Urban Health Study was a serial, cross-sectional epidemiological study of injection drug users (IDUs) in San Francisco between 1987 and 2005 (N = 29,914). HIV testing was conducted for every participant. Participant residence was geocoded to the level of the United States Census tract for every observation in the dataset. Local indicator of spatial autocorrelation (LISA) tests were used to identify univariate and bivariate Census tract clusters of HIV positive IDUs in two time periods. We further compared three tract level characteristics (% poverty, % African Americans, and % unemployment) across areas of clustered and non-clustered tracts. We identified significant spatial clustering of high numbers of HIV positive IDUs in the early period (1987–1995) and late period (1996–2005). We found significant bivariate clusters of Census tracts where HIV positive IDUs and tract level poverty were above average compared to the surrounding areas. Our data suggest that poverty, rather than race, was an important neighborhood characteristic associated with the spatial distribution of HIV in SF and its spatial diffusion over time. PMID:24722543
Mashruwala, Ameya A.; Pang, Yun Y.; Rosario-Cruz, Zuelay; Chahal, Harsimranjit K.; Benson, Meredith A.; Anzaldi-Mike, Laura L.; Skaar, Eric P.; Torres, Victor J.; Nauseef, William M.; Boyd, Jeffrey M.
2015-01-01
The acquisition and metabolism of iron (Fe) by the human pathogen Staphylococcus aureus is critical for disease progression. S. aureus requires Fe to synthesize inorganic cofactors called iron-sulfur (Fe-S) clusters, which are required for functional Fe-S proteins. In this study we investigated the mechanisms utilized by S. aureus to metabolize Fe-S clusters. We identified that S. aureus utilizes the Suf biosynthetic system to synthesize Fe-S clusters and we provide genetic evidence suggesting that the sufU and sufB gene products are essential. Additional biochemical and genetic analyses identified Nfu as an Fe-S cluster carrier, which aids in the maturation of Fe-S proteins. We find that deletion of the nfu gene negatively impacts staphylococcal physiology and pathogenicity. An nfu mutant accumulates both increased intracellular non-incorporated Fe and endogenous reactive oxygen species (ROS), resulting in DNA damage. In addition, a strain lacking Nfu is sensitive to exogenously supplied ROS and reactive nitrogen species. Congruous with ex vivo findings, an nfu mutant strain is more susceptible to oxidative killing by human polymorphonuclear leukocytes and displays decreased tissue colonization in a murine model of infection. We conclude that Nfu is necessary for staphylococcal pathogenesis and establish Fe-S cluster metabolism as an attractive antimicrobial target. PMID:25388433
Clustering of food and activity preferences in primary school children.
Rodenburg, Gerda; Oenema, Anke; Pasma, Marleen; Kremers, Stef P J; van de Mheen, Dike
2013-01-01
This study examined clustering of food and activity preferences in Dutch primary school children. It also explored whether the preference clusters are associated with child and parental background characteristics and with parenting practices. Data were used from 1480 parent-child dyads participating in the IVO Nutrition and Physical Activity Child cohort (INPACT). Children aged 8-11 years reported their preferences for food (e.g. fruit and sweet snacks) and activities (e.g. biking and watching television) at school with a newly-developed, visual instrument designed for primary school children. Parents completed a questionnaire at home. Principal component analysis was used to identify preference clusters. Backward regression analyses were used to examine the relationship between child and parental characteristics with cluster scores. We found (1) a clustering of preferences for unhealthy foods and unhealthy drinks, (2) a clustering of preferences for various physical activity behaviours, and (3) a clustering of preferences for unhealthy drinks and sedentary behaviour. Boys had a higher cluster score than girls on all three preference clusters. In addition, physical activity-related parenting practices were negatively related to unhealthy preference clusters and positively to the physical-activity-preference cluster. The next step is to relate our preference clusters to child dietary and activity behaviours, with special attention to gender differences. This may help in the development of interventions aimed at improving children's food and activity preferences. Copyright © 2012 Elsevier Ltd. All rights reserved.
A dual role for a polyketide synthase in dynemicin enediyne and anthraquinone biosynthesis
NASA Astrophysics Data System (ADS)
Cohen, Douglas R.; Townsend, Craig A.
2018-02-01
Dynemicin A is a member of a subfamily of enediyne antitumour antibiotics characterized by a 10-membered carbocycle fused to an anthraquinone, both of polyketide origin. Sequencing of the dynemicin biosynthetic gene cluster in Micromonospora chersina previously identified an enediyne polyketide synthase (PKS), but no anthraquinone PKS, suggesting gene(s) for biosynthesis of the latter were distant from the core dynemicin cluster. To identify these gene(s), we sequenced and analysed the genome of M. chersina. Sequencing produced a short list of putative PKS candidates, yet CRISPR-Cas9 mutants of each locus retained dynemicin production. Subsequently, deletion of two cytochromes P450 in the dynemicin cluster suggested that the dynemicin enediyne PKS, DynE8, may biosynthesize the anthraquinone. Together with 18O-labelling studies, we now present evidence that DynE8 produces the core scaffolds of both the enediyne and anthraquinone, and provide a working model to account for their formation from the programmed octaketide of the enediyne PKS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pettengill, Emily A.; Pettengill, James B.; Binet, Rachel
2016-01-19
As a leading cause of bacterial dysentery, Shigella represents a significant threat to public health and food safety. Related, but often overlooked, enteroinvasive Escherichia coli (EIEC) can also cause dysentery. Current typing methods have limited ability to identify and differentiate between these pathogens despite the need for rapid and accurate identification of pathogens for clinical treatment and outbreak response. We present a comprehensive phylogeny of Shigella and EIEC using whole genome sequencing of 169 samples, constituting unparalleled strain diversity, and observe a lack of monophyly between Shigella and EIEC and among Shigella taxonomic groups. The evolutionary relationships in the phylogeny are supported by analyses of population structure and hierarchical clustering patterns of translated gene homolog abundance. Lastly, we identified a panel of 404 single nucleotide polymorphism (SNP) markers specific to each phylogenetic cluster for more accurate identification of Shigella and EIEC. Our findings show that Shigella and EIEC are not distinct evolutionary groups within the E. coli genus and, thus, EIEC as a group is not the ancestor to Shigella. The multiple analyses presented provide evidence for reconsidering the taxonomic placement of Shigella. The SNP markers offer more discriminatory power to molecular epidemiological typing methods involving these bacterial pathogens.
Su, Junhu; Ji, Weihong; Wei, Yanming; Zhang, Yanping; Gleeson, Dianne M; Lou, Zhongyu; Ren, Jing
2014-08-01
The endangered schizothoracine fish Gymnodiptychus pachycheilus is endemic to the Qinghai-Tibetan Plateau (QTP), but very little genetic information is available for this species. Here, we assessed the current genetic divergence of G. pachycheilus populations to evaluate their distributions modulated by contemporary and historical processes. Population structure and demographic history were assessed by analyzing 1811 base pairs of mitochondrial DNA from 61 individuals across a large proportion of its geographic range. Our results revealed low nucleotide diversity, suggesting severe historical bottleneck events. Analyses of molecular variance and the conventional population statistic FST (0.0435, P = 0.0215) confirmed weak genetic structure. The monophyly of G. pachycheilus was statistically well-supported, while two divergent evolutionary clusters were identified by phylogenetic analyses, suggesting a microgeographic population structure. The consistent scenario of recent population expansion of two clusters was identified based on several complementary analyses of demographic history (0.096 Ma and 0.15 Ma). This genetic divergence and evolutionary process are likely to have resulted from a series of drainage arrangements triggered by the historical tectonic events of the region. The results obtained here provide the first insights into the evolutionary history and genetic status of this little-known fish.
Using GIS and coarse-scale, publicly-available data, the GLEI project has defined the landscape character of areas draining to 762 shoreline segments - the entire US portion of the Great Lakes basin. Using principal components and clustering analyses to discriminate among the se...
ERIC Educational Resources Information Center
Wilson, Anna C.; Lengua, Liliana J.; Tininenko, Jennifer; Taylor, Adam; Trancik, Anika
2009-01-01
This longitudinal study utilized a community sample of children (N = 91, 45% female, 8-11 years at time 1) to investigate physiological responses (heart rate reactivity [HRR] and electrodermal responding [EDR]) during delay of gratification in relation to emotionality, self-regulation, and adjustment problems. Cluster analyses identified three…
Optimization-Based Model Fitting for Latent Class and Latent Profile Analyses
ERIC Educational Resources Information Center
Huang, Guan-Hua; Wang, Su-Mei; Hsu, Chung-Chu
2011-01-01
Statisticians typically estimate the parameters of latent class and latent profile models using the Expectation-Maximization algorithm. This paper proposes an alternative two-stage approach to model fitting. The first stage uses the modified k-means and hierarchical clustering algorithms to identify the latent classes that best satisfy the…
Identifying contextual influences of community reintegration among injured servicemembers.
Hawkins, Brent L; McGuire, Francis A; Britt, Thomas W; Linder, Sandra M
2015-01-01
Research suggests that community reintegration (CR) after injury and rehabilitation is difficult for many injured servicemembers. However, little is known about the influence of the contextual factors, both personal and environmental, that influence CR. Framed within the International Classification of Functioning, Disability and Health and Social Cognitive Theory, the quantitative portion of a larger mixed-methods study of 51 injured, community-dwelling servicemembers compared the relative contribution of contextual factors between groups of servicemembers with different levels of CR. Cluster analysis indicated three groups of servicemembers showing low, moderate, and high levels of CR. Statistical analyses identified contextual factors (e.g., personal and environmental factors) that significantly discriminated between CR clusters. Multivariate analysis of variance and discriminant analysis indicated significant contributions of general self-efficacy, services and assistance barriers, physical and structural barriers, attitudes and support barriers, perceived level of disability and/or handicap, work and school barriers, and policy barriers on CR scores. Overall, analyses indicated that injured servicemembers with lower CR scores had lower general self-efficacy scores, reported more difficulty with environmental barriers, and reported their injuries as more disabling.
Sizing the star cluster population of the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Piatti, Andrés E.
2018-04-01
The number of star clusters that populate the Large Magellanic Cloud (LMC) at deprojected distances <4 deg has been recently found to be nearly double the known size of the system. Because of the unprecedented consequences of this outcome in our knowledge of the LMC cluster formation and dissolution histories, we closely revisited such a compilation of objects and found that only ∼35 per cent of the previously known catalogued clusters have been included. The remaining entries are likely related to stellar overdensities of the LMC composite star field, because there is a remarkable enhancement of objects with assigned ages older than log(t yr⁻¹) ∼ 9.4, which contrasts with the existence of the LMC cluster age gap; the assumption of a cluster formation rate similar to that of the LMC star field does not help to conciliate so large amount of clusters either; and nearly 50 per cent of them come from cluster search procedures known to produce more than 90 per cent of false detections. The lack of further analyses to confirm the physical reality as genuine star clusters of the identified overdensities also glooms those results. We support that the actual size of the LMC main body cluster population is close to that previously known.
Patterns of breast cancer mortality trends in Europe.
Amaro, Joana; Severo, Milton; Vilela, Sofia; Fonseca, Sérgio; Fontes, Filipa; La Vecchia, Carlo; Lunet, Nuno
2013-06-01
To identify patterns of variation in breast cancer mortality in Europe (1980-2010), using a model-based approach. Mortality data were obtained from the World Health Organization database and mixed models were used to describe the time trends in the age-standardized mortality rates (ASMR). Model-based clustering was used to identify clusters of countries with homogeneous variation in ASMR. Three patterns were identified. Patterns 1 and 2 are characterized by stable or slightly increasing trends in ASMR in the first half of the period analysed, and a clear decline is observed thereafter; in pattern 1 the median of the ASMR is higher, and the highest rates were achieved sooner. Pattern 3 is characterised by a rapid increase in mortality until 1999, declining slowly thereafter. This study provides a general model for the description and interpretation of the variation in breast cancer mortality in Europe, based in three main patterns. Copyright © 2013 Elsevier Ltd. All rights reserved.
Integrating Gene Transcription-Based Biomarkers to Understand Desert Tortoise and Ecosystem Health.
Bowen, Lizabeth; Miles, A Keith; Drake, K Kristina; Waters, Shannon C; Esque, Todd C; Nussear, Kenneth E
2015-09-01
Tortoises are susceptible to a wide variety of environmental stressors, and the influence of human disturbances on health and survival of tortoises is difficult to detect. As an addition to current diagnostic methods for desert tortoises, we have developed the first leukocyte gene transcription biomarker panel for the desert tortoise (Gopherus agassizii), enhancing the ability to identify specific environmental conditions potentially linked to declining animal health. Blood leukocyte transcript profiles have the potential to identify physiologically stressed animals in lieu of clinical signs. For desert tortoises, the gene transcript profile included a combination of immune or detoxification response genes with the potential to be modified by biological or physical injury and consequently provide information on the type and magnitude of stressors present in the animal's habitat. Blood from 64 wild adult tortoises at three sites in Clark County, NV, and San Bernardino, CA, and from 19 captive tortoises in Clark County, NV, was collected and evaluated for genes indicative of physiological status. Statistical analysis using a priori groupings indicated significant differences among groups for several genes, while multidimensional scaling and cluster analyses of transcription CT values indicated strong differentiation of a large cluster and multiple outlying individual tortoises or small clusters in multidimensional space. These analyses highlight the effectiveness of the gene panel at detecting environmental perturbations as well as providing guidance in determining the health of the desert tortoise.
Integrating gene transcription-based biomarkers to understand desert tortoise and ecosystem health
Bowen, Lizabeth; Miles, A. Keith; Drake, Karla K.; Waters, Shannon C.; Esque, Todd C.; Nussear, Kenneth E.
2015-01-01
Tortoises are susceptible to a wide variety of environmental stressors, and the influence of human disturbances on health and survival of tortoises is difficult to detect. As an addition to current diagnostic methods for desert tortoises, we have developed the first leukocyte gene transcription biomarker panel for the desert tortoise (Gopherus agassizii), enhancing the ability to identify specific environmental conditions potentially linked to declining animal health. Blood leukocyte transcript profiles have the potential to identify physiologically stressed animals in lieu of clinical signs. For desert tortoises, the gene transcript profile included a combination of immune or detoxification response genes with the potential to be modified by biological or physical injury and consequently provide information on the type and magnitude of stressors present in the animal’s habitat. Blood from 64 wild adult tortoises at three sites in Clark County, NV, and San Bernardino, CA, and from 19 captive tortoises in Clark County, NV, was collected and evaluated for genes indicative of physiological status. Statistical analysis using a priori groupings indicated significant differences among groups for several genes, while multidimensional scaling and cluster analyses of transcription CT values indicated strong differentiation of a large cluster and multiple outlying individual tortoises or small clusters in multidimensional space. These analyses highlight the effectiveness of the gene panel at detecting environmental perturbations as well as providing guidance in determining the health of the desert tortoise.
Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R
2012-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS.
Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process in AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators, levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l. habitats based on spatiotemporal field-sampled count data.
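As an illustration of the Durbin-Watson diagnostic mentioned in the abstract above, the following is a minimal sketch of the standard statistic, the sum of squared successive residual differences divided by the sum of squared residuals. It is not the authors' SAS AUTOREG workflow; the function name `durbin_watson` and the toy residual series are assumptions made for this example only. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for an ordered sequence of residuals.

    DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2
    (hypothetical helper for illustration; not taken from the study).
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # Denominator: residual sum of squares.
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero; statistic is undefined")
    return numerator / denominator


# Toy residual series (invented values, not data from the study).
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every step
persistent = [1.0, 1.0, -1.0, -1.0]    # runs of the same sign
print(durbin_watson(alternating))
print(durbin_watson(persistent))
```

The alternating series yields a value above 2 and the persistent series a value below 2, which is the direction of departure the test is designed to flag.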
Russian consumers' motives for food choice.
Honkanen, Pirjo; Frewer, Lynn
2009-04-01
Knowledge about food choice motives which have potential to influence consumer consumption decisions is important when designing food and health policies, as well as marketing strategies. Russian consumers' food choice motives were studied in a survey (1081 respondents across four cities), with the purpose of identifying consumer segments based on these motives. These segments were then profiled using consumption, attitudinal and demographic variables. Face-to-face interviews were used to sample the data, which were analysed with two-step cluster analysis (SPSS). Three clusters emerged, representing 21.5%, 45.8% and 32.7% of the sample. The clusters were similar in terms of the order of motivations, but differed in motivational level. Sensory factors and availability were the most important motives for food choice in all three clusters, followed by price. This may reflect the turbulence which Russia has recently experienced politically and economically. Cluster profiles differed in relation to socio-demographic factors, consumption patterns and attitudes towards health and healthy food.
Li, Xin; Kroin, Jeffrey S; Kc, Ranjan; Gibson, Gary; Chen, Di; Corbett, Grant T; Pahan, Kalipada; Fayyaz, Sana; Kim, Jae-Sung; van Wijnen, Andre J; Suh, Joon; Kim, Su-Gwan; Im, Hee-Jeong
2013-12-01
The objective of this study was to examine whether altered expression of microRNAs in central nervous system components is pathologically linked to chronic knee joint pain in osteoarthritis. A surgical animal model for knee joint OA was generated by medial meniscus transection in rats followed by behavioral pain tests. Relationships between pathological changes in knee joint and development of chronic joint pain were examined by histology and imaging analyses. Alterations in microRNAs associated with OA-evoked pain sensation were determined in bilateral lumbar dorsal root ganglia (DRG) and the spinal dorsal horn by microRNA array followed by individual microRNA analyses. Gain- and loss-of-function studies of selected microRNAs (miR-146a and miR-183 cluster) were conducted to identify target pain mediators regulated by these selective microRNAs in glial cells. The ipsilateral hind leg displayed significantly increased hyperalgesia after 4 weeks of surgery, and sensitivity was sustained for the remainder of the 8-week experimental period (F = 341, p < 0.001). The development of OA-induced chronic pain was correlated with pathological changes in the knee joints as assessed by histological and imaging analyses. MicroRNA analyses showed that miR-146a and the miR-183 cluster were markedly reduced in the sensory neurons in DRG (L4/L5) and spinal cord from animals experiencing knee joint OA pain. The downregulation of miR-146a and/or the miR-183 cluster in the central compartments (DRG and spinal cord) is closely associated with the upregulation of inflammatory pain mediators. The corroboration between decreases in these signature microRNAs and their specific target pain mediators was further confirmed by gain- and loss-of-function analyses in glia, the major cellular component of the central nervous system (CNS).
MicroRNA therapy using miR-146a and the miR-183 cluster could be a powerful therapeutic intervention for OA in alleviating joint pain and concomitantly regenerating peripheral knee joint cartilage. © 2013 American Society for Bone and Mineral Research.
NASA Astrophysics Data System (ADS)
Cusick, K. D.; Dale, J.; Little, B.; Cockrell, A.; Biffinger, J.
2016-02-01
Alteromonas macleodii is a ubiquitous marine bacterium that clusters by molecular analyses into two ecotypes: surface and deep-water. Our group isolated a marine bacterium from copper coupons that generates nanoparticles (NPs) at elevated copper concentrations. Sequencing of the 16S rRNA gene identified it as an A. macleodii strain. In phylogenetic analyses based on the gyrB gene, it clustered with other surface isolates; however, it formed a unique cluster separate from that of other surface isolates based on rpoB gene sequences. Copper is commonly employed as an antifouling agent on the hulls of ships, and so copper tolerance and NP generation is under investigation in this strain. The overall goals of this study were: (1) to determine if copper tolerance is the result of changes at the genetic or transcriptional level and (2) to identify the genes involved in NP formation. Sub-cultures were established from the initial isolate in which copper concentrations were increased in .25 mM increments through multiple generations. These sub-cultures were assayed for NP formation in seawater medium supplemented with 3-4 mM copper. Scanning electron microscopy revealed large aggregates of NPs on the exterior surface of all sub-cultures. Additionally, a portion of the cells in all sub-cultures displayed an elongated morphology in comparison to the wild-type. No NPs were observed in wild-type controls grown without the addition of increased copper. Metagenomic sequencing of natural populations of A. macleodii revealed extreme divergence in several large genomic regions whose content includes genes coding for exopolysaccharide production and metal resistance. High-throughput sequencing is being used to determine whether copper tolerance and NP generation is the result of genetic or transcriptional changes. These results will be extended to natural communities to gain insights into the role of bacterial NPs during conditions of elevated metal concentrations in coastal systems.
Franklyn-Miller, A; Richter, C; King, E; Gore, S; Moran, K; Strike, S; Falvey, E C
2017-01-01
Background Athletic groin pain (AGP) is prevalent in sports involving repeated accelerations, decelerations, kicking and change-of-direction movements. Clinical and radiological examinations lack the ability to assess pathomechanics of AGP, but three-dimensional biomechanical movement analysis may be an important innovation. Aim The primary aim was to describe and analyse movements used by patients with AGP during a maximum effort change-of-direction task. The secondary aim was to determine if specific anatomical diagnoses were related to a distinct movement strategy. Methods 322 athletes with a current symptom of chronic AGP participated. Structured and standardised clinical assessments and radiological examinations were performed on all participants. Additionally, each participant performed multiple repetitions of a planned maximum effort change-of-direction task during which whole body kinematics were recorded. Kinematic and kinetic data were examined using continuous waveform analysis techniques in combination with a subgroup design that used gap statistic and hierarchical clustering. Results Three subgroups (clusters) were identified. Kinematic and kinetic measures of the clusters differed strongly in patterns observed in thorax, pelvis, hip, knee and ankle. Cluster 1 (40%) was characterised by increased ankle eversion, external rotation and knee internal rotation and greater knee work. Cluster 2 (15%) was characterised by increased hip flexion, pelvis contralateral drop, thorax tilt and increased hip work. Cluster 3 (45%) was characterised by high ankle dorsiflexion, thorax contralateral drop, ankle work and prolonged ground contact time. No correlation was observed between movement clusters and clinically palpated location of the participant's pain. 
Conclusions We identified three distinct movement strategies among athletes with long-standing groin pain during a maximum effort change-of-direction task. These movement strategies were not related to clinical assessment findings but highlighted targets for rehabilitation in response to possible propagative mechanisms. Trial registration number NCT02437942, pre results. PMID:28209597
Syphilis Networks in Louisiana: An Analysis of Network Configuration and Disease Transmission
NASA Astrophysics Data System (ADS)
Desmarais, Catherine Theresa
Background: In 2009, Louisiana had the highest rate of primary and secondary syphilis in the country. Recent partner notification approaches have been insufficient in addressing Louisiana's deeply entrenched areas of syphilis infection. Prior researchers have suggested that surveillance systems may benefit from utilizing social and spatial network analysis in syphilis control efforts. Objective: To expand the understanding of the spread of syphilis in Louisiana, and to add new tools to the state's case finding resources through the description of the characteristics of cases of early syphilis and their partners in Louisiana, the socio-sexual networks of these cases, and the geospatial clustering of cases and partners. Methods: Utilizing state surveillance data, all cases of primary, secondary, and early latent syphilis that were diagnosed in 2009 and data on their sexual or needle sharing partners were analyzed using a combination of descriptive, network, and geospatial measures. Results: In 2009, Louisiana experienced a high rate of heterosexual syphilis transmission. Within syphilis transmission networks, 50.8% of all cases were female and 84.2% of all cases were black. The average and median ages of males with reactive syphilis tests were higher than that of females in Louisiana, and in 88.9% of regions, older individuals were more likely to have a syphilis test than no test. A greater proportion of males (11.4%) refused to discuss partners than females (7.4%) and a greater proportion of males (5.5%) refused testing and prophylactic treatment than females (2.8%). No distinct patterns were seen in disease prevalence between regions based upon demographic data. Classic summary network measures such as density, degree, centrality, and betweenness provided little information on similarities and differences between the different regions in Louisiana. All measures indicated low density and extreme fragmentation of networks in Louisiana. 
The majority of network structures were dendritic in nature. A total of 121 cases did not report any partners and an additional 15.9% reported only being involved in dyads. Several large connected components were also observed in syphilis transmission networks in Louisiana. The average age of persons in these large components was greater than in the regional network. A total of 27.3% of male partnerships were with other males in Louisiana, and 4.0% of female pairings were with other females. Blacks practiced assortative mixing, with 93.1% of contacts with a reported race/ethnicity also being black, while 58.9% of contacts reported by whites were also white. Visualization of networks at the regional level illustrated different patterns, with some regions having large disconnected networks while others had more highly connected components. Visualization also uncovered a high concentration of males that had sex with males and females within Region 2 that were not detected during descriptive analyses. Graphs also highlighted highly connected persons within networks that did not have reactive syphilis tests. Upon geocoding the addresses of network members, it was found that most persons lived adjacent to major highways and in major urban areas. Cluster analyses detected a large number of geographic clusters of study subjects throughout the state. Several different patterns were identified: regions with many clusters in a small geographic area, regions with many clusters over a wide geographic area, regions with few clusters, and regions with no small clusters. Most regions had geographically small clusters of a size that could benefit from targeted interventions. Conclusion: This study provides a more in-depth understanding of syphilis spread in the state of Louisiana and demonstrates the feasibility of using network and geospatial methods in future state surveillance and prevention activities.
By tying these two approaches together with the addition of basic demographic information, regional patterns can be identified to improve syphilis prevention practices. In areas with disconnected networks but close geographic clustering, community and street level testing has the potential to reduce morbidity more than partner elicitation, which has resulted in highly fragmented networks. The combination of these analyses also identified subpopulations in need of special messaging or intervention, such as sex workers and males that have sex with both males and females, and identified persons within the networks that would not typically be targeted for cluster interviews but, due to network position, may be helpful in finding additional morbidity in the region. These analyses are feasible to be carried out in an ongoing manner at the state level using current data collection processes and have the potential to inform syphilis elimination activities in each region and within the state as a whole. The addition of risk behaviors, HIV status, venues where partners are met, and the use of the internet and apps in finding partners to future analyses will further improve these disease elimination approaches in Louisiana.
Application of diffusion maps to identify human factors of self-reported anomalies in aviation.
Andrzejczak, Chris; Karwowski, Waldemar; Mikusinski, Piotr
2012-01-01
A study investigating what factors are present leading to pilots submitting voluntary anomaly reports regarding their flight performance was conducted. Diffusion Maps (DM) were selected as the method of choice for performing dimensionality reduction on text records for this study. Diffusion Maps have seen successful use in other domains such as image classification and pattern recognition. High-dimensionality data in the form of narrative text reports from the NASA Aviation Safety Reporting System (ASRS) were clustered and categorized by way of dimensionality reduction. Supervised analyses were performed to create a baseline document clustering system. Dimensionality reduction techniques identified concepts or keywords within records, and allowed the creation of a framework for an unsupervised document classification system. Results from the unsupervised clustering algorithm performed similarly to the supervised methods outlined in the study. The dimensionality reduction was performed on 100 of the most commonly occurring words within 126,000 text records describing commercial aviation incidents. This study demonstrates that unsupervised machine clustering and organization of incident reports is possible based on unbiased inputs. Findings from this study reinforced traditional views on what factors contribute to civil aviation anomalies, however, new associations between previously unrelated factors and conditions were also found.
Spatial analysis of malaria in Anhui province, China
Zhang, Wenyi; Wang, Liping; Fang, Liqun; Ma, Jiaqi; Xu, Youfu; Jiang, Jiafu; Hui, Fengming; Wang, Jianjun; Liang, Song; Yang, Hong; Cao, Wuchun
2008-01-01
Background Malaria has re-emerged in Anhui Province, China, and this province was the most seriously affected by malaria during 2005–2006. It is necessary to understand the spatial distribution of malaria cases and to identify highly endemic areas for future public health planning and resource allocation in Anhui Province. Methods The annual average incidence at the county level was calculated using malaria cases reported between 2000 and 2006 in Anhui Province. GIS-based spatial analyses were conducted to detect spatial distribution and clustering of malaria incidence at the county level. Results The spatial distribution of malaria cases in Anhui Province from 2000 to 2006 was mapped at the county level to show crude incidence, excess hazard and spatial smoothed incidence. Spatial cluster analysis suggested 10 and 24 counties were at increased risk for malaria (P < 0.001) with the maximum spatial cluster sizes at < 50% and < 25% of the total population, respectively. Conclusion The application of GIS, together with spatial statistical techniques, provides a means to quantify explicit malaria risks and to further identify environmental factors responsible for the re-emerged malaria risks. Future public health planning and resource allocation in Anhui Province should be focused on the maximum spatial cluster region. PMID:18847489
Izquierdo, Javier A; Sizova, Maria V; Lynd, Lee R
2010-06-01
The enrichment from nature of novel microbial communities with high cellulolytic activity is useful in the identification of novel organisms and novel functions that enhance the fundamental understanding of microbial cellulose degradation. In this work we identify predominant organisms in three cellulolytic enrichment cultures with thermophilic compost as an inoculum. Community structure based on 16S rRNA gene clone libraries featured extensive representation of clostridia from cluster III, with minor representation of clostridial clusters I and XIV and a novel Lutispora species cluster. Our studies reveal different levels of 16S rRNA gene diversity, ranging from 3 to 18 operational taxonomic units (OTUs), as well as variability in community membership across the three enrichment cultures. By comparison, glycosyl hydrolase family 48 (GHF48) diversity analyses revealed a narrower breadth of novel clostridial genes associated with cultured and uncultured cellulose degraders. The novel GHF48 genes identified in this study were related to the novel clostridia Clostridium straminisolvens and Clostridium clariflavum, with one cluster sharing as little as 73% sequence similarity with the closest known relative. In all, 14 new GHF48 gene sequences were added to the known diversity of 35 genes from cultured species.
Papaleo, Elena; Mereghetti, Paolo; Fantucci, Piercarlo; Grandori, Rita; De Gioia, Luca
2009-01-01
Several molecular dynamics (MD) simulations were used to sample conformations in the neighborhood of the native structure of holo-myoglobin (holo-Mb), collecting trajectories spanning 0.22 μs at 300 K. Principal component (PCA) and free-energy landscape (FEL) analyses, integrated by cluster analysis, which was performed considering the position and structures of the individual helices of the globin fold, were carried out. The coherence between the different structural clusters and the basins of the FEL, together with the convergence of parameters derived by PCA indicates that an accurate description of the Mb conformational space around the native state was achieved by multiple MD trajectories spanning at least 0.14 μs. The integration of FEL, PCA, and structural clustering was shown to be a very useful approach to gain an overall view of the conformational landscape accessible to a protein and to identify representative protein substates. This method could be also used to investigate the conformational and dynamical properties of Mb apo-, mutant, or delete versions, in which greater conformational variability is expected and, therefore identification of representative substates from the simulations is relevant to disclose structure-function relationship.
Geographic atrophy phenotype identification by cluster analysis.
Monés, Jordi; Biarnés, Marc
2018-03-01
To identify ocular phenotypes in patients with geographic atrophy secondary to age-related macular degeneration (GA) using a data-driven cluster analysis. This was a retrospective analysis of data from a prospective, natural history study of patients with GA who were followed for ≥6 months. Cluster analysis was used to identify subgroups within the population based on the presence of several phenotypic features: soft drusen, reticular pseudodrusen (RPD), primary foveal atrophy, increased fundus autofluorescence (FAF), greyish FAF appearance and subfoveal choroidal thickness (SFCT). A comparison of features between the subgroups was conducted, and a qualitative description of the new phenotypes was proposed. The atrophy growth rate between phenotypes was then compared. Data were analysed from 77 eyes of 77 patients with GA. Cluster analysis identified three groups: phenotype 1 was characterised by high soft drusen load, foveal atrophy and slow growth; phenotype 3 showed high RPD load, extrafoveal and greyish FAF appearance and thin SFCT; the characteristics of phenotype 2 were midway between phenotypes 1 and 3. Phenotypes differed in all measured features (p≤0.013), with decreases in the presence of soft drusen, foveal atrophy and SFCT seen from phenotypes 1 to 3 and corresponding increases in high RPD load, high FAF and greyish FAF appearance. Atrophy growth rate differed between phenotypes 1, 2 and 3 (0.63, 1.91 and 1.73 mm²/year, respectively, p=0.0005). Cluster analysis identified three distinct phenotypes in GA. One of them showed a particularly slow growth pattern. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Hyde, J M; Cerezo, A; Williams, T J
2009-04-01
Statistical analysis of atom probe data has improved dramatically in the last decade and it is now possible to determine the size, the number density and the composition of individual clusters or precipitates such as those formed in reactor pressure vessel (RPV) steels during irradiation. However, the characterisation of the onset of clustering or co-segregation is more difficult and has traditionally focused on the use of composition frequency distributions (for detecting clustering) and contingency tables (for detecting co-segregation). In this work, the authors investigate the possibility of directly examining the neighbourhood of each individual solute atom as a means of identifying the onset of solute clustering and/or co-segregation. The methodology involves comparing the mean observed composition around a particular type of solute with that expected from the overall composition of the material. The methodology has been applied to atom probe data obtained from several irradiated RPV steels. The results show that the new approach is more sensitive to fine scale clustering and co-segregation than that achievable using composition frequency distribution and contingency table analyses.
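The neighbourhood comparison described in this abstract can be illustrated with a minimal sketch. It assumes a fixed search radius and a toy list of atoms; the function name `neighbourhood_enrichment`, the radius, and the data are hypothetical and are not taken from the paper, which does not specify how the neighbourhood is defined.

```python
from math import dist

def neighbourhood_enrichment(atoms, centre_species, target_species, radius):
    """Compare the mean local fraction of target_species around each
    centre_species atom with the bulk fraction of target_species.

    atoms: list of (species, (x, y, z)) tuples (toy data, an assumption).
    Returns (mean_local_fraction, bulk_fraction).
    """
    bulk = sum(1 for s, _ in atoms if s == target_species) / len(atoms)
    local_fractions = []
    for i, (species_i, pos_i) in enumerate(atoms):
        if species_i != centre_species:
            continue
        # Species of all neighbours within `radius`, excluding the centre atom.
        neighbours = [s for j, (s, p) in enumerate(atoms)
                      if j != i and dist(pos_i, p) <= radius]
        if neighbours:
            hits = sum(1 for s in neighbours if s == target_species)
            local_fractions.append(hits / len(neighbours))
    if not local_fractions:
        return 0.0, bulk
    return sum(local_fractions) / len(local_fractions), bulk

# Toy example: three Cu atoms packed together in a dilute Fe matrix.
atoms = [("Cu", (0.0, 0.0, 0.0)), ("Cu", (0.2, 0.0, 0.0)), ("Cu", (0.0, 0.2, 0.0)),
         ("Fe", (5.0, 0.0, 0.0)), ("Fe", (5.0, 0.2, 0.0)), ("Fe", (5.0, 5.0, 0.0)),
         ("Fe", (0.0, 5.0, 0.0)), ("Fe", (5.2, 5.0, 0.0)), ("Fe", (0.2, 5.0, 0.0)),
         ("Fe", (2.5, 2.5, 0.0))]
local, bulk = neighbourhood_enrichment(atoms, "Cu", "Cu", radius=0.5)
print(local, bulk)
```

A local fraction well above the bulk fraction suggests clustering (same species as centre and target) or co-segregation (different species); judging significance would need a comparison against randomised data, which this sketch omits.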
Occurrence of different Canine distemper virus lineages in Italian dogs.
Balboni, Andrea; De Lorenzo Dandola, Giorgia; Scagliarini, Alessandra; Prosperi, Santino; Battilani, Mara
2014-01-01
This study describes the sequence analysis of the H gene of 7 Canine distemper virus (CDV) strains identified in dogs in Italy between years 2002-2012. The phylogenetic analysis showed that the CDV strains belonged to 2 clusters: 6 viruses were identified as Arctic-like lineage and 1 as Europe 1 lineage. These data show a considerable prevalence of Arctic-like-CDVs in the analysed dogs. The dogs and the 3 viruses more recently identified showed 4 distinctive amino acid mutations compared to all other Arctic CDVs.
Profiles of family needs of children and youth with cerebral palsy.
Almasri, N; Palisano, R J; Dunst, C; Chiarello, L A; O'Neil, M E; Polansky, M
2012-11-01
To identify profiles of family needs of families of children and youth with cerebral palsy (CP), and determine whether profile membership is related to child, family and service characteristics. Participants were mostly mothers (80%) of 579 children and youth with CP. A family member completed a modified version of the Family Needs Survey and questionnaires about their child, family and services. Research assistants determined the Gross Motor Function Classification System levels. K-means cluster analysis identified profiles of needs. Cluster membership was analysed to examine differences in clusters based on selected characteristics. Four profiles of needs were identified: Low needs, Needs related to community and financial resources, Needs related to child health condition and High needs. Profile membership was differentiated based on child/youth gross motor function, adaptive behaviour, family relationships, family income, access and effort to co-ordinate services. Despite heterogeneity among individuals with CP and their families, four profiles of family needs were identified. In total, 51% of families had low needs suggesting that they are effectively managing their children's health conditions while 11% of families had high needs that may require high levels of services and supports. Service providers are encouraged to partner with families, provide anticipatory guidance and co-ordinate services. © 2011 Blackwell Publishing Ltd.
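For readers unfamiliar with k-means, the following is a minimal sketch of the generic algorithm (Lloyd's iteration), not the authors' analysis. The two-dimensional "need scores", the starting centroids, and the function name `kmeans` are all invented for illustration; the study's software, variables, and initialisation are not stated in the abstract.

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm with caller-supplied starting centroids.
    Returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical scores: (community/financial needs, child-health needs) per family.
scores = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centres = kmeans(scores, centroids=[(0, 0), (10, 10)])
print(labels, centres)
```

In practice the number of clusters and the starting centroids strongly affect the result, so analyses of this kind are usually repeated with several initialisations and cluster counts.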
Eckert, Andrew J; van Heerwaarden, Joost; Wegrzyn, Jill L; Nelson, C Dana; Ross-Ibarra, Jeffrey; González-Martínez, Santíago C; Neale, David B
2010-07-01
Natural populations of forest trees exhibit striking phenotypic adaptations to diverse environmental gradients, thereby making them appealing subjects for the study of genes underlying ecologically relevant phenotypes. Here, we use a genome-wide data set of single nucleotide polymorphisms genotyped across 3059 functional genes to study patterns of population structure and identify loci associated with aridity across the natural range of loblolly pine (Pinus taeda L.). Overall patterns of population structure, as inferred using principal components and Bayesian cluster analyses, were consistent with three genetic clusters likely resulting from expansions out of Pleistocene refugia located in Mexico and Florida. A novel application of association analysis, which removes the confounding effects of shared ancestry on correlations between genetic and environmental variation, identified five loci correlated with aridity. These loci were primarily involved with abiotic stress response to temperature and drought. A unique set of 24 loci was identified as F(ST) outliers on the basis of the genetic clusters identified previously and after accounting for expansions out of Pleistocene refugia. These loci were involved with a diversity of physiological processes. Identification of nonoverlapping sets of loci highlights the fundamental differences implicit in the use of either method and suggests a pluralistic, yet complementary, approach to the identification of genes underlying ecologically relevant phenotypes.
An atypical anxious-impulsive pattern of social anxiety disorder in an adult clinical population.
Mörtberg, Ewa; Tillfors, Maria; van Zalk, Nejra; Kerr, Margaret
2014-08-01
An atypical subgroup of Social Anxiety Disorder (SAD) with impulsive rather than inhibited traits has recently been reported. The current study examined whether such an atypical subgroup could be identified in a clinical population of 84 adults with SAD. The temperament dimensions harm avoidance and novelty seeking of the Temperament and Character Inventory, and the Liebowitz Social Anxiety Scale were used in cluster analyses. The identified clusters were compared on depressive symptoms, the character dimension self-directedness, and treatment outcome. Among the six identified clusters, 24% of the sample had atypical characteristics, demonstrating mainly generalized SAD in combination with coexisting traits of inhibition and impulsivity. As additional signs of severity, this group showed low self-directedness and high levels of depressive symptoms. We also identified a typically inhibited subgroup comprising generalized SAD with high levels of harm avoidance and low levels of novelty seeking, with a similar clinical severity as the atypical subgroup. Thus, higher levels of harm avoidance and social anxiety in combination with higher or lower levels of novelty seeking and low self-directedness seem to contribute to a more severe clinical picture. Post hoc examination of the treatment outcome in these subgroups showed that only 20 to 30% achieved clinically significant change. © 2014 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
Comparison of organs' shapes with geometric and Zernike 3D moments.
Broggio, D; Moignier, A; Ben Brahim, K; Gardumi, A; Grandgirard, N; Pierrat, N; Chea, M; Derreumaux, S; Desbrée, A; Boisserie, G; Aubert, B; Mazeron, J-J; Franck, D
2013-09-01
The morphological similarity of organs is studied with feature vectors based on geometric and Zernike 3D moments. In particular, it is investigated whether outliers and average models can be identified. For this purpose, the relative proximity to the mean feature vector is defined, and principal coordinate and clustering analyses are also performed. To study the consistency and usefulness of this approach, 17 liver and 76 heart voxel models from several sources are considered. In the liver case, models with similar morphological features are identified. For the limited number of studied cases, the liver of the ICRP male voxel model is identified as a better surrogate than the female one. For hearts, the clustering analysis shows that three heart shapes represent about 80% of the morphological variations. The relative proximity and clustering analysis rather consistently identify outliers and average models. For the two cases, identification of outliers and surrogates of average models is rather robust. However, deeper classification of morphological features is subject to caution and can only be performed after cross analysis of at least two kinds of feature vectors. Finally, the Zernike moments contain all the information needed to reconstruct the studied objects and thus appear as a promising tool to derive statistical organ shapes. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Schütte, B; El Hajj, N; Kuhtz, J; Nanda, I; Gromoll, J; Hahn, T; Dittrich, M; Schorsch, M; Müller, T; Haaf, T
2013-11-01
Aberrant sperm DNA methylation patterns, mainly in imprinted genes, have been associated with male subfertility and oligospermia. Here, we performed a genome-wide methylation analysis in sperm samples representing a wide range of semen parameters. Sperm DNA samples of 38 males attending a fertility centre were analysed with Illumina HumanMethylation27 BeadChips, which quantify methylation of >27 000 CpG sites in cis-regulatory regions of almost 15 000 genes. In an unsupervised analysis of methylation of all analysed sites, the patient samples clustered into a major and a minor group. The major group clustered with samples from normozoospermic healthy volunteers and, thus, may more closely resemble the normal situation. When correlating the clusters with semen and clinical parameters, the sperm counts were significantly different between groups with the minor group exhibiting sperm counts in the low normal range. A linear model identified almost 3000 CpGs with significant methylation differences between groups. Functional analysis revealed a broad gain of methylation in spermatogenesis-related genes and a loss of methylation in inflammation- and immune response-related genes. Quantitative bisulfite pyrosequencing validated differential methylation in three of five significant candidate genes on the array. Collectively, we identified a subgroup of sperm samples for assisted reproduction with sperm counts in the low normal range and broad methylation changes (affecting approximately 10% of analysed CpG sites) in specific pathways, most importantly spermatogenesis-related genes. We propose that epigenetic analysis can supplement traditional semen parameters and has the potential to provide new insights into the aetiology of male subfertility. © 2013 American Society of Andrology and European Academy of Andrology.
Tait, Luke; Wedgwood, Kyle; Tsaneva-Atanasova, Krasimira; Brown, Jon T; Goodfellow, Marc
2018-07-14
The entorhinal cortex is a crucial component of our memory and spatial navigation systems and is one of the first areas to be affected in dementias featuring tau pathology, such as Alzheimer's disease and frontotemporal dementia. Electrophysiological recordings from principal cells of medial entorhinal cortex (layer II stellate cells, mEC-SCs) demonstrate a number of key identifying properties including subthreshold oscillations in the theta (4-12 Hz) range and clustered action potential firing. These single cell properties are correlated with network activity such as grid firing and coupling between theta and gamma rhythms, suggesting they are important for spatial memory. As such, experimental models of dementia have revealed disruption of organised dorsoventral gradients in clustered action potential firing. To better understand the mechanisms underpinning these different dynamics, we study a conductance based model of mEC-SCs. We demonstrate that the model, driven by extrinsic noise, can capture quantitative differences in clustered action potential firing patterns recorded from experimental models of tau pathology and healthy animals. The differential equation formulation of our model allows us to perform numerical bifurcation analyses in order to uncover the dynamic mechanisms underlying these patterns. We show that clustered dynamics can be understood as subcritical Hopf/homoclinic bursting in a fast-slow system where the slow sub-system is governed by activation of the persistent sodium current and inactivation of the slow A-type potassium current. In the full system, we demonstrate that clustered firing arises via flip bifurcations as conductance parameters are varied. Our model analyses confirm the experimentally suggested hypothesis that the breakdown of clustered dynamics in disease occurs via increases in AHP conductance. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Paternal age related schizophrenia (PARS): Latent subgroups detected by k-means clustering analysis.
Lee, Hyejoo; Malaspina, Dolores; Ahn, Hongshik; Perrin, Mary; Opler, Mark G; Kleinhaus, Karine; Harlap, Susan; Goetz, Raymond; Antonius, Daniel
2011-05-01
Paternal age related schizophrenia (PARS) has been proposed as a subgroup of schizophrenia with distinct etiology, pathophysiology and symptoms. This study uses a k-means clustering analysis approach to generate hypotheses about differences between PARS and other cases of schizophrenia. We studied PARS (operationally defined as not having any family history of schizophrenia among first and second-degree relatives and fathers' age at birth ≥ 35 years) in a series of schizophrenia cases recruited from a research unit. Data were available on demographic variables, symptoms (Positive and Negative Syndrome Scale; PANSS), cognitive tests (Wechsler Adult Intelligence Scale-Revised; WAIS-R) and olfaction (University of Pennsylvania Smell Identification Test; UPSIT). We conducted a series of k-means clustering analyses to identify clusters of cases containing high concentrations of PARS. Two analyses generated clusters with high concentrations of PARS cases. The first analysis (N=136; PARS=34) revealed a cluster containing 83% PARS cases, in which the patients showed a significant discrepancy between verbal and performance intelligence. The mean paternal and maternal ages were 41 and 33, respectively. The second analysis (N=123; PARS=30) revealed a cluster containing 71% PARS cases, of which 93% were females; the mean age of onset of psychosis, at 17.2, was significantly early. These results strengthen the evidence that PARS cases differ from other patients with schizophrenia. Hypothesis-generating findings suggest that features of PARS may include a discrepancy between verbal and performance intelligence, and in females, an early age of onset. These findings provide a rationale for separating these phenotypes from others in future clinical, genetic and pathophysiologic studies of schizophrenia and in considering responses to treatment. Copyright © 2011 Elsevier B.V. All rights reserved.
Konkolÿ Thege, Barna; Hodgins, David C; Wild, T Cameron
2016-12-01
Background and aims The aims of this study were (a) to describe the prevalence of single versus multiple addiction problems in a large representative sample and (b) to identify distinct subgroups of people experiencing substance-related and behavioral addiction problems. Methods A random sample of 6,000 respondents from Alberta, Canada, completed survey items assessing self-attributed problems experienced in the past year with four substances (alcohol, tobacco, marijuana, and cocaine) and six behaviors (gambling, eating, shopping, sex, video gaming, and work). Hierarchical cluster analyses were used to classify patterns of co-occurring addiction problems on an analytic subsample of 2,728 respondents (1,696 women and 1032 men; M age = 45.1 years, SD age = 13.5 years) who reported problems with one or more of the addictive behaviors in the previous year. Results In the total sample, 49.2% of the respondents reported zero, 29.8% reported one, 13.1% reported two, and 7.9% reported three or more addiction problems in the previous year. Cluster-analytic results suggested a 7-group solution. Members of most clusters were characterized by multiple addiction problems; the average number of past year addictive behaviors in cluster members ranged between 1 (Cluster II: excessive eating only) and 2.5 (Cluster VII: excessive video game playing with the frequent co-occurrence of smoking, excessive eating and work). Discussion and conclusions Our findings replicate previous results indicating that about half of the adult population struggles with at least one excessive behavior in a given year; however, our analyses revealed a higher number of co-occurring addiction clusters than typically found in previous studies.
Genetic Structure of Bluefin Tuna in the Mediterranean Sea Correlates with Environmental Variables
Riccioni, Giulia; Stagioni, Marco; Landi, Monica; Ferrara, Giorgia; Barbujani, Guido; Tinti, Fausto
2013-01-01
Background Atlantic Bluefin Tuna (ABFT) shows complex demography and ecological variation in the Mediterranean Sea. Genetic surveys have detected significant, although weak, signals of population structuring; catch series analyses and tagging programs identified complex ABFT spatial dynamics and migration patterns. Here, we tested the hypothesis that the genetic structure of the ABFT in the Mediterranean is correlated with mean surface temperature and salinity. Methodology We used six samples collected from Western and Central Mediterranean integrated with a new sample collected from the recently identified easternmost reproductive area of Levantine Sea. To assess population structure in the Mediterranean we used a multidisciplinary framework combining classical population genetics, spatial and Bayesian clustering methods and a multivariate approach based on factor analysis. Conclusions FST analysis and Bayesian clustering methods detected several subpopulations in the Mediterranean, a result also supported by multivariate analyses. In addition, we identified significant correlations of genetic diversity with mean salinity and surface temperature values revealing that ABFT is genetically structured along two environmental gradients. These results suggest that a preference for some spawning habitat conditions could contribute to shape ABFT genetic structuring in the Mediterranean. However, further studies should be performed to assess to what extent ABFT spawning behaviour in the Mediterranean Sea can be affected by environmental variation. PMID:24260341
Franz, M; Salize, H J; Lujic, C; Koch, E; Gallhofer, B; Jacke, C O
2014-02-01
To identify differences and similarities between immigrants of Turkish origin and native German patients in therapeutically relevant dimensions such as subjective illness perceptions and personality traits. Turkish and native German mentally disordered in-patients were interviewed in three psychiatric clinics in Hessen, Germany. The Revised Illness Perception Questionnaire (IPQ-Revised) and the Neuroticism-Extraversion-Openness Five-Factor Inventory (NEO-FFI) were used. Differences of scales and similarities by k-means cluster analyses were estimated. Of the 362 total patients, 227 (123 immigrants and 104 native Germans) were included. Neither demographic nor clinical differences were detected. Socioeconomic gradients and differences on IPQ-R scales were identified. For each ethnicity, the cluster analysis identified four different patient types based on NEO-FFI and IPQ-R scales. The patient types of each ethnicity appeared to be very similar in their structure, but they differed solely in the magnitude of the cluster means on included subscales according to ethnicity. When subjective illness perceptions and personality traits are considered together, basic patient types emerge independent of the ethnicity. Thus, the ethnical impact on patient types diminishes and a convergence was detected. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Detecting subject-specific activations using fuzzy clustering
Seghier, Mohamed L.; Friston, Karl J.; Price, Cathy J.
2007-01-01
Inter-subject variability in evoked brain responses is attracting attention because it may reflect important variability in structure–function relationships over subjects. This variability could be a signature of degenerate (many-to-one) structure–function mappings in normal subjects or reflect changes that are disclosed by brain damage. In this paper, we describe a non-iterative fuzzy clustering algorithm (FCP: fuzzy clustering with fixed prototypes) for characterizing inter-subject variability in between-subject or second-level analyses of fMRI data. The approach identifies the contribution of each subject to response profiles in voxels surviving a classical F-statistic criterion. The output identifies subjects who drive activation in specific cortical regions (local effects) or in voxels distributed across neural systems (global effects). The sensitivity of the approach was assessed in 38 normal subjects performing an overt naming task. FCP revealed that several subjects had either abnormally high or abnormally low responses. FCP may be particularly useful for characterizing outlier responses in rare patients or heterogeneous populations. In these cases, atypical activations may not be detected by standard tests, under parametric assumptions. The advantage of using FCP is that it searches all voxels systematically and can identify atypical activation patterns in a quantitative and unsupervised manner. PMID:17478103
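The abstract does not give the FCP equations, so the sketch below is only an assumption about what a non-iterative fuzzy assignment can look like: it evaluates the textbook fuzzy c-means membership rule once, with the prototypes held fixed rather than re-estimated. The function name `fuzzy_memberships`, the fuzziness exponent `m`, and the toy profiles are hypothetical.

```python
from math import dist

def fuzzy_memberships(profile, prototypes, m=2.0):
    """Degree of membership of one response profile in each fixed prototype,
    using the standard fuzzy c-means membership formula (assumed here, not
    taken from the paper). The returned memberships sum to 1."""
    distances = [dist(profile, proto) for proto in prototypes]
    # A profile that coincides with a prototype belongs fully to it.
    if any(d == 0.0 for d in distances):
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]

# Toy example: one voxel's response profile against two fixed prototypes.
print(fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)]))
```

Because the prototypes never move, a single pass over the data is enough, which is what makes an assignment of this kind non-iterative.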
Kent, Peter; Jensen, Rikke K; Kongsted, Alice
2014-10-02
There are various methodological approaches to identifying clinically important subgroups and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). There is a scarcity of head-to-head comparisons that can inform the choice of which clustering method might be suitable for particular clinical datasets and research questions. Therefore, the aim of this study was to perform a head-to-head comparison of three commonly available methods (SPSS TwoStep CA, Latent Gold LCA and SNOB LCA). The performance of these three methods was compared: (i) quantitatively using the number of subgroups detected, the classification probability of individuals into subgroups, the reproducibility of results, and (ii) qualitatively using subjective judgments about each program's ease of use and interpretability of the presentation of results. We analysed five real datasets of varying complexity in a secondary analysis of data from other research projects. Three datasets contained only MRI findings (n = 2,060 to 20,810 vertebral disc levels), one dataset contained only pain intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed testing the ability of these clustering methods to detect subgroups and correctly classify individuals when subgroup membership was known. The results from the real clinical datasets indicated that the number of subgroups detected varied, the certainty of classifying individuals into those subgroups varied, the findings had perfect reproducibility, some programs were easier to use and the interpretability of the presentation of their findings also varied.
The results from the artificial datasets indicated that all three clustering methods showed a near-perfect ability to detect known subgroups and correctly classify individuals into those subgroups. Our subjective judgement was that Latent Gold offered the best balance of sensitivity to subgroups, ease of use and presentation of results with these datasets but we recognise that different clustering methods may suit other types of data and clinical research questions.
NASA Astrophysics Data System (ADS)
Mehmood, S.; Ashfaq, M.; Evans, K. J.; Black, R. X.; Hsu, H. H.
2017-12-01
Extreme precipitation during the summer season has shown an increasing trend across South Asia in recent decades, causing an exponential increase in weather-related losses. Here we combine a cluster analysis technique (Agglomerative Hierarchical Clustering) with a Lagrangian-based moisture analysis technique to investigate potential commonalities in the characteristics of the large-scale meteorological patterns (LSMP) and moisture anomalies associated with the observed extreme precipitation events, and their representation in the Department of Energy model ACME. Using precipitation observations from the Indian Meteorological Department (IMD) and Asian Precipitation Highly Resolved Observational Data Integration Towards Evaluation (APHRODITE), and atmospheric variables from the ERA-Interim reanalysis, we first identify LSMP in both the upper and lower troposphere that are responsible for widespread precipitation extreme events during the 1980-2015 period. For each of the selected extreme events, we perform moisture source analyses to identify major evaporative sources that sustain anomalous moisture supply during the course of the event, with a particular focus on local terrestrial moisture recycling. Further, we perform similar analyses on two sets of five-member ensembles of the ACME model (1-degree and ¼-degree) to investigate the ability of the ACME model to simulate precipitation extremes associated with each of the LSMP and the associated anomalous moisture sourcing from each of the terrestrial and oceanic evaporative regions. Comparison of low- and high-resolution model configurations provides insight into the influence of horizontal grid spacing on the simulation of extreme precipitation and its governing mechanisms.
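Agglomerative hierarchical clustering, the technique named in this abstract, can be sketched in a few lines. The abstract states neither the linkage rule nor the distance measure, so average linkage with Euclidean distance is assumed here, and the short tuples stand in for flattened anomaly fields; the function name `agglomerative` and the data are hypothetical.

```python
from math import dist

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (linkage choice is an
    assumption). Returns a list of clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        pairs = [(linkage(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        _, x, y = min(pairs)
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Toy stand-ins for event anomaly patterns (two values per event).
events = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
print(agglomerative(events, n_clusters=3))
```

Each resulting group would then be averaged to give one composite pattern per cluster; how many clusters to keep is a separate judgment that this sketch leaves to the caller.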
Park, Nan Sook; Jang, Yuri; Lee, Beom S; Ko, Jung Eun; Haley, William E; Chiriboga, David A
2015-01-01
In the context of social convoy theory, the purposes of the study were (a) to identify an empirical typology of the social networks evident in older Korean immigrants and (b) to examine its association with self-rated health and depressive symptoms. The sample consisted of 1,092 community-dwelling older Korean immigrants in Florida and New York. Latent class analyses were conducted to identify the optimal social network typology based on 8 indicators of interpersonal relationships and activities. Bivariate and multivariate analyses were conducted to examine how the identified social network typology was associated with self-rating of health and depressive symptoms. Results from the latent class analysis identified 6 clusters as being most optimal, and they were named diverse, unmarried/diverse, married/coresidence, family focused, unmarried/restricted, and restricted. Memberships in the clusters of diverse and married/coresidence were significantly associated with more favorable ratings of health and lower levels of depressive symptoms. Notably, no distinct network solely composed of friends was identified in the present sample of older immigrants; this may reflect the disruptions in social convoys caused by immigration. The findings of this study promote our understanding of the unique patterns of social connectedness in older immigrants. © The Author 2013. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
McAllister, Christine A; Miller, Allison J
2016-07-01
Autopolyploidy, genome duplication within a single lineage, can result in multiple cytotypes within a species. Geographic distributions of cytotypes may reflect the evolutionary history of autopolyploid formation and subsequent population dynamics including stochastic (drift) and deterministic (differential selection among cytotypes) processes. Here, we used a population genomic approach to investigate whether autopolyploidy occurred once or multiple times in Andropogon gerardii, a widespread, North American grass with two predominant cytotypes. Genotyping by sequencing was used to identify single nucleotide polymorphisms (SNPs) in individuals collected from across the geographic range of A. gerardii. Two independent approaches to SNP calling were used: the reference-free UNEAK pipeline and a reference-guided approach based on the sequenced Sorghum bicolor genome. SNPs generated using these pipelines were analyzed independently with genetic distance and clustering. Analyses of the two SNP data sets showed very similar patterns of population-level clustering of A. gerardii individuals: a cluster of A. gerardii individuals from the southern Plains, a northern Plains cluster, and a western cluster. Groupings of individuals corresponded to geographic localities regardless of cytotype: 6x and 9x individuals from the same geographic area clustered together. SNPs generated using reference-guided and reference-free pipelines in A. gerardii yielded unique subsets of genomic data. Both data sets suggest that the 9x cytotype in A. gerardii likely evolved multiple times from 6x progenitors across the range of the species. Genomic approaches like GBS and diverse bioinformatics pipelines used here facilitate evolutionary analyses of complex systems with multiple ploidy levels. © 2016 Botanical Society of America.
ERIC Educational Resources Information Center
Witherspoon, Dawn; Schotland, Marieka; Way, Niobe; Hughes, Diane
2009-01-01
Cluster analyses and hierarchical linear modeling were used to investigate the impact of perceptions of connectedness to family, school, and neighborhood contexts on academic and psycho-social outcomes for 437 urban ethnically diverse adolescents. Five profiles of connectedness to family, school, and neighborhood were identified. Two profiles were…
Classifying Autism Spectrum Disorders by ADI-R: Subtypes or Severity Gradient?
ERIC Educational Resources Information Center
Cholemkery, Hannah; Medda, Juliane; Lempp, Thomas; Freitag, Christine M.
2016-01-01
To reduce phenotypic heterogeneity of Autism spectrum disorders (ASD) and add to the current diagnostic discussion this study aimed at identifying clinically meaningful ASD subgroups. Cluster analyses were used to describe empirically derived groups based on the Autism Diagnostic Interview-revised (ADI-R) in a large sample of n = 463 individuals…
Relationship between Attitudes and Indicators of Obesity for Midlife Women
ERIC Educational Resources Information Center
Sudo, Noriko; Degeneffe, Dennis; Vue, Houa; Merkle, Emily; Kinsey, Jean; Ghosh, Koel; Reicks, Marla
2009-01-01
This study uses segmentation analyses to identify five distinct subgroups of U.S. midlife women (n = 200) based on their prevailing attitudes toward food and its preparation and consumption. Mean age of the women is 46 years and they are mostly White (86%), highly educated, and employed. Attitude segments (clusters of women sharing similar…
ERIC Educational Resources Information Center
Collins, Brian Andrew; O'Connor, Erin Eileen; Supplee, Lauren; Shaw, Daniel S.
2017-01-01
The authors identified trajectories of teacher-child relationship conflict and closeness from Grades 1 to 6, and associations between these trajectories and externalizing and internalizing behaviors at 11 years old among low-income, urban boys (N = 262). There were three main findings. Nagin cluster analyses indicated five trajectories for…
Clarkson, John P.; Warmington, Rachel J.; Walley, Peter G.; Denton-Giles, Matthew; Barbetti, Martin J.; Brodal, Guro; Nordskog, Berit
2017-01-01
Sclerotinia species are important fungal pathogens of a wide range of crops and wild host plants. While the biology and population structure of Sclerotinia sclerotiorum have been well studied, little information is available for the related species S. subarctica. In this study, Sclerotinia isolates were collected from different crop plants and the wild host Ranunculus ficaria (meadow buttercup) in England, Scotland, and Norway to determine the incidence of Sclerotinia subarctica and examine the population structure of this pathogen for the first time. Incidence was very low in England, comprising only 4.3% of isolates, while moderate and high incidence of S. subarctica was identified in Scotland and Norway, comprising 18.3 and 48.0% of isolates respectively. Characterization with eight microsatellite markers identified 75 haplotypes within a total of 157 isolates over the three countries, with a few haplotypes in Scotland and Norway sampled at a higher frequency than the rest across multiple locations and host plants. In total, eight microsatellite haplotypes were shared between Scotland and Norway while none were shared with England. Bayesian and principal component analyses revealed common ancestry and clustering of Scottish and Norwegian S. subarctica isolates while English isolates were assigned to a separate population cluster and exhibited low diversity indicative of isolation. Population structure was also examined for S. sclerotiorum isolates from England, Scotland, Norway, and Australia using microsatellite data, including some from a previous study in England. In total, 484 haplotypes were identified within 800 S. sclerotiorum isolates with just 15 shared between England and Scotland and none shared between any other countries. Bayesian and principal component analyses revealed a common ancestry and clustering of the English and Scottish isolates while Norwegian and Australian isolates were assigned to separate clusters. Furthermore, sequencing part of the intergenic spacer (IGS) region of the rRNA gene resulted in 26 IGS haplotypes within 870 S. sclerotiorum isolates, nine of which had not been previously identified and two of which were also widely distributed across different countries. S. subarctica therefore has a multiclonal population structure similar to S. sclerotiorum, but has a different ancestry and distribution across England, Scotland, and Norway. PMID:28421039
The most metal-poor Galactic globular cluster: the first spectroscopic observations of ESO280-SC06
NASA Astrophysics Data System (ADS)
Simpson, Jeffrey D.
2018-07-01
We present the first spectroscopic observations of the very metal-poor Milky Way globular cluster ESO280-SC06. Using spectra acquired with the 2dF/AAOmega spectrograph on the Anglo-Australian Telescope, we have identified 13 members of the cluster, and estimate from their infrared calcium triplet lines that the cluster has a metallicity of [Fe/H] = -2.48^{+0.06}_{-0.11}. This would make it the most metal-poor globular cluster known in the Milky Way. This result was verified with comparisons to three other metal-poor globular clusters that had been observed and analysed in the same manner. We also present new photometry of the cluster from EFOSC2 and SkyMapper and confirm that the cluster is located 22.9 ± 2.1 kpc from the Sun and 15.2 ± 2.1 kpc from the Galactic Centre, and has a radial velocity of 92.5^{+2.4}_{-1.6} km s^{-1}. These new data find the cluster to have a radius about half that previously estimated, and we find that the cluster has a dynamical mass of (12 ± 2) × 10^3 M⊙. Unfortunately, we lack reliable proper motions to fully characterize its orbit about the Galaxy. Intriguingly, the photometry suggests that the cluster lacks a well-populated horizontal branch, something that has not been observed in a cluster so ancient or metal poor.
Weerasekera, Manjula M; Sissons, Chris H; Wong, Lisa; Anderson, Sally A; Holmes, Ann R; Cannon, Richard D
2017-10-01
The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeast present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean ± SD = 29.3 ± 4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive with concentrations up to 10^3 cfu/mL. Candida albicans was the predominant species in saliva samples although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Montemagni, Cristiana; Frieri, Tiziana; Villari, Vincenzo; Rocca, Paola
2018-06-01
The purpose of the study was to identify homogeneous subgroups, based upon achievement of two functional milestones (marriage and employment) and Global Assessment of Functioning (GAF) score, in a sample of 848 acute patients admitted to the Psychiatric Emergency Service (PES) of the Città della Salute e della Scienza di Torino during a 24-month period. A two-step cluster analysis, using GAF total score and the achievements in the two milestones as input data, was performed. To examine whether the identified subgroups differed in external variables that were not included in the clustering process, and consequently to validate the functional profiles found, chi-square tests for categorical variables and analyses of variance (ANOVA) for continuous variables were performed. Five clusters were found. Employed patients (Clusters 4 and 5) had more years of education, less illness chronicity (shorter duration of illness and lower proportion of previous voluntary hospitalizations), lower use of mental health resources in the last year yet higher treatment adherence, larger network size, and higher ordinary discharge. Married inpatients (Clusters 3 and 5) had lower frequencies of substance abuse. The remarkably high rate of unemployment in this inpatient sample, and the evidence of associations between unemployment and poorer functioning, argue for further research and development of evidence-based supported employment programs that put diligent effort into helping people obtain work quickly and sustain it; they may also help to reduce health care service use among that clientele.
Typologies of Social Support and Associations with Mental Health Outcomes Among LGBT Youth.
McConnell, Elizabeth A; Birkett, Michelle A; Mustanski, Brian
2015-03-01
Lesbian, gay, bisexual, and transgender (LGBT) youth show increased risk for a number of negative mental health outcomes, which research has linked to minority stressors such as victimization. Further, social support promotes positive mental health outcomes for LGBT youth, and different sources of social support show differential relationships with mental health outcomes. However, little is known about how combinations of different sources of support impact mental health. In the present study, we identify clusters of family, peer, and significant other social support and then examine demographic and mental health differences by cluster in an analytic sample of 232 LGBT youth between the ages of 16 and 20 years. Using k-means cluster analysis, three social support cluster types were identified: high support (44.0% of participants), low support (21.6%), and non-family support (34.5%). A series of chi-square tests were used to examine demographic differences between these clusters, which were found for socio-economic status (SES). Regression analyses indicated that, while controlling for victimization, individuals within the three clusters showed different relationships with multiple mental health outcomes: loneliness, hopelessness, depression, anxiety, somatization, general symptom severity, and symptoms of major depressive disorder (MDD). Findings suggest the combinations of sources of support LGBT youth receive are related to their mental health. Higher SES youth are more likely to receive support from family, peers, and significant others. For most mental health outcomes, family support appears to be an especially relevant and important source of support to target for LGBT youth.
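The k-means step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis: the six (family, peer, significant-other) support scores are invented for illustration, the function name `kmeans` is hypothetical, and the starting centroids are simply the first k points so that the run is deterministic.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
    return labels, centroids

# Hypothetical scores on a 1-7 scale: (family, peer, significant other).
scores = [
    (6.5, 6.0, 6.5),  # high support
    (2.0, 2.0, 1.5),  # low support
    (2.0, 6.0, 6.5),  # non-family support
    (6.0, 6.5, 6.0),
    (1.5, 2.5, 2.0),
    (1.5, 6.5, 6.0),
]
labels, centroids = kmeans(scores, k=3)
print(labels)
```

In practice the result depends on the starting centroids, so real analyses usually repeat the algorithm from several random starts and keep the best solution.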
Genome-Wide Analysis of Type VI System Clusters and Effectors in Burkholderia Species.
Nguyen, Thao Thi; Lee, Hyun-Hee; Park, Inmyoung; Seo, Young-Su
2018-02-01
Type VI secretion system (T6SS) has been discovered in a variety of gram-negative bacteria as a versatile weapon to stimulate the killing of eukaryotic cells or prokaryotic competitors. Type VI secretion effectors (T6SEs) are well known as key virulence factors for important pathogenic bacteria. In many Burkholderia species, T6SS has evolved as the most complicated secretion pathway with distinguished types to translocate diverse T6SEs, suggesting their essential roles in this genus. Here we attempted to detect and characterize T6SSs and potential T6SEs in target genomes of plant-associated and environmental Burkholderia species based on computational analyses. In total, 66 potential functional T6SS clusters were found in 30 target Burkholderia bacterial genomes, of which 33% possess three or four clusters. The core proteins in each cluster were specified and phylogenetic trees of three components (i.e., TssC, TssD, TssL) were constructed to elucidate the relationship among the identified T6SS clusters. Next, we identified 322 potential T6SEs in the target genomes based on homology searches and explored the important domains conserved in effector candidates. In addition, using a screening approach based on the profile hidden Markov model (pHMM) of T6SEs that possess markers for type VI effectors (MIX motif) (MIX T6SEs), 57 proteins that were not included in the training datasets were recognized as novel MIX T6SE candidates from the Burkholderia species. This approach could be useful to identify potential T6SEs from other bacterial genomes.
Amro, Amin; Waldum, Bård; von der Lippe, Nanna; Brekke, Fredrik Barth; Dammen, Toril; Miaskowski, Christine; Os, Ingrid
2015-01-01
Patients with end-stage renal disease on dialysis have reduced survival rates compared with the general population. Symptoms are frequent in dialysis patients, and a symptom cluster is defined as two or more related co-occurring symptoms. The aim of this study was to explore the associations between symptom clusters and mortality in dialysis patients. In a prospective observational cohort study of dialysis patients (n = 301), Kidney Disease and Quality of Life Short Form and Beck Depression Inventory questionnaires were administered. To generate symptom clusters, principal component analysis with varimax rotation was used on 11 kidney-specific self-reported physical symptoms. A Beck Depression Inventory score of 16 or greater was defined as clinically significant depressive symptoms. Physical and mental component summary scores were generated from Short Form-36. Multivariate Cox regression analysis was used for the survival analysis; Kaplan-Meier curves and log-rank statistics were applied to compare survival rates between the groups. Three different symptom clusters were identified; one included loading of several uremic symptoms. In multivariate analyses and after adjustment for health-related quality of life and depressive symptoms, the worst perceived quartile of the "uremic" symptom cluster independently predicted all-cause mortality (hazard ratio 2.47, 95% CI 1.44-4.22, P = 0.001) compared with the other quartiles during a follow-up period that ranged from four to 52 months. The two other symptom clusters ("neuromuscular" and "skin") or the individual symptoms did not predict mortality. Clustering of uremic symptoms predicted mortality. Assessing co-occurring symptoms rather than single symptoms may help to identify dialysis patients at high risk for mortality. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
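The Kaplan-Meier curves mentioned in the abstract above rest on the product-limit estimator, S(t) = Π (1 − d_i / n_i) taken over event times up to t, where d_i is the number of deaths and n_i the number still at risk. The sketch below is only an illustration of that estimator: the follow-up times and event flags are invented, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    events[i] is 1 for a death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored both leave the risk set
    return curve

# Hypothetical follow-up in months; 1 = died, 0 = censored.
months = [4, 10, 10, 20, 30, 52]
died = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Comparing two such curves (for example, the worst quartile against the rest) would additionally require a log-rank test, which is not shown here.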
Ji, Yan-Bo; Bo, Chun-Lu; Xue, Xiu-Juan; Weng, En-Ming; Gao, Guang-Chao; Dai, Bei-Bei; Ding, Kai-Wen; Xu, Cui-Ping
2017-12-01
Pain, fatigue, depression, and sleep disturbance are common in patients with cancer and usually co-occur as a symptom cluster. However, the mechanism underlying this symptom cluster is unclear. This study aimed to identify subgroups of cluster symptoms, compare demographic and clinical characteristics between subgroups, and examine the associations between inflammatory cytokines and cluster symptoms. Participants were 170 Chinese inpatients with cancer from two tertiary hospitals. Inflammatory markers including interleukin-6 (IL-6), interleukin-1 receptor antagonist, and tumor necrosis factor alpha were measured. Intergroup differences and associations of inflammatory cytokines with the cluster symptoms were examined with one-way analyses of variance and logistic regression. Based on cluster analysis, participants were categorized into Subgroup 1 (all low symptoms), Subgroup 2 (low pain and moderate fatigue), or Subgroup 3 (moderate-to-high on all symptoms). The three subgroups differed significantly in Eastern Cooperative Oncology Group (ECOG) performance status, sex, residence, current treatment, education, economic status, and inflammatory cytokine levels (all P < 0.05). Compared with Subgroup 1, Subgroup 3 had a significantly poorer ECOG physical performance status and higher IL-6 levels, was more often treated with combined chemoradiotherapy, and was more likely to consist of rural residents. IL-6 and ECOG physical performance status were significantly associated with 1.246-fold (95% CI 1.114-1.396) and 31.831-fold (95% CI 6.017-168.385) increased risk of Subgroup 3. Our findings suggest that IL-6 levels are associated with cluster symptoms in cancer patients. Clinicians should identify patients at risk for more severe symptoms and formulate novel target interventions to improve symptom management. Copyright © 2017. Published by Elsevier Inc.
Borri, Marco; Schmidt, Maria A; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M; Partridge, Mike; Bhide, Shreerang A; Nutting, Christopher M; Harrington, Kevin J; Newbold, Katie L; Leach, Martin O
2015-01-01
To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
Application of Scan Statistics to Detect Suicide Clusters in Australia
Cheung, Yee Tak Derek; Spittal, Matthew J.; Williamson, Michelle Kate; Tung, Sui Jay; Pirkis, Jane
2013-01-01
Background: Suicide clustering occurs when multiple suicide incidents take place in a small area and/or within a short period of time. In spite of the multi-national research attention and particular efforts in preparing guidelines for tackling suicide clusters, the broader picture of epidemiology of suicide clustering remains unclear. This study aimed to develop techniques in using scan statistics to detect clusters, with the detection of suicide clusters in Australia as an example. Methods and Findings: Scan statistics were applied to detect clusters among suicides occurring between 2004 and 2008. Manipulation of parameter settings and change of area for scan statistics were performed to remedy shortcomings in existing methods. In total, 243 suicides out of 10,176 (2.4%) were identified as belonging to 15 suicide clusters. These clusters were mainly located in the Northern Territory, the northern part of Western Australia, and the northern part of Queensland. Among the 15 clusters, 4 (26.7%) were detected by both national and state cluster detections, 8 (53.3%) were only detected by the state cluster detection, and 3 (20%) were only detected by the national cluster detection. Conclusions: These findings illustrate that the majority of spatial-temporal clusters of suicide were located in the inland northern areas, with socio-economic deprivation and higher proportions of indigenous people. Discrepancies between national and state/territory cluster detection by scan statistics were due to the contrast of the underlying suicide rates across states/territories. Performing both small-area and large-area analyses, and applying multiple parameter settings may yield the maximum benefits for exploring clusters. PMID:23342098
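To make the idea of a scan statistic concrete, the sketch below scores candidate windows with the Poisson log-likelihood ratio commonly used in Kulldorff-style scans, LLR = c·ln(c/e) + (C − c)·ln((C − c)/(C − e)) when the observed count c exceeds the expected count e, and 0 otherwise. This is an assumption about the general technique, not a reconstruction of the study's settings: the regions, counts, and populations are invented, each "window" is a single region, and the Monte Carlo step needed for a p-value is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only windows with an excess of cases are of interest
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def relative_risk(c, e, total):
    """Risk inside the window divided by risk outside it."""
    return (c / e) / ((total - c) / (total - e))

# Hypothetical regions: name -> (observed cases, population).
regions = {"A": (30, 10_000), "B": (12, 20_000), "C": (8, 20_000)}
total_cases = sum(c for c, _ in regions.values())
total_pop = sum(p for _, p in regions.values())

scores = {}
for name, (cases, pop) in regions.items():
    expected = total_cases * pop / total_pop  # expected under uniform risk
    scores[name] = poisson_llr(cases, expected, total_cases)

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

A real scan evaluates many overlapping windows of varying size and then ranks the observed maximum against maxima from randomly relabelled data sets, which is where parameter choices such as the maximum window size come in.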
The structure of DSM-IV-TR personality disorder diagnoses in NESARC: a reanalysis.
Trull, Timothy J; Vergés, Alvaro; Wood, Phillip K; Sher, Kenneth J
2013-12-01
Cox, Clara, Worobec, and Grant (2012) recently presented results from a series of analyses aimed at identifying the factor structure underlying the DSM-IV-TR (APA, 2000) personality diagnoses assessed in the large NESARC study. Cox et al. (2012) concluded that the best fitting model was one that modeled three lower-order factors (the three clusters of PDs as outlined by DSM-IV-TR), which in turn loaded on a single PD higher-order factor. Our reanalyses of the NESARC Wave 1 and Wave 2 data for personality disorder diagnoses revealed that the best fitting model was that of a general PD factor that spans each of the ten DSM-IV PD diagnoses, and our reanalyses do not support the three-cluster hierarchical structure outlined by Cox et al. (2012) and DSM-IV-TR. Finally, we note the importance of modeling the Wave 2 assessment method factor in analyses of NESARC PD data.
Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat
2014-07-01
Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. The Sm/RNP cluster had a significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. The dsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10- and 20-year survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.
Stephens, D.W.; Wangsgard, J.B.
1988-01-01
A computer program, Numerical Taxonomy System of Multivariate Statistical Programs (NTSYS), was used with interfacing software to perform cluster analyses of phytoplankton data stored in the biological files of the U.S. Geological Survey. The NTSYS software performs various types of statistical analyses and is capable of handling a large matrix of data. Cluster analyses were done on phytoplankton data collected from 1974 to 1981 at four national Stream Quality Accounting Network stations in the Tennessee River basin. Analysis of the changes in clusters of phytoplankton genera indicated possible changes in the water quality of the French Broad River near Knoxville, Tennessee. At this station, the most common diatom groups indicated a shift in dominant forms with some of the less common diatoms being replaced by green and blue-green algae. There was a reduction in genera variability between 1974-77 and 1979-81 sampling periods. Statistical analysis of chloride and dissolved solids confirmed that concentrations of these substances were smaller in 1974-77 than in 1979-81. At Pickwick Landing Dam, the furthest downstream station used in the study, there was an increase in the number of genera of 'rare' organisms with time. The appearance of two groups of green and blue-green algae indicated that an increase in temperature or nutrient concentrations occurred from 1974 to 1981, but this could not be confirmed using available water quality data. Associations of genera forming the phytoplankton communities at three stations on the Tennessee River were found to be seasonal. Nodal analysis of combined data from all four stations used in the study did not identify any seasonal or temporal patterns during 1974-81. Cluster analysis using the NTSYS programs was effective in reducing the large phytoplankton data set to a manageable size and provided considerable insight into the structure of phytoplankton communities in the Tennessee River basin. Problems encountered using cluster analysis were the subjectivity introduced in the definition of meaningful clusters, and the lack of taxonomic identification to the species level. (Author's abstract)
Sudakin, Daniel L; Power, Laura E
2009-03-01
Geographic information systems and spatial scan statistics have been utilized to assess regional clustering of symptomatic pesticide exposures reported to a state Poison Control Center (PCC) during a single year. In the present study, we analyzed five subsequent years of PCC data to test whether there are significant geographic differences in pesticide exposure incidents resulting in serious (moderate, major, and fatal) medical outcomes. A PCC provided the data on unintentional pesticide exposures for the time period 2001-2005. The geographic location of the caller, the location where the exposure occurred, the exposure route, and the medical outcome were abstracted. There were 273 incidents resulting in moderate effects (n = 261), major effects (n = 10), or fatalities (n = 2). Spatial scan statistics identified a geographic area consisting of two adjacent counties (one urban, one rural), where statistically significant clustering of serious outcomes was observed. The relative risk of moderate, major, and fatal outcomes was 2.0 in this spatial cluster (p = 0.0005). PCC data, geographic information systems, and spatial scan statistics can identify clustering of serious outcomes from human exposure to pesticides. These analyses may be useful for public health officials to target preventive interventions. Further investigation is warranted to understand better the potential explanations for geographical clustering, and to assess whether preventive interventions have an impact on reducing pesticide exposure incidents resulting in serious medical outcomes.
Choque, Elodie; Klopp, Christophe; Valiere, Sophie; Raynal, José; Mathieu, Florence
2018-03-15
Black Aspergilli represent one of the most important fungal resources of primary and secondary metabolites for the biotechnological industry. Having several sequenced black Aspergilli genomes should allow targeting the production of certain metabolites with bioactive properties. In this study, we report the draft genome of a black Aspergillus, A. tubingensis G131, isolated from a French Mediterranean vineyard. This 35 Mb genome includes 10,994 predicted genes. A genomic-based discovery identifies 80 secondary metabolite biosynthetic gene clusters. Genomic sequences of these clusters were compared by BLAST against 3 chosen black Aspergilli genomes: A. tubingensis CBS 134.48, A. niger CBS 513.88 and A. kawachii IFO 4308. This comparison highlights different levels of cluster conservation between the four strains. It also allows identifying seven unique clusters in A. tubingensis G131. Moreover, the putative secondary metabolite clusters for asperazine and naphtho-gamma-pyrones production were proposed based on this genomic analysis. Key biosynthetic genes required for the production of 2 mycotoxins, ochratoxin A and fumonisin, are absent from this draft genome. Even though intergenic sequences of these mycotoxin biosynthetic pathways are present, they could not lead to the production of those mycotoxins by A. tubingensis G131. Functional and bioinformatics analyses of the A. tubingensis G131 genome highlight its potential for metabolite production, in particular for TAN-1612, asperazine and naphtho-gamma-pyrones presenting antioxidant, anticancer or antibiotic properties.
Weller, Daniel; Andrus, Alexis; Wiedmann, Martin; den Bakker, Henk C
2015-01-01
Sampling of seafood and dairy processing facilities in the north-eastern USA produced 18 isolates of Listeria spp. that could not be identified at the species-level using traditional phenotypic and genotypic identification methods. Results of phenotypic and genotypic analyses suggested that the isolates represent two novel species with an average nucleotide blast identity of less than 92% with previously described species of the genus Listeria. Phylogenetic analyses based on whole genome sequences, 16S rRNA gene and sigB gene sequences confirmed that the isolates represented by type strain FSL M6-0635(T) and FSL A5-0209 cluster phylogenetically with Listeria cornellensis. Phylogenetic analyses also showed that the isolates represented by type strain FSL A5-0281(T) cluster phylogenetically with Listeria riparia. The name Listeria booriae sp. nov. is proposed for the species represented by type strain FSL A5-0281(T) ( =DSM 28860(T) =LMG 28311(T)), and the name Listeria newyorkensis sp. nov. is proposed for the species represented by type strain FSL M6-0635(T) ( =DSM 28861(T) =LMG 28310(T)). Phenotypic and genotypic analyses suggest that neither species is pathogenic. © 2015 IUMS.
Features of asthma which provide meaningful insights for understanding the disease heterogeneity.
Deliu, M; Yavuz, T S; Sperrin, M; Belgrave, D; Sahiner, U M; Sackesen, C; Kalayci, O; Custovic, A
2018-01-01
Data-driven methods such as hierarchical clustering (HC) and principal component analysis (PCA) have been used to identify asthma subtypes, with inconsistent results. To develop a framework for the discovery of stable and clinically meaningful asthma subtypes. We performed HC in a rich data set from 613 asthmatic children, using 45 clinical variables (Model 1), and after PCA dimensionality reduction (Model 2). Clinical experts then identified a set of asthma features/domains which informed clusters in the two analyses. In Model 3, we reclustered the data using these features to ascertain whether this improved the discovery process. Cluster stability was poor in Models 1 and 2. Clinical experts highlighted four asthma features/domains which differentiated the clusters in two models: age of onset, allergic sensitization, severity, and recent exacerbations. In Model 3 (HC using these four features), cluster stability improved substantially. The cluster assignment changed, providing more clinically interpretable results. In a 5-cluster model, we labelled the clusters as: "Difficult asthma" (n = 132); "Early-onset mild atopic" (n = 210); "Early-onset mild non-atopic" (n = 153); "Late-onset" (n = 105); and "Exacerbation-prone asthma" (n = 13). Multinomial regression demonstrated that lung function was significantly diminished among children with "Difficult asthma"; blood eosinophilia was a significant feature of "Difficult," "Early-onset mild atopic," and "Late-onset asthma." Children with moderate-to-severe asthma were present in each cluster. An integrative approach of blending the data with clinical expert domain knowledge identified four features, which may be informative for ascertaining asthma endotypes. These findings suggest that variables which are key determinants of asthma presence, severity, or control may not be the most informative for determining asthma subtypes. Our results indicate that exacerbation-prone asthma may be a separate asthma endotype and that severe asthma is not a single entity, but an extreme end of the spectrum of several different asthma endotypes. © 2017 The Authors. Clinical & Experimental Allergy published by John Wiley & Sons Ltd.
Structure and substructure analysis of DAFT/FADA galaxy clusters in the [0.4-0.9] redshift range
NASA Astrophysics Data System (ADS)
Guennou, L.; Adami, C.; Durret, F.; Lima Neto, G. B.; Ulmer, M. P.; Clowe, D.; LeBrun, V.; Martinet, N.; Allam, S.; Annis, J.; Basa, S.; Benoist, C.; Biviano, A.; Cappi, A.; Cypriano, E. S.; Gavazzi, R.; Halliday, C.; Ilbert, O.; Jullo, E.; Just, D.; Limousin, M.; Márquez, I.; Mazure, A.; Murphy, K. J.; Plana, H.; Rostagni, F.; Russeil, D.; Schirmer, M.; Slezak, E.; Tucker, D.; Zaritsky, D.; Ziegler, B.
2014-01-01
Context: The DAFT/FADA survey is based on the study of ~90 rich (masses found in the literature >2 × 10^14 M⊙) and moderately distant clusters (redshifts 0.4 < z < 0.9), all with HST imaging data available. This survey has two main objectives: to constrain dark energy (DE) using weak lensing tomography on galaxy clusters and to build a database (deep multi-band imaging allowing photometric redshift estimates, spectroscopic data, X-ray data) of rich distant clusters to study their properties. Aims: We analyse the structures of all the clusters in the DAFT/FADA survey for which XMM-Newton and/or a sufficient number of galaxy redshifts in the cluster range are available, with the aim of detecting substructures and evidence for merging events. These properties are discussed in the framework of standard cold dark matter (ΛCDM) cosmology. Methods: In X-rays, we analysed the XMM-Newton data available, fit a β-model, and subtracted it to identify residuals. We used Chandra data, when available, to identify point sources. In the optical, we applied a Serna & Gerbal (SG) analysis to clusters with at least 15 spectroscopic galaxy redshifts available in the cluster range. We discuss the substructure detection efficiencies of both methods. Results: XMM-Newton data were available for 32 clusters, for which we derive the X-ray luminosity and a global X-ray temperature for 25 of them. For 23 clusters we were able to fit the X-ray emissivity with a β-model and subtract it to detect substructures in the X-ray gas. A dynamical analysis based on the SG method was applied to the clusters having at least 15 spectroscopic galaxy redshifts in the cluster range: 18 X-ray clusters and 11 clusters with no X-ray data. The choice of a minimum number of 15 redshifts implies that only major substructures will be detected. Ten substructures were detected both in X-rays and by the SG method. 
Most of the substructures detected both in X-rays and with the SG method are probably at their first cluster pericentre approach and are relatively recent infalls. We also find hints of a decreasing X-ray gas density profile core radius with redshift. Conclusions: The percentage of mass included in substructures was found to be roughly constant with redshift values of 5-15%, in agreement both with the general CDM framework and with the results of numerical simulations. Galaxies in substructures show the same general behaviour as regular cluster galaxies; however, in substructures, there is a deficiency of both late type and old stellar population galaxies. Late type galaxies with recent bursts of star formation seem to be missing in the substructures close to the bottom of the host cluster potential well. However, our sample would need to be increased to allow a more robust analysis. Tables 1, 2, 4 and Appendices A-C are available in electronic form at http://www.aanda.org
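The β-model subtraction mentioned in the Methods can be written out as follows. This is the standard isothermal β-model form for X-ray surface brightness, given here as an illustrative sketch; the abstract does not state the exact parametrisation used, and the symbols $S_0$, $r_c$, $B$ and $R$ are notation assumed for this note.

```latex
% Assumed notation: S_0 = central surface brightness, r_c = core radius,
% beta = slope parameter, B = constant background level.
\begin{align}
  S_{\mathrm{model}}(r) &= S_0 \left[ 1 + \left( \frac{r}{r_c} \right)^{2} \right]^{-3\beta + \frac{1}{2}} + B, \\
  R(x, y) &= S_{\mathrm{obs}}(x, y) - S_{\mathrm{model}}\bigl(r(x, y)\bigr).
\end{align}
```

Under this reading, significant positive residuals $R$ are the candidate substructures, and $r_c$ is the core radius for which the abstract reports hints of a decrease with redshift.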
Characterizing cognitive heterogeneity on the schizophrenia-bipolar disorder spectrum.
Van Rheenen, T E; Lewandowski, K E; Tan, E J; Ospina, L H; Ongur, D; Neill, E; Gurvich, C; Pantelis, C; Malhotra, A K; Rossell, S L; Burdick, K E
2017-07-01
Current group-average analysis suggests quantitative but not qualitative cognitive differences between schizophrenia (SZ) and bipolar disorder (BD). There is increasing recognition that cognitive within-group heterogeneity exists in both disorders, but it remains unclear as to whether between-group comparisons of performance in cognitive subgroups emerging from within each of these nosological categories uphold group-average findings. We addressed this by identifying cognitive subgroups in large samples of SZ and BD patients independently, and comparing their cognitive profiles. The utility of a cross-diagnostic clustering approach to understanding cognitive heterogeneity in these patients was also explored. Hierarchical clustering analyses were conducted using cognitive data from 1541 participants (SZ n = 564, BD n = 402, healthy control n = 575). Three qualitatively and quantitatively similar clusters emerged within each clinical group: a severely impaired cluster, a mild-moderately impaired cluster and a relatively intact cognitive cluster. A cross-diagnostic clustering solution also resulted in three subgroups and was superior in reducing cognitive heterogeneity compared with disorder clustering independently. Quantitative SZ-BD cognitive differences commonly seen using group averages did not hold when cognitive heterogeneity was factored into our sample. Members of each corresponding subgroup, irrespective of diagnosis, might be manifesting the outcome of differences in shared cognitive risk factors.
Boxman, Ingeborg L A; Verhoef, Linda; Vennema, Harry; Ngui, Siew-Lin; Friesema, Ingrid H M; Whiteside, Chris; Lees, David; Koopmans, Marion
2016-01-01
This report describes an outbreak investigation starting with two closely related suspected food-borne clusters of Dutch hepatitis A cases, nine primary cases in total, with an unknown source in the Netherlands. The hepatitis A virus (HAV) genotype IA sequences of both clusters were highly similar (459/460 nt) and were not reported earlier. Food questionnaires and a case-control study revealed an association with consumption of mussels. Analysis of mussel supply chains identified the most likely production area. International enquiries led to identification of a cluster of patients near this production area with identical HAV sequences with onsets predating the first Dutch cluster of cases. The most likely source for this cluster was a case who returned from an endemic area in Central America, and a subsequent household cluster from which treated domestic sewage was discharged into the suspected mussel production area. Notably, mussels from this area were also consumed by a separate case in the United Kingdom sharing an identical strain with the second Dutch cluster. In conclusion, a small number of patients in a non-endemic area led to geographically dispersed hepatitis A outbreaks with food as vehicle. This link would have gone unnoticed without sequence analyses and international collaboration.
Career paths in physicians' postgraduate training - an eight-year follow-up study.
Buddeberg-Fischer, Barbara; Stamm, Martina; Klaghofer, Richard
2010-10-06
To date, there are hardly any studies on the choice of career path in medical school graduates. The present study aimed to investigate what career paths can be identified in the course of postgraduate training of physicians; what factors have an influence on the choice of a career path; and in what way the career paths are correlated with career-related factors as well as with work-life balance aspirations. The data reported originates from five questionnaire surveys of the prospective SwissMedCareer Study, beginning in 2001 (T1, last year of medical school). The study sample consisted of 358 physicians (197 females, 55%; 161 males, 45%) participating at each assessment from T2 (2003, first year of residency) to T5 (2009, seventh year of residency), answering the question: What career do you aspire to have? Furthermore, personal characteristics, chosen specialty, career motivation, mentoring experience, work-life balance as well as workload, career success and career satisfaction were assessed. Career paths were analysed with cluster analysis, and differences between clusters analysed with multivariate methods. The cluster analysis revealed four career clusters which discriminated distinctly between each other: (1) career in practice, (2) hospital career, (3) academic career, and (4) changing career goal. From T3 (third year of residency) to T5, respondents in Cluster 1-3 were rather stable in terms of their career path aspirations, while those assigned to Cluster 4 showed a high fluctuation in their career plans. Physicians in Cluster 1 showed high values in extraprofessional concerns and often consider part-time work. Cluster 2 and 3 were characterised by high instrumentality, intrinsic and extrinsic career motivation, career orientation and high career success. No cluster differences were seen in career satisfaction. In Cluster 1 and 4, females were overrepresented. 
Trainees should be supported to stay on the career path that best suits their personal and professional profile. Attention should be paid to the subgroup of physicians in Cluster 4 who switch from one career goal to another in the course of their postgraduate training.
An Empirical Typology of Perfectionism in Academically Talented Children.
ERIC Educational Resources Information Center
Parker, Wayne D.
1997-01-01
A national sample of 820 academically talented children took the Multidimensional Perfectionism Scale. Cluster analyses of scores found a three-cluster solution. Further analyses indicated that these clusters were: nonperfectionistic (32.8%), healthy perfectionistic (41.7%), and dysfunctional perfectionistic (25.5%). The construct of perfectionism…
Beverage consumption patterns of Canadian adults aged 19 to 65 years.
Nikpartow, Nooshin; Danyliw, Adrienne D; Whiting, Susan J; Lim, Hyun J; Vatanparast, Hassanali
2012-12-01
To investigate the beverage intake patterns of Canadian adults and explore characteristics of participants in different beverage clusters. Analyses of nationally representative data with cross-sectional complex stratified design. Canadian Community Health Survey, Cycle 2.2 (2004). A total of 14 277 participants aged 19-65 years, in whom dietary intake was assessed using a single 24 h recall, were included in the study. After determining total intake and the contribution of beverages to total energy intake among age/sex groups, cluster analysis (K-means method) was used to classify males and females into distinct clusters based on the dominant pattern of beverage intakes. To test differences across clusters, χ2 tests and 95 % confidence intervals of the mean intakes were used. Six beverage clusters in women and seven beverage clusters in men were identified. 'Sugar-sweetened' beverage clusters - regular soft drinks and fruit drinks - as well as a 'beer' cluster, appeared for both men and women. No 'milk' cluster appeared among women. The mean consumption of the dominant beverage in each cluster was higher among men than women. The 'soft drink' cluster in men had the lowest proportion of the higher levels of education, and in women the highest proportion of inactivity, compared with other beverage clusters. Patterns of beverage intake in Canadian women indicate high consumption of sugar-sweetened beverages particularly fruit drinks, low intake of milk and high intake of beer. These patterns in women have implications for poor bone health, risk of obesity and other morbidities.
Typology of people with first-episode psychosis.
Subramaniam, Mythily; Zheng, Huili; Soh, Pauline; Poon, Lye Yin; Vaingankar, Janhavi A; Chong, Siow Ann; Verma, Swapna
2016-08-01
The aim of the current study was to create a typology of patients with first-episode psychosis based on sociodemographic and clinical characteristics, service use and outcomes using cluster analysis. Data from all respondents who were accepted into the Early Psychosis Intervention Programme (EPIP), Singapore from 2007 to 2011 were analysed. A two-step clustering method was carried out to classify the patients into distinct clusters. Two clusters were identified. Cluster 1 consisted largely of younger people with a mean age of 25.5 (6.0) years at treatment contact, who were predominantly male (55.3%), single (98.3%) and living with parents (86.3%). Cluster 1 had a higher proportion of people diagnosed with a schizophrenia spectrum disorder (71.4%) and with a positive family history of psychiatric illness. Patients in cluster 2 were generally older, with a mean age of 33.6 (4.7) years, and the majority were women (74.2%). People in cluster 1 had higher Positive and Negative Syndrome Scale (PANSS) scores at baseline than those in cluster 2. After a 1-year follow-up, their scores were still poorer than those of their counterparts in cluster 2, especially for the PANSS negative score. The functioning level of people in cluster 1 showed less improvement than that of people in cluster 2 after a year of treatment. There is a compelling need to develop new therapies and intensively treat young people presenting with psychosis, as this group tends to have poorer outcomes even after 1 year of treatment. © 2014 Wiley Publishing Asia Pty Ltd.
A systematic review of occupational health and safety interventions with economic analyses.
Tompa, Emile; Dolinschi, Roman; de Oliveira, Claire; Irvin, Emma
2009-09-01
We reviewed the occupational health and safety intervention literature to synthesize evidence on financial merits of such interventions. A literature search included journal databases, existing systematic reviews, and studies identified by content experts. Studies meeting inclusion criteria were assessed for quality. Evidence was synthesized within industry-intervention type clusters. We found strong evidence that ergonomic and other musculoskeletal injury prevention interventions in manufacturing and warehousing are worth undertaking in terms of their financial merits. We also found strong evidence that multisector disability management interventions are worth undertaking. While the economic evaluation of interventions in this literature warrants further expansion, we found a sufficient number of studies to identify strong, moderate, and limited evidence in certain industry-intervention clusters. The review also provided insights into how the methodological quality of economic evaluations in this literature could be improved.
Advanced Treatment Monitoring for Olympic-Level Athletes Using Unsupervised Modeling Techniques
Siedlik, Jacob A.; Bergeron, Charles; Cooper, Michael; Emmons, Russell; Moreau, William; Nabhan, Dustin; Gallagher, Philip; Vardiman, John P.
2016-01-01
Context Analysis of injury and illness data collected at large international competitions provides the US Olympic Committee and the national governing bodies for each sport with information to best prepare for future competitions. Research in which authors have evaluated medical contacts to provide the expected level of medical care and sports medicine services at international competitions is limited. Objective To analyze the medical-contact data for athletes, staff, and coaches who participated in the 2011 Pan American Games in Guadalajara, Mexico, using unsupervised modeling techniques to identify underlying treatment patterns. Design Descriptive epidemiology study. Setting Pan American Games. Patients or Other Participants A total of 618 US athletes (337 males, 281 females) participated in the 2011 Pan American Games. Main Outcome Measure(s) Medical data were recorded from the injury-evaluation and injury-treatment forms used by clinicians assigned to the central US Olympic Committee Sport Medicine Clinic and satellite locations during the operational 17-day period of the 2011 Pan American Games. We used principal components analysis and agglomerative clustering algorithms to identify and define grouped modalities. Lift statistics were calculated for within-cluster subgroups. Results Principal component analyses identified 3 components, accounting for 72.3% of the variability in datasets. Plots of the principal components showed that individual contacts focused on 4 treatment clusters: massage, paired manipulation and mobilization, soft tissue therapy, and general medical. Conclusions Unsupervised modeling techniques were useful for visualizing complex treatment data and provided insights for improved treatment modeling in athletes. Given its ability to detect clinically relevant treatment pairings in large datasets, unsupervised modeling should be considered a feasible option for future analyses of medical-contact data from international competitions. 
PMID:26794628
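The lift statistic mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual association-rule definition, lift(A, B) = P(A and B) / (P(A) · P(B)); the abstract does not spell out the formula the authors used, and the function name and the toy contact records below are hypothetical.

```python
def lift(contacts, a, b):
    """Lift of treatments a and b across medical contacts.

    `contacts` is a list of sets, each holding the treatment labels
    recorded at one contact (hypothetical data layout). Assumes the
    association-rule definition P(a and b) / (P(a) * P(b)).
    """
    n = len(contacts)
    if n == 0:
        raise ValueError("no contacts")
    p_a = sum(a in c for c in contacts) / n
    p_b = sum(b in c for c in contacts) / n
    p_ab = sum(a in c and b in c for c in contacts) / n
    if p_a == 0 or p_b == 0:
        raise ValueError("each treatment must occur at least once")
    return p_ab / (p_a * p_b)


# Invented toy data: two contacts pair manipulation with mobilization,
# two contacts record massage alone.
toy_contacts = [
    {"manipulation", "mobilization"},
    {"manipulation", "mobilization"},
    {"massage"},
    {"massage"},
]
paired = lift(toy_contacts, "manipulation", "mobilization")
unpaired = lift(toy_contacts, "massage", "manipulation")
```

A lift above 1 indicates that two treatments co-occur more often than independence would predict, which is the kind of pairing the abstract describes as "paired manipulation and mobilization".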
Azeredo, Catarina Machado; Levy, Renata Bertazzi; Peres, Maria Fernanda Tourinho; Menezes, Paulo Rossi; Araya, Ricardo
2016-01-01
Objectives The aim of this study was to analyse the clustering of multiple health-related behaviours among adolescents and describe which socio-demographic characteristics are associated with these patterns. Design Cross-sectional study. Setting Brazilian schools assessed by the National Survey of School Health (PeNSE, 2012). Participants 104 109 Brazilian ninth-grade students from public and private schools (response rate=82.7%). Methods Exploratory and confirmatory factor analyses were performed to identify behaviour clustering and linear regression models were used to identify socio-demographic characteristics associated with each one of these behaviour patterns. Results We identified a good fit model with three behaviour patterns. The first was labelled ‘problem-behaviour’ and included aggressive behaviour, alcohol consumption, smoking, drug use and unsafe sex; the second was labelled ‘health-compromising diet and sedentary behaviours’ and included unhealthy food indicators and sedentary behaviour; and the third was labelled ‘health-promoting diet and physical activity’ and included healthy food indicators and physical activity. No differences in behaviour patterns were found between genders. The problem-behaviour pattern was associated with male gender, older age, more developed region (socially and economically) and public schools (compared with private). The ‘health-compromising diet and sedentary behaviours’ pattern was associated with female gender, older age, mothers with higher education level and more developed region. The ‘health-promoting diet and physical activity’ pattern was associated with male gender and mothers with higher education level. Conclusions Three health-related behaviour patterns were found among Brazilian adolescents. Interventions to decrease those negative patterns should take into account how these behaviours cluster together and the individuals most at risk. PMID:28186927
Climer, Sharlee; Yang, Wei; de las Fuentes, Lisa; Dávila-Román, Victor G; Gu, C Charles
2014-11-01
Complex diseases are often associated with sets of multiple interacting genetic factors and possibly with unique sets of the genetic factors in different groups of individuals (genetic heterogeneity). We introduce a novel concept of custom correlation coefficient (CCC) between single nucleotide polymorphisms (SNPs) that addresses genetic heterogeneity by measuring subset correlations autonomously. It is used to develop a 3-step process to identify candidate multi-SNP patterns: (1) pairwise (SNP-SNP) correlations are computed using CCC; (2) clusters of so-correlated SNPs are identified; and (3) frequencies of these clusters in disease cases and controls are compared to identify disease-associated multi-SNP patterns. This method identified 42 candidate multi-SNP associations with hypertensive heart disease (HHD), among which one cluster of 22 SNPs (six genes) included 13 in SLC8A1 (aka NCX1, an essential component of cardiac excitation-contraction coupling) and another of 32 SNPs had 29 from a different segment of SLC8A1. While allele frequencies show little difference between cases and controls, the cluster of 22 associated alleles was found in 20% of controls but no cases and the other in 3% of controls but 20% of cases. These suggest that both protective and risk effects on HHD could be exerted by combinations of variants in different regions of SLC8A1, modified by variants from other genes. The results demonstrate that this new correlation metric identifies disease-associated multi-SNP patterns overlooked by commonly used correlation measures. Furthermore, computation time using CCC is a small fraction of that required by other methods, thereby enabling the analyses of large GWAS datasets. © 2014 WILEY PERIODICALS, INC.
Climer, Sharlee; Yang, Wei; de las Fuentes, Lisa; Dávila-Román, Victor G.; Gu, C. Charles
2014-01-01
Complex diseases are often associated with sets of multiple interacting genetic factors and possibly with unique sets of the genetic factors in different groups of individuals (genetic heterogeneity). We introduce a novel concept of Custom Correlation Coefficient (CCC) between single nucleotide polymorphisms (SNPs) that addresses genetic heterogeneity by measuring subset correlations autonomously. It is used to develop a 3-step process to identify candidate multi-SNP patterns: (1) pairwise (SNP-SNP) correlations are computed using CCC; (2) clusters of so-correlated SNPs are identified; and (3) frequencies of these clusters in disease cases and controls are compared to identify disease-associated multi-SNP patterns. This method identified 42 candidate multi-SNP associations with hypertensive heart disease (HHD), among which one cluster of 22 SNPs (6 genes) included 13 in SLC8A1 (aka NCX1, an essential component of cardiac excitation-contraction coupling) and another of 32 SNPs had 29 from a different segment of SLC8A1. While allele frequencies show little difference between cases and controls, the cluster of 22 associated alleles was found in 20% of controls but no cases and the other in 3% of controls but 20% of cases. These suggest that both protective and risk effects on HHD could be exerted by combinations of variants in different regions of SLC8A1, modified by variants from other genes. The results demonstrate that this new correlation metric identifies disease-associated multi-SNP patterns overlooked by commonly used correlation measures. Furthermore, computation time using CCC is a small fraction of that required by other methods, thereby enabling the analyses of large GWAS datasets. PMID:25168954
Zaccaron, Alex Z; Woloshuk, Charles P; Bluhm, Burton H
2017-11-01
Stenocarpella maydis is a plant pathogenic fungus that causes Diplodia ear rot, one of the most destructive diseases of maize. To date, little information is available regarding the molecular basis of pathogenesis in this organism, in part due to limited genomic resources. In this study, a 54.8 Mb draft genome assembly of S. maydis was obtained with Illumina and PacBio sequencing technologies, and analyzed. Comparative genomic analyses with the predominant maize ear rot pathogens Aspergillus flavus, Fusarium verticillioides, and Fusarium graminearum revealed an expanded set of carbohydrate-active enzymes for cellulose and hemicellulose degradation in S. maydis. Analyses of predicted genes involved in starch degradation revealed six putative α-amylases, four extracellular and two intracellular, and two putative γ-amylases, one of which appears to have been acquired from bacteria via horizontal transfer. Additionally, 87 backbone genes involved in secondary metabolism were identified, which represents one of the largest known assemblages among Pezizomycotina species. Numerous secondary metabolite gene clusters were identified, including two clusters likely involved in the biosynthesis of diplodiatoxin and chaetoglobosins. The draft genome of S. maydis presented here will serve as a useful resource for molecular genetics, functional genomics, and analyses of population diversity in this organism. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Diekmann, Nina; Burghartz, Melanie; Remus, Lars; Kaufholz, Anna-Lena; Nawrath, Thorben; Rohde, Manfred; Schulz, Stefan; Roselius, Louisa; Schaper, Jörg; Mamber, Oliver; Jahn, Dieter; Jahn, Martina
2013-10-01
During operation of mobile air conditioning (MAC) systems in automobiles, malodours can occur. We studied the microbial communities found on contaminated heat exchanger fins of 45 evaporators from car MAC systems which were operated in seven different regions of the world and identified corresponding volatile organic compounds. Collected biofilms were examined by scanning electron microscopy and fluorescent in situ hybridization. The detected bacteria were loosely attached to the metal surface. Further analyses of the bacteria using PCR-based single-strand conformation polymorphism and sequencing of isolated 16S rRNA gene fragments identified highly divergent microbial communities with multiple members of the Alphaproteobacteriales; Methylobacteria were the prevalent bacteria. In addition, Sphingomonadales, Burkholderiales, Bacillales, Alcanivorax spp. and Stenotrophomonas spp. were found among many others, depending on the location where the evaporators were operated. Interestingly, typical pathogenic bacteria related to air conditioning systems, including Legionella spp., were not found. In order to determine the nature of the chemical compounds produced by the bacteria, the volatile organic compounds were examined by closed loop stripping analysis and identified by combined gas chromatography/mass spectrometry. Sulphur compounds, i.e. di-, tri- and multiple sulphides, acetylthiazole, aromatic compounds and diverse substituted pyrazines were detected. Mathematical clustering of the determined microbial community structures against their origin identified a European/American/Arabic cluster versus two mainly tropical Asian clusters. Interestingly, clustering of the determined volatiles against the origin of the corresponding MAC revealed a highly similar pattern. A close relationship of microbial community structure and resulting malodours to the climate and air quality at the location of MAC operation was concluded.
Soule, Eric K; Maloney, Sarah F; Guy, Mignonne C; Eissenberg, Thomas; Fagan, Pebbles
2018-04-01
There is limited evidence on how cigarette smokers use electronic cigarettes (ECIGs) for smoking cessation and reduction. This study used concept mapping, a participatory mixed-methods research approach, to identify ECIG use behaviors and device characteristics perceived to be associated with cigarette smoking cessation or reduction. Current ECIG users aged 18-64 were recruited from seven cities selected randomly from U.S. census tract regions. Participants were invited to complete concept mapping tasks: brainstorming, sorting and rating (n=72). During brainstorming, participants generated statements in response to a focus prompt ("A SPECIFIC WAY I HAVE USED electronic cigarettes to reduce my cigarette smoking or a SPECIFIC WAY electronic cigarettes help me reduce my cigarette smoking is…") and then sorted and rated the statements. Multidimensional scaling and hierarchical cluster analyses were used to generate a cluster map that was interpreted by the research team. Eight thematic clusters were identified: Convenience, Perceived Health Effects, Ease of Use, Versatility and Variety, Advantages of ECIGs over Cigarettes, Cigarette Substitutability, Reducing Harms to Self and Others, and Social Benefits. Participants generated several statements that related to specific behavioral strategies used when using ECIGs for smoking reduction/complete switching behaviors such as making rapid transitions from smoking to ECIG use or using certain ECIG liquids or devices. Former smokers rated the Perceived Health Effects cluster and several behavioral strategy statements higher than current smokers. These results help to identify ECIG use behaviors and characteristics perceived by ECIG users to aid in cigarette smoking cessation or reduction. Copyright © 2017 Elsevier Ltd. All rights reserved.
"Gap hunting" to characterize clustered probe signals in Illumina methylation array data.
Andrews, Shan V; Ladd-Acosta, Christine; Feinberg, Andrew P; Hansen, Kasper D; Fallin, M Daniele
2016-01-01
The Illumina 450k array has been widely used in epigenetic association studies. Current quality-control (QC) pipelines typically remove certain sets of probes, such as those containing a SNP or with multiple mapping locations. An additional set of potentially problematic probes are those with DNA methylation distributions characterized by two or more distinct clusters separated by gaps. Data-driven identification of such probes may offer additional insights for downstream analyses. We developed a procedure, termed "gap hunting," to identify probes showing clustered distributions. Among 590 peripheral blood samples from the Study to Explore Early Development, we identified 11,007 "gap probes." The vast majority (9199) are likely attributable to an underlying SNP(s) or other variant in the probe, although SNP-affected probes exist that do not produce gap signals. Specific factors predict which SNPs lead to gap signals, including type of nucleotide change, probe type, DNA strand, and overall methylation state. These expected effects are demonstrated in paired genotype and 450k data on the same samples. Gap probes can also serve as a surrogate for the local genetic sequence on a haplotype scale and can be used to adjust for population stratification. The characteristics of gap probes reflect potentially informative biology. QC pipelines may benefit from an efficient data-driven approach that "flags" gap probes, rather than filtering such probes, followed by careful interpretation of downstream association analyses. Our results should translate directly to the recently released Illumina EPIC array given the similar chemistry and content design.
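The core idea of flagging a probe whose methylation values fall into distinct groups separated by gaps can be sketched as below. This is a minimal illustration only: the threshold value, the function names and the toy values are assumptions, and the published procedure may apply further criteria that the abstract does not describe.

```python
def find_gap_groups(beta_values, gap_threshold=0.05):
    """Split one probe's methylation values into groups separated by gaps.

    Values are sorted, and a new group starts wherever two neighbouring
    values differ by more than `gap_threshold` (an assumed cutoff).
    """
    ordered = sorted(beta_values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > gap_threshold:
            groups.append([current])       # gap found: open a new group
        else:
            groups[-1].append(current)     # still inside the same group
    return groups


def is_gap_probe(beta_values, gap_threshold=0.05):
    """Flag a probe whose values form two or more separated groups."""
    return len(find_gap_groups(beta_values, gap_threshold)) >= 2


# Invented toy values for two probes across six samples.
clustered_probe = [0.10, 0.12, 0.11, 0.50, 0.52, 0.90]
smooth_probe = [0.40, 0.42, 0.44, 0.46, 0.48, 0.50]
clustered_groups = find_gap_groups(clustered_probe)
```

In line with the abstract's recommendation, a pipeline built on such a check would flag the probe for careful interpretation rather than drop it.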
Wang, Xihua; Zhang, Guangxin; Xu, Y Jun; Sun, Guangzhi
2015-11-01
Assessment of the interaction between groundwater and surface water (GW-SW) can generate information that is critical to regional water resource management, especially for regions that are highly dependent on groundwater resources for irrigation. This study investigated such interaction on China's Sanjiang Plain (10.9 × 10^4 km^2) and produced results to assist sustainable regional water management for intensive agricultural activities. Methods of hierarchical cluster analysis (HCA), principal component analysis (PCA), and statistical analysis were used in this study. One hundred two water samples (60 from shallow groundwater, 7 from deep groundwater, and 35 from surface water) were collected and grouped into three clusters and seven sub-clusters during the analyses. The PCA identified four principal components of the interaction, which explained 85.9% of the total variance in the data set and were attributed to the dissolution and evolution of gypsum, feldspar, and other natural minerals in the region, which was affected by anthropic and geological (sedimentary rock mineral) activities. The analyses showed that surface water in the upper region of the Sanjiang Plain gained water from local shallow groundwater, indicating that the surface water in the upper region was relatively more resilient to withdrawal for usage, whereas in the middle region, there was only a weak interaction between shallow groundwater and surface water. In the lower region of the Sanjiang Plain, surface water lost water to shallow groundwater, indicating that the groundwater was vulnerable to pollution by pesticides and fertilizers from terrestrial sources.
Campa, Ana; Trabanco, Noemí; Ferreira, Juan José
2017-12-01
The correct identification of the anthracnose resistance systems present in the common bean cultivars AB136 and MDRK is important because both are included in the set of 12 differential cultivars proposed for use in classifying the races of the anthracnose causal agent, Colletotrichum lindemuthianum. In this work, the responses against seven C. lindemuthianum races were analyzed in a recombinant inbred line population derived from the cross AB136 × MDRK. A genetic linkage map of 100 molecular markers distributed across the 11 bean chromosomes was developed in this population to locate the gene or genes conferring resistance against each race, based on linkage analyses and χ² tests of independence. The identified anthracnose resistance genes were organized in clusters. Two clusters were found in AB136: one located on linkage group Pv07, which corresponds to the anthracnose resistance cluster Co-5, and the other located at the end of linkage group Pv11, which corresponds to the Co-2 cluster. The presence of resistance genes at the Co-5 cluster in AB136 was validated through an allelism test conducted in the F2 population TU × AB136. The presence of resistance genes at the Co-2 cluster in AB136 was validated through genetic dissection using the F2:3 population ABM3 × MDRK, in which it was directly mapped to a genomic position between 46.01 and 47.77 Mb of chromosome Pv11. In MDRK, two independent clusters were identified: one located on linkage group Pv01, corresponding to the Co-1 cluster, and the second located on linkage group Pv04, corresponding to the Co-3 cluster. This report enhances the understanding of the race-specific Phaseolus vulgaris-C. lindemuthianum interactions and will be useful in breeding programs.
Jentsch, Franziska; Allen, Jennifer; Fuchs, Judith; von der Lippe, Elena
2017-04-04
Modifiable health risk factors (MHRFs) significantly affect morbidity and mortality rates and frequently occur in specific combinations or risk clusters. Using five MHRFs (smoking, high-risk alcohol consumption, physical inactivity, low intake of fruits and vegetables, and obesity) this study investigates the extent to which risk clusters are observed in a representative sample of women aged 65 and older in Germany. Additionally, the structural composition of the clusters is systematically compared with data and findings from other countries. A pooled data set of Germany's representative cross-sectional surveys GEDA09 and GEDA10 was used. The cohort comprised 4,617 women aged 65 and older. Specific risk clusters based on five MHRFs are identified, using hierarchical cluster analysis. The MHRFs were defined as current smoking (daily or occasionally), risk alcohol consumption (according to the Alcohol Use Disorders Identification Test, a sum score of 4 or more points), physical inactivity (less active than 5 days per week for at least 30 min and lack of sports-related activity in the last three months), low intake of fruits and vegetables (less than one serving of fruits and one of vegetables per day), and obesity (a body mass index equal to or greater than 30). A total of 4,292 cases with full information on these factors are included in the cluster analysis. Extended analyses were also performed to include the number of chronic diseases by age and socioeconomic status of group members. A total of seven risk clusters were identified. In a comparison with data from international studies, the seven risk clusters were found to be stable with a high degree of structural equivalency. Evidence of the stability of risk clusters across various study populations provides a useful starting point for long-term targeted health interventions. The structural clusters provide information through which various MHRFs can be evaluated simultaneously.
Clustering of dietary intake and sedentary behavior in 2-year-old children.
Gubbels, Jessica S; Kremers, Stef P J; Stafleu, Annette; Dagnelie, Pieter C; de Vries, Sanne I; de Vries, Nanne K; Thijs, Carel
2009-08-01
To examine clustering of energy balance-related behaviors (EBRBs) in young children. This is crucial because lifestyle habits are formed at an early age and track in later life. This study is the first to examine EBRB clustering in children as young as 2 years. Cross-sectional data originated from the Child, Parent and Health: Lifestyle and Genetic Constitution (KOALA) Birth Cohort Study. Parents of 2578 2-year-old children completed a questionnaire. Correlation analyses, principal component analyses, and linear regression analyses were performed to examine clustering of EBRBs. We found modest but consistent correlations in EBRBs. Two clusters emerged: a "sedentary-snacking cluster" and a "fiber cluster." Television viewing clustered with computer use and unhealthy dietary behaviors. Children who frequently consumed vegetables also consumed fruit and brown bread more often and white bread less often. Lower maternal education and maternal obesity were associated with high scores on the sedentary-snacking cluster, whereas higher educational level was associated with high fiber cluster scores. Obesity-prone behavioral clusters are already visible in 2-year-old children and are related to maternal characteristics. The findings suggest that obesity prevention should apply an integrated approach to physical activity and dietary intake in early childhood.
Definition and characterization of an extended social-affective default network.
Amft, Maren; Bzdok, Danilo; Laird, Angela R; Fox, Peter T; Schilbach, Leonhard; Eickhoff, Simon B
2015-03-01
Recent evidence suggests considerable overlap between the default mode network (DMN) and regions involved in social, affective and introspective processes. We considered these overlapping regions as the social-affective part of the DMN. In this study, we established a robust mapping of the underlying brain network formed by these regions and those strongly connected to them (the extended social-affective default network). We first seeded meta-analytic connectivity modeling and resting-state analyses in the meta-analytically defined DMN regions that showed statistical overlap with regions associated with social and affective processing. Consensus connectivity of each seed was subsequently delineated by a conjunction across both connectivity analyses. We then functionally characterized the ensuing regions and performed several cluster analyses. Among the identified regions, the amygdala/hippocampus formed a cluster associated with emotional processes and memory functions. The ventral striatum, anterior cingulum, subgenual cingulum and ventromedial prefrontal cortex formed a heterogeneous subgroup associated with motivation, reward and cognitive modulation of affect. Posterior cingulum/precuneus and dorsomedial prefrontal cortex were associated with mentalizing, self-reference and autobiographic information. The cluster formed by the temporo-parietal junction and anterior middle temporal sulcus/gyrus was associated with language and social cognition. Taken together, the current work highlights a robustly interconnected network that may be central to introspective, socio-affective, that is, self- and other-related mental processes.
Dynamic analysis and pattern visualization of forest fires.
Lopes, António M; Tenreiro Machado, J A
2014-01-01
This paper analyses forest fires from the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information on events in Portugal during the period from 1980 to 2012. The data are analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.
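The mutual-information step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each annual series has already been discretized into a small number of symbols (for example burnt-area classes), and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equally long symbol sequences.

    Assumption: x and y are already discretized (e.g. burnt-area classes per
    time bin); the abstract does not state how the series were binned.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        p_a = count_x[a] / n
        p_b = count_y[b] / n
        # Each observed pair contributes p(a,b) * log2(p(a,b) / (p(a) p(b))).
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Two hypothetical discretized annual patterns (0 = quiet bin, 1 = active bin).
year_a = [0, 1, 0, 1]
year_b = [0, 1, 0, 1]
print(mutual_information(year_a, year_b))
```

A matrix of such pairwise values, converted to dissimilarities, is the kind of input that hierarchical clustering or MDS would then take.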
Dynamic Analysis and Pattern Visualization of Forest Fires
Lopes, António M.; Tenreiro Machado, J. A.
2014-01-01
This paper analyses forest fires from the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information on events in Portugal during the period from 1980 to 2012. The data are analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns. PMID:25137393
Alessandri, Guido; Vecchione, Michele; Donnellan, Brent M; Eisenberg, Nancy; Caprara, Gian Vittorio; Cieciuch, Jan
2014-08-01
Personality types reflect typical configurations of personality attributes within individuals. Over the last 20 years, researchers have identified a set of three replicable personality types: resilient (R), undercontrolled (U), and overcontrolled (O) types. In this study, we examined the cross-cultural replicability of the RUO types in Italy, Poland, Spain, and the United States. Personality types were identified using cluster analyses of Big Five profiles in large samples of college students from Italy (n = 322), the United States (n = 499), Spain (n = 420), and Poland (n = 235). Prior to clustering the profiles, the measurement invariance of the Big Five measure across samples was tested. We found evidence for the RUO types in all four samples. The three-cluster solution showed a better fit over alternative solutions and had a relatively high degree of cross-cultural generalizability. The RUO types are evident in samples from four countries with distinct linguistic and cultural traditions. Results were discussed in light of the importance of considering how traits are organized within individuals for advancing contemporary personality psychology. © 2013 Wiley Periodicals, Inc.
Spike sorting based upon machine learning algorithms (SOMA).
Horton, P M; Nicol, A U; Kendrick, K M; Feng, J F
2007-02-15
We have developed a spike sorting method, using a combination of various machine learning algorithms, to analyse electrophysiological data and automatically determine the number of sampled neurons from an individual electrode, and discriminate their activities. We discuss extensions to a standard unsupervised learning algorithm (Kohonen), as using a simple application of this technique would only identify a known number of clusters. Our extra techniques automatically identify the number of clusters within the dataset, and their sizes, thereby reducing the chance of misclassification. We also discuss a new pre-processing technique, which transforms the data into a higher dimensional feature space revealing separable clusters. Using principal component analysis (PCA) alone may not achieve this. Our new approach appends the features acquired using PCA with features describing the geometric shapes that constitute a spike waveform. To validate our new spike sorting approach, we have applied it to multi-electrode array datasets acquired from the rat olfactory bulb and from the sheep infero-temporal cortex, and to simulated data. The SOMA software is available at http://www.sussex.ac.uk/Users/pmh20/spikes.
Warden, Craig R
2008-01-01
Background: With limited resources available, injury prevention efforts need to be targeted both geographically and to specific populations. As part of a pediatric injury prevention project, data was obtained on all pediatric medical and injury incidents in a fire district to evaluate geographical clustering of pediatric injuries. This will be the first step in attempting to prevent these injuries with specific interventions depending on locations and mechanisms. Results: There were a total of 4803 incidents involving patients less than 15 years of age that the fire district responded to during 2001–2005 of which 1997 were categorized as injuries and 2806 as medical calls. The two cohorts (injured versus medical) differed in age distribution (7.7 ± 4.4 years versus 5.4 ± 4.8 years, p < 0.001) and location type of incident (school or church 12% versus 15%, multifamily residence 22% versus 13%, single family residence 51% versus 28%, sport, park or recreational facility 3% versus 8%, public building 8% versus 7%, and street or road 3% versus 30%, respectively, p < 0.001). Using the medical incident locations as controls, there was no significant clustering for environmental or assault injuries using the Bernoulli method while there were four significant clusters for all injury mechanisms combined, 13 clusters for motor vehicle collisions, one for falls, and two for pedestrian or bicycle injuries. Using the Poisson cluster method on incidence rates by census tract identified four clusters for all injuries, three for motor vehicle collisions, four for fall injuries, and one each for environmental and assault injuries. The two detection methods shared a minority of overlapping geographical clusters. Conclusion: Significant clustering occurs overall for all injury mechanisms combined and for each mechanism depending on the cluster detection method used. There was some overlap in geographic clusters identified by both methods.
The Bernoulli method allows more focused cluster mapping and evaluation since it directly uses location data. Once clusters are found, interventions can be targeted to specific geographic locations, location types, ages of victims, and mechanisms of injury. PMID:18808720
Warden, Craig R
2008-09-22
With limited resources available, injury prevention efforts need to be targeted both geographically and to specific populations. As part of a pediatric injury prevention project, data was obtained on all pediatric medical and injury incidents in a fire district to evaluate geographical clustering of pediatric injuries. This will be the first step in attempting to prevent these injuries with specific interventions depending on locations and mechanisms. There were a total of 4803 incidents involving patients less than 15 years of age that the fire district responded to during 2001-2005 of which 1997 were categorized as injuries and 2806 as medical calls. The two cohorts (injured versus medical) differed in age distribution (7.7 +/- 4.4 years versus 5.4 +/- 4.8 years, p < 0.001) and location type of incident (school or church 12% versus 15%, multifamily residence 22% versus 13%, single family residence 51% versus 28%, sport, park or recreational facility 3% versus 8%, public building 8% versus 7%, and street or road 3% versus 30%, respectively, p < 0.001). Using the medical incident locations as controls, there was no significant clustering for environmental or assault injuries using the Bernoulli method while there were four significant clusters for all injury mechanisms combined, 13 clusters for motor vehicle collisions, one for falls, and two for pedestrian or bicycle injuries. Using the Poisson cluster method on incidence rates by census tract identified four clusters for all injuries, three for motor vehicle collisions, four for fall injuries, and one each for environmental and assault injuries. The two detection methods shared a minority of overlapping geographical clusters. Significant clustering occurs overall for all injury mechanisms combined and for each mechanism depending on the cluster detection method used. There was some overlap in geographic clusters identified by both methods. 
The Bernoulli method allows more focused cluster mapping and evaluation since it directly uses location data. Once clusters are found, interventions can be targeted to specific geographic locations, location types, ages of victims, and mechanisms of injury.
Molecular Typing of Pneumococci for Investigation of Linked Cases of Invasive Pneumococcal Disease
Pichon, Bruno; Moyce, Laura; Sheppard, Carmen; Slack, Mary; Turbitt, Deborah; Pebody, Richard; Spencer, David A.; Edwards, Justin; Krahé, Daniel; George, Robert
2010-01-01
In winter 2007-2008, an outbreak of pediatric pneumonia caused by serotype 5 pneumococci was identified in a northeast London suburb. Variable number of tandem repeat analyses clustered these pneumococci separately from the other serotype 5 pneumococci in the United Kingdom, highlighting the importance of this discriminative typing method in supporting epidemiological investigations. PMID:20164267
ERIC Educational Resources Information Center
Gorton, Matthew; Douarin, Elodie; Davidova, Sophia; Latruffe, Laure
2008-01-01
Farmers' attitudes to agricultural production, diversification and policy support, and their behavioural intentions, in five Member States of the EU (France, Lithuania, Slovakia, Sweden, England) are analysed comparatively. Groups of farmers with similarly held attitudes are identified using cluster analysis to investigate whether differences in…
ERIC Educational Resources Information Center
Steinberg, Laurie S.
Forty-five third-grade and fourth-grade boys identified by their schools as being both normal in intelligence and severely disabled in reading were given a battery of tests of language, visual perception, silent reading comprehension, and finger agnosia. Three consistent groups of subjects emerged from cluster analyses of the results. One group…
Changes in Leisure Styles and Satisfaction of Older People: A Five Years Follow-Up
ERIC Educational Resources Information Center
Gagliardi, Cristina; Spazzafumo, Liana; Papa, Roberta; Marcellini, Fiorella
2012-01-01
The present study examines the leisure style and leisure satisfaction of a sample of older people at baseline and after a period of 5 years. Three groups were identified by factorial and cluster analyses and labelled under the headings of: Organised Style, Surrounding Style and Indoor Style. Each group represented a different typology of leisure,…
ERIC Educational Resources Information Center
Xenofontos, Constantinos; Papadopoulos, Christos E.
2015-01-01
In this paper, we examine the ways the history of mathematics is integrated in the national textbooks of Cyprus and Greece. Our data-driven analyses suggest that the references identified can be clustered in four categories: (a) biographical references about mathematicians or historical references regarding the origins of a mathematical concept…
Modular structural elements in the replication origin region of Tetrahymena rDNA.
Du, C; Sanzgiri, R P; Shaiu, W L; Choi, J K; Hou, Z; Benbow, R M; Dobbs, D L
1995-01-01
Computer analyses of the DNA replication origin region in the amplified rRNA genes of Tetrahymena thermophila identified a potential initiation zone in the 5'NTS [Dobbs, Shaiu and Benbow (1994), Nucleic Acids Res. 22, 2479-2489]. This region consists of a putative DNA unwinding element (DUE) aligned with predicted bent DNA segments, nuclear matrix or scaffold associated region (MAR/SAR) consensus sequences, and other common modular sequence elements previously shown to be clustered in eukaryotic chromosomal origin regions. In this study, two mung bean nuclease-hypersensitive sites in super-coiled plasmid DNA were localized within the major DUE-like element predicted by thermodynamic analyses. Three restriction fragments of the 5'NTS region predicted to contain bent DNA segments exhibited anomalous migration characteristic of bent DNA during electrophoresis on polyacrylamide gels. Restriction fragments containing the 5'NTS region bound Tetrahymena nuclear matrices in an in vitro binding assay, consistent with an association of the replication origin region with the nuclear matrix in vivo. The direct demonstration in a protozoan origin region of elements previously identified in Drosophila, chick and mammalian origin regions suggests that clusters of modular structural elements may be a conserved feature of eukaryotic chromosomal origins of replication. PMID:7784181
Chen, Yi; Luo, Yan; Curry, Phillip; Timme, Ruth; Melka, David; Doyle, Matthew; Parish, Mickey; Hammack, Thomas S; Allard, Marc W; Brown, Eric W; Strain, Errol A
2017-01-01
A listeriosis outbreak in the United States implicated contaminated ice cream produced by one company, which operated 3 facilities. We performed single nucleotide polymorphism (SNP)-based whole genome sequencing (WGS) analysis on Listeria monocytogenes from food, environmental and clinical sources, identifying two clusters and a single branch, belonging to PCR serogroup IIb and genetic lineage I. WGS Cluster I, representing one outbreak strain, contained 82 food and environmental isolates from Facility I and 4 clinical isolates. These isolates differed by up to 29 SNPs, exhibited 9 pulsed-field gel electrophoresis (PFGE) profiles and multilocus sequence typing (MLST) sequence type (ST) 5 of clonal complex 5 (CC5). WGS Cluster II contained 51 food and environmental isolates from Facility II, 4 food isolates from Facility I and 5 clinical isolates. Among them the isolates from Facility II and clinical isolates formed a clade and represented another outbreak strain. Isolates in this clade differed by up to 29 SNPs, exhibited 3 PFGE profiles and ST5. The only isolate collected from Facility III belonged to singleton ST489, which was in a single branch separate from Clusters I and II, and was not associated with the outbreak. WGS analyses clustered together outbreak-associated isolates exhibiting multiple PFGE profiles, while differentiating them from epidemiologically unrelated isolates that exhibited outbreak PFGE profiles. The complete genome of a Cluster I isolate allowed the identification and analyses of putative prophages, revealing that Cluster I isolates differed by the gain or loss of three putative prophages, causing the banding pattern differences among all 3 AscI-PFGE profiles observed in Cluster I isolates. WGS data suggested that certain ice cream varieties and/or production lines might have contamination sources unique to them. 
The SNP-based analysis was able to distinguish CC5 as a group from non-CC5 isolates and differentiate among CC5 isolates from different outbreaks/incidents.
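The "differed by up to 29 SNPs" figures in the abstract above are pairwise distances between isolates. A minimal sketch of that computation follows, assuming the isolates are available as equal-length aligned sequences; the function names and the toy isolates are hypothetical, and a real WGS pipeline would work from variant calls against a reference rather than raw strings.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Assumption: positions with an ambiguous call (anything other than
    A, C, G, T) in either sequence are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a in valid and b in valid and a != b
    )


def max_pairwise_snps(isolates):
    """Largest pairwise SNP distance within a group of named isolates."""
    return max(
        snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    )


# Toy aligned sequences; real isolates would span the whole core genome.
cluster = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAA",
    "isolate_3": "ACGAACGNAA",
}
print(max_pairwise_snps(cluster))
```

Reporting the maximum within a group mirrors the "up to N SNPs" wording; thresholds for calling isolates part of one outbreak are a separate epidemiological judgment.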
Chen, Yi; Luo, Yan; Curry, Phillip; Timme, Ruth; Melka, David; Doyle, Matthew; Parish, Mickey; Hammack, Thomas S.; Allard, Marc W.; Brown, Eric W.; Strain, Errol A.
2017-01-01
A listeriosis outbreak in the United States implicated contaminated ice cream produced by one company, which operated 3 facilities. We performed single nucleotide polymorphism (SNP)-based whole genome sequencing (WGS) analysis on Listeria monocytogenes from food, environmental and clinical sources, identifying two clusters and a single branch, belonging to PCR serogroup IIb and genetic lineage I. WGS Cluster I, representing one outbreak strain, contained 82 food and environmental isolates from Facility I and 4 clinical isolates. These isolates differed by up to 29 SNPs, exhibited 9 pulsed-field gel electrophoresis (PFGE) profiles and multilocus sequence typing (MLST) sequence type (ST) 5 of clonal complex 5 (CC5). WGS Cluster II contained 51 food and environmental isolates from Facility II, 4 food isolates from Facility I and 5 clinical isolates. Among them the isolates from Facility II and clinical isolates formed a clade and represented another outbreak strain. Isolates in this clade differed by up to 29 SNPs, exhibited 3 PFGE profiles and ST5. The only isolate collected from Facility III belonged to singleton ST489, which was in a single branch separate from Clusters I and II, and was not associated with the outbreak. WGS analyses clustered together outbreak-associated isolates exhibiting multiple PFGE profiles, while differentiating them from epidemiologically unrelated isolates that exhibited outbreak PFGE profiles. The complete genome of a Cluster I isolate allowed the identification and analyses of putative prophages, revealing that Cluster I isolates differed by the gain or loss of three putative prophages, causing the banding pattern differences among all 3 AscI-PFGE profiles observed in Cluster I isolates. WGS data suggested that certain ice cream varieties and/or production lines might have contamination sources unique to them. 
The SNP-based analysis was able to distinguish CC5 as a group from non-CC5 isolates and differentiate among CC5 isolates from different outbreaks/incidents. PMID:28166293
Compositional variability in Mediterranean archaeofaunas from Upper Paleolithic Southwest Europe
NASA Astrophysics Data System (ADS)
Jones, Emily Lena
2018-03-01
Recent meta-analyses of Upper Paleolithic Southwestern European archaeofaunas (Jones, 2015, 2016) have identified a consistent "Mediterranean" cluster from the Last Glacial Maximum through the early Holocene, suggesting similarities in environment and/or consistency in hunting strategy across this region through time despite radical changes in climate. However, while the archaeofaunas in this cluster all derive from sites located within today's Mediterranean bioclimatic region, many of them are from locations far from the Mediterranean Sea (Atlantic Portugal, the Spanish Meseta) which today differ significantly from each other in biotic composition. In this paper, I explore clustering (through cluster analysis and non-metric multidimensional scaling) within the Mediterranean archaeofaunal group. I test for the influence of sample size as well as the geographic variables of site elevation, latitude, and longitude on variability in the large mammal portions of archaeofaunal assemblages. ANOVA shows no relationship between cluster-defined groups and site elevation or longitude; instead, site latitude appears to be a primary contributor to patterning. However, the overall compositional similarity of the Mediterranean archaeofaunas in this dataset suggests more consistency than variability in Upper Paleolithic hunting strategy in this region.
Probing the X-ray Emission from the Massive Star Cluster Westerlund 2
NASA Astrophysics Data System (ADS)
Lopez, Laura
2017-09-01
We propose a 300 ks Chandra ACIS-I observation of the massive star cluster Westerlund 2 (Wd2). This region is teeming with high-energy emission from a variety of sources: colliding wind binaries, OB and Wolf-Rayet stars, two young pulsars, and an unidentified source of very high-energy (VHE) gamma-rays. Our Chandra program is designed to achieve several goals: 1) to take a complete census of Wd2 X-ray point sources and monitor variability; 2) to probe the conditions of the colliding winds in the binary WR 20a; 3) to search for an X-ray counterpart of the VHE gamma-rays; 4) to identify diffuse X-ray emission; 5) to compare results to other massive star clusters observed by Chandra. Only Chandra has the spatial resolution and sensitivity necessary for our proposed analyses.
Diaz-Ordaz, Karla; Froud, Robert; Sheehan, Bart; Eldridge, Sandra
2013-10-22
Previous reviews of cluster randomised trials have been critical of the quality of the trials reviewed, but none has explored determinants of the quality of these trials in a specific field over an extended period of time. Recent work suggests that correct conduct and reporting of these trials may require more than published guidelines. In this review, our aim was to assess the quality of cluster randomised trials conducted in residential facilities for older people, and to determine whether (1) statistician involvement in the trial and (2) strength of journal endorsement of the Consolidated Standards of Reporting Trials (CONSORT) statement influence quality. We systematically identified trials randomising residential facilities for older people, or parts thereof, without language restrictions, up to the end of 2010, using National Library of Medicine (Medline) via PubMed and hand-searching. We based quality assessment criteria largely on the extended CONSORT statement for cluster randomised trials. We assessed statistician involvement based on statistician co-authorship, and strength of journal endorsement of the CONSORT statement from journal websites. 73 trials met our inclusion criteria. Of these, 20 (27%) reported accounting for clustering in sample size calculations and 54 (74%) in the analyses. In 29 trials (40%), methods used to identify/recruit participants were judged by us to have potentially caused bias, or reporting was too unclear to reach a conclusion. Some elements of quality improved over time but this appeared not to be related to the publication of the extended CONSORT statement for these trials. Trials with statistician/epidemiologist co-authors were more likely to account for clustering in sample size calculations (unadjusted odds ratio 5.4, 95% confidence interval 1.1 to 26.0) and analyses (unadjusted OR 3.2, 1.2 to 8.5). Journal endorsement of the CONSORT statement was not associated with trial quality.
Despite international attempts to improve methods in cluster randomised trials, important quality limitations remain amongst these trials in residential facilities. Statistician involvement on trial teams may be more effective in promoting quality than further journal endorsement of the CONSORT statement. Funding bodies and journals should promote statistician involvement and co-authorship in addition to adherence to CONSORT guidelines.
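"Accounting for clustering in sample size calculations", as assessed in the review above, is commonly done with the standard design effect, 1 + (m - 1) × ICC, where m is the mean cluster size and ICC the intracluster correlation coefficient. The sketch below illustrates that textbook formula only; it is not taken from the review, and the function names and example numbers are hypothetical.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need cluster size >= 1 and ICC in [0, 1]")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def clustered_sample_size(n_individual, mean_cluster_size, icc):
    """Inflate a sample size computed for individual randomisation."""
    return math.ceil(n_individual * design_effect(mean_cluster_size, icc))


def clusters_required(n_individual, mean_cluster_size, icc):
    """Number of clusters (e.g. residential facilities) needed in total."""
    total = clustered_sample_size(n_individual, mean_cluster_size, icc)
    return math.ceil(total / mean_cluster_size)


# Hypothetical trial: 100 participants would suffice under individual
# randomisation; facilities average 5 residents and the ICC is 0.25.
print(design_effect(5, 0.25))
print(clustered_sample_size(100, 5, 0.25))
print(clusters_required(100, 5, 0.25))
```

With unequal cluster sizes the inflation is larger than this formula suggests, so the result is best read as a lower bound.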
The Everyday Moral Judge - Autobiographical Recollections of Moral Emotions.
Körner, André; Tscharaktschiew, Nadine; Schindler, Rose; Schulz, Katrin; Rudolph, Udo
2016-01-01
Moral emotions are typically elicited in everyday social interactions and regulate social behavior. Previous research in the field of attribution theory identified ought (the moral standard of a given situation or intended goal), goal-attainment (a goal can be attained vs. not attained) and effort (high vs. low effort expenditure) as cognitive antecedents of moral emotions. In contrast to earlier studies, mainly relying on thought experiments, we investigated autobiographical recollections of N = 312 participants by means of an online study. We analyzed a diverse range of moral emotions, i.e., admiration, anger, contempt, indignation, pride, respect, schadenfreude, and sympathy, by using a mixed-method approach. Qualitative and quantitative methods clearly corroborate the important role of ought, goal-attainment, and effort as eliciting conditions of moral emotions. Furthermore, we built categorical systems based on our participants' descriptions of real-life situations, allowing for more fine-grained distinctions between seemingly similar moral emotions. We thus identify additional prerequisites explaining more subtle differences between moral emotion clusters as they emerge from our analyses (i.e., cluster 1: admiration, pride, and respect; cluster 2: anger, contempt, and indignation; cluster 3: schadenfreude and sympathy). Results are discussed in the light of attributional theories of moral emotions, and implications for future research are derived.
Nguyen, Quang Ngoc; Pham, Son Thai; Do, Loi Doan; Nguyen, Viet Lan; Wall, Stig; Weinehall, Lars; Bonita, Ruth; Byass, Peter
2012-01-01
Background. Data on cardiovascular disease risk factors (CVDRFs) in Vietnam are limited. This study explores the prevalence of each CVDRF and how they cluster to evaluate CVDRF burdens and potential prevention strategies. Methods. A cross-sectional survey in 2009 (2,130 adults) was done to collect data on behavioural CVDRFs, anthropometry and blood pressure, lipidaemia profiles, and oral glucose tolerance tests. Four metabolic CVDRFs (hypertension, dyslipidaemia, diabetes, and obesity) and five behavioural CVDRFs (smoking, excessive alcohol intake, unhealthy diet, physical inactivity, and stress) were analysed to identify their prevalence, cluster patterns, and social predictors. Framingham scores were applied to estimate the global 10-year CVD risks and potential benefits of CVD prevention strategies. Results. The age-standardised prevalence of having at least 2/4 metabolic, 2/5 behavioural, or 4/9 major CVDRFs was 28%, 27%, and 13% in women and 32%, 62%, and 34% in men. Within-individual clustering of metabolic factors was more common among older women and in urban areas. High overall CVD risk (≥20% over 10 years) identified 20% of men and 5% of women, especially at higher ages, who had coexisting CVDRFs. Conclusion. Multiple CVDRFs were common in Vietnamese adults with different clustering patterns across sex/age groups. Tackling any single risk factor would not be efficient.
The Everyday Moral Judge – Autobiographical Recollections of Moral Emotions
Tscharaktschiew, Nadine; Schindler, Rose; Schulz, Katrin; Rudolph, Udo
2016-01-01
Moral emotions are typically elicited in everyday social interactions and regulate social behavior. Previous research in the field of attribution theory identified ought (the moral standard of a given situation or intended goal), goal-attainment (a goal can be attained vs. not attained) and effort (high vs. low effort expenditure) as cognitive antecedents of moral emotions. In contrast to earlier studies, mainly relying on thought experiments, we investigated autobiographical recollections of N = 312 participants by means of an online study. We analyzed a diverse range of moral emotions, i.e., admiration, anger, contempt, indignation, pride, respect, schadenfreude, and sympathy, by using a mixed-method approach. Qualitative and quantitative methods clearly corroborate the important role of ought, goal-attainment, and effort as eliciting conditions of moral emotions. Furthermore, we built categorical systems based on our participants’ descriptions of real-life situations, allowing for more fine-grained distinctions between seemingly similar moral emotions. We thus identify additional prerequisites explaining more subtle differences between moral emotion clusters as they emerge from our analyses (i.e., cluster 1: admiration, pride, and respect; cluster 2: anger, contempt, and indignation; cluster 3: schadenfreude and sympathy). Results are discussed in the light of attributional theories of moral emotions, and implications for future research are derived. PMID:27977699
Epidemic dispersion of HIV and HCV in a population of co-infected Romanian injecting drug users.
Paraschiv, Simona; Banica, Leontina; Nicolae, Ionelia; Niculescu, Iulia; Abagiu, Adrian; Jipa, Raluca; Pineda-Peña, Andrea-Clemencia; Pingarilho, Marta; Neaga, Emil; Theys, Kristof; Libin, Pieter; Otelea, Dan; Abecasis, Ana
2017-01-01
Co-infections with HIV and HCV are very frequent among people who inject drugs (PWID). However, very few studies comparatively reconstructed the transmission patterns of both viruses in the same population. We have recruited 117 co-infected PWID during a recent HIV outbreak in Romania. Phylogenetic analyses were performed on HIV and HCV sequences in order to characterize and compare transmission dynamics of the two viruses. Three large HIV clusters (2 subtype F1 and one CRF14_BG) and thirteen smaller HCV transmission networks (genotypes 1a, 1b, 3a, 4a and 4d) were identified. Eighty (65%) patients were both in HIV and HCV transmission chains and 70 of those shared the same HIV and HCV cluster with at least one other patient. Molecular clock analysis indicated that all identified HIV clusters originated around 2006, while the origin of the different HCV clusters ranged between 1980 (genotype 1b) and 2011 (genotypes 3a and 4d). HCV infection preceded HIV infection in 80.3% of cases. Coincidental transmission of HIV and HCV was estimated to be rather low (19.65%) and associated with an outbreak among PWID during detention in the same penitentiary. This study has reconstructed and compared the dispersion of these two viruses in a PWID population.
Huprich, Steven K; Defife, Jared; Westen, Drew
2014-01-01
We sought to determine whether meaningful subtypes of Dysthymic patients could be identified when grouping them by similar personality profiles. A random, national sample of psychiatrists and clinical psychologists (n=1201) described a randomly selected current patient with personality pathology using the descriptors in the Shedler-Westen Assessment Procedure-II (SWAP-II), completed assessments of patients' adaptive functioning, and provided DSM-IV Axis I and II diagnoses. We applied Q-factor cluster analyses to those patients diagnosed with Dysthymic Disorder. Four clusters were identified: High Functioning, Anxious/Dysphoric, Emotionally Dysregulated, and Narcissistic. These factor scores corresponded with a priori hypotheses regarding diagnostic comorbidity and level of adaptive functioning. We compared these groups to diagnostic constructs described and empirically identified in the past literature. The results converge with past and current ideas about the ways in which chronic depression and personality are related and offer an enhanced, empirically grounded and clinically useful means of understanding a heterogeneous diagnostic category. © 2013 Published by Elsevier B.V.
Spatial location influences vocal interactions in bullfrog choruses
Bates, Mary E.; Cropp, Brett F.; Gonchar, Marina; Knowles, Jeffrey; Simmons, James A.; Simmons, Andrea Megela
2010-01-01
A multiple sensor array was employed to identify the spatial locations of all vocalizing male bullfrogs (Rana catesbeiana) in five natural choruses. Patterns of vocal activity collected with this array were compared with computer simulations of chorus activity. Bullfrogs were not randomly spaced within choruses, but tended to cluster into closely spaced groups of two to five vocalizing males. There were nonrandom, differing patterns of vocal interactions within clusters of closely spaced males and between different clusters. Bullfrogs located within the same cluster tended to overlap or alternate call notes with two or more other males in that cluster. These near-simultaneous calling bouts produced advertisement calls with more pronounced amplitude modulation than occurred in nonoverlapping notes or calls. Bullfrogs located in different clusters more often alternated entire calls or overlapped only small segments of their calls. They also tended to respond sequentially to calls of their farther neighbors compared to their nearer neighbors. Results of computational analyses showed that the observed patterns of vocal interactions were significantly different than expected based on random activity. The use of a multiple sensor array provides a richer view of the dynamics of choruses than available based on single microphone techniques. PMID:20370047
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
The main goals of this research were to build a predictor for a turnaround time (TAT) indicator and to use a numerical clustering technique to find possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor of the TAT indicator and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used to build a predictive TAT value model. The variables contributing to this model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
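As a rough illustration of how a linear predictor with the four reported coefficients would be evaluated, the sketch below plugs them into a weighted sum. The abstract reports neither an intercept nor the units or scaling of the variables, so the zero intercept, the function name and the input values are assumptions made only for this example, not the authors' fitted model.

```python
# Coefficients as reported in the abstract; everything else is hypothetical.
COEFFICIENTS = {
    "ce_rt": 0.415,        # clinical engineering department response time
    "stock_rt": 0.734,     # stock service response time
    "priority": 0.21,      # priority level
    "service_time": 0.06,  # service time
}


def predict_tat(ce_rt, stock_rt, priority, service_time, intercept=0.0):
    """Evaluate a linear TAT predictor; the intercept is an assumed placeholder."""
    values = {
        "ce_rt": ce_rt,
        "stock_rt": stock_rt,
        "priority": priority,
        "service_time": service_time,
    }
    return intercept + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


# With every input set to 1 the prediction is simply the sum of the coefficients.
unit_prediction = predict_tat(1.0, 1.0, 1.0, 1.0)
print(round(unit_prediction, 3))  # 1.419
```

Because Stock(rt) carries the largest coefficient, a one-unit change in it moves the prediction more than the same change in any other input, which matches the ordering of reliance reported in the abstract, provided the variables share a comparable scale.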
Algorithmic localisation of noise sources in the tip region of a low-speed axial flow fan
NASA Astrophysics Data System (ADS)
Tóth, Bence; Vad, János
2017-04-01
An objective and algorithmised methodology is proposed to analyse beamform data obtained for axial fans. Its application is demonstrated in a case study regarding the tip region of a low-speed cooling fan. First, beamforming is carried out in a co-rotating frame of reference. Then, a distribution of source strength is extracted along the circumference of the rotor at the blade tip radius in each analysed third-octave band. The circumferential distributions are expanded into Fourier series, which allows for filtering out the effects of perturbations, on the basis of an objective criterion. The remaining Fourier components are then considered as base sources to determine the blade-passage-periodic flow mechanisms responsible for the broadband noise. Based on their frequency and angular location, the base sources are grouped together. This is done using the fuzzy c-means clustering method to allow the overlap of the source mechanisms. The number of clusters is determined in a validity analysis. Finally, the obtained clusters are assigned to source mechanisms based on the literature. Thus, turbulent boundary layer - trailing edge interaction noise, tip leakage flow noise, and double leakage flow noise are identified.
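The grouping step described above relies on fuzzy c-means, in which every source receives a graded membership in each cluster rather than a single label, so overlapping mechanisms can share a source. The following is a minimal pure-Python sketch of the standard fuzzy c-means iteration. The toy (frequency, angle) points, the fuzzifier m = 2, the fixed iteration count and the function name are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Membership update: proportional to distance**(-2 / (m - 1)).
        for i in range(n):
            dists = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


# Hypothetical base sources described by (normalised frequency, angular location).
sources = [(1.0, 0.10), (1.1, 0.20), (0.9, 0.15), (5.0, 3.0), (5.2, 3.1), (4.9, 2.9)]
centers, memberships = fuzzy_c_means(sources, c=2)
labels = [row.index(max(row)) for row in memberships]
```

In practice the two features would need comparable scaling before distances are computed, and the number of clusters would be chosen by a validity analysis as the abstract describes; both steps are omitted here.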
Individual participant data meta-analyses should not ignore clustering
Abo-Zaid, Ghada; Guo, Boliang; Deeks, Jonathan J.; Debray, Thomas P.A.; Steyerberg, Ewout W.; Moons, Karel G.M.; Riley, Richard David
2013-01-01
Objectives Individual participant data (IPD) meta-analyses often analyze their IPD as if coming from a single study. We compare this approach with analyses that rather account for clustering of patients within studies. Study Design and Setting Comparison of effect estimates from logistic regression models in real and simulated examples. Results The estimated prognostic effect of age in patients with traumatic brain injury is similar, regardless of whether clustering is accounted for. However, a family history of thrombophilia is found to be a diagnostic marker of deep vein thrombosis [odds ratio, 1.30; 95% confidence interval (CI): 1.00, 1.70; P = 0.05] when clustering is accounted for but not when it is ignored (odds ratio, 1.06; 95% CI: 0.83, 1.37; P = 0.64). Similarly, the treatment effect of nicotine gum on smoking cessation is severely attenuated when clustering is ignored (odds ratio, 1.40; 95% CI: 1.02, 1.92) rather than accounted for (odds ratio, 1.80; 95% CI: 1.29, 2.52). Simulations show models accounting for clustering perform consistently well, but downwardly biased effect estimates and low coverage can occur when ignoring clustering. Conclusion Researchers must routinely account for clustering in IPD meta-analyses; otherwise, misleading effect estimates and conclusions may arise. PMID:23651765
Beverage consumption patterns at age 13–17 are associated with weight, height, and BMI at age 17
Marshall, Teresa A.; Van Buren, John M.; Warren, John J.; Cavanaugh, Joseph E.; Levy, Steven M.
2017-01-01
Background Sugar-sweetened beverages (SSBs) have been associated with obesity in children and adults; however, associations between beverage patterns and obesity are not understood. Objective To describe beverage patterns during adolescence, and the associations between adolescent beverage patterns and age 17 anthropometric measures. Design Cross-sectional analyses of longitudinally-collected data. Participants/setting Participants in the longitudinal Iowa Fluoride Study having at least one beverage questionnaire completed between ages 13.0 and 14.0 years, having a second questionnaire completed between 16.0 and 17.0 years and attending an age 17 clinic exam for weight and height measurements (n=369). Exposure Beverages were collapsed into 4 categories {i.e., 100% juice, milk, water and other sugar-free beverages (water/SFB), and SSBs} for the purpose of clustering. Five beverage clusters were identified from standardized age 13–17 mean daily beverage intakes and named by the authors for the dominant beverage: juice, milk, water/SFB, neutral and SSB. Outcome Age 17 weight, height and BMI. Statistical analyses Ward’s method for clustering of beverage variables. One-way ANOVA and chi-square tests for bivariable associations. Gamma regression for associations of weight or BMI (outcomes) with beverage clusters and demographic variables. Linear regression for associations of height (outcome) with beverage clusters and demographic variables. Results Participants with family incomes < $60,000/year trended shorter (1.5±0.8 cm; P=0.070) and were heavier (2.0±0.7 BMI units; P=0.002) than participants with family incomes ≥ $60,000/year. Adjusted mean weight, height and BMI estimates differed by beverage cluster membership. For example, on average, male and female members of the neutral cluster were 4.5 cm (P=0.010) and 4.2 cm (P=0.034) shorter, respectively, than members of the milk cluster. 
For members of the juice cluster, the mean BMI was lower than for members of the milk cluster (by 2.4 units), water/SFB cluster (3.5 units), neutral cluster (2.2 units) and SSB cluster (3.2 units) (all Ps<0.05). Conclusions Age 13–17 year beverage patterns were associated with age 17 anthropometric measures and BMI in this sample. Beverage patterns might be characteristic of overall food choices and dietary behaviors that influence growth. PMID:28259744
Occupational risk factors for Wilms' tumor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bunin, G.; Kramer, S.; Nass, C.
A matched case-control study of Wilms' tumor investigated parental occupational risk factors. Cases diagnosed in 1970-1983 were identified through a population-based tumor registry and hospital registries in the Greater Philadelphia area. Controls were selected by random digit dialing and were matched to cases on race, birth date (± 3 years), and the area code and exchange of the case's telephone number at diagnosis. Parents of 100 matched pairs were interviewed by telephone. Parents of patients and controls were generally similar in demographic characteristics, except that mothers differed in religion. Published schemes were used to group jobs into clusters of similar exposures and to determine exposures from industry and job title. Analyses were done for preconception, pregnancy, and postnatal time periods. More case than control fathers had jobs in a cluster that includes machinists and welders (odds ratios (ORs) = 4.0-5.7, p ≤ 0.04). Paternal exposures to lead, silver, tin, and iron (some exposures of this cluster) were associated with Wilms' tumor in some analyses, with moderate odds ratios (ORs = 1.5-3.4). In general, the highest odds ratios were found for the preconception period among the genetic (prezygotic) cases. No maternal job clusters or exposures gave significantly elevated odds ratios. These results support a previous finding that lead is a risk factor, but not radiation, hydrocarbon, or boron exposures.
2009-01-01
Background Soybeans grown in the upper Midwestern United States often suffer from iron deficiency chlorosis, which results in yield loss at the end of the season. To better understand the effect of iron availability on soybean yield, we identified genes in two near isogenic lines with changes in expression patterns when plants were grown in iron sufficient and iron deficient conditions. Results Transcriptional profiles of soybean (Glycine max, L. Merr) near isogenic lines Clark (PI548553, iron efficient) and IsoClark (PI547430, iron inefficient) grown under Fe-sufficient and Fe-limited conditions were analyzed and compared using the Affymetrix® GeneChip® Soybean Genome Array. There were 835 candidate genes in the Clark (PI548553) genotype and 200 candidate genes in the IsoClark (PI547430) genotype putatively involved in soybean's iron stress response. Of these candidate genes, fifty-eight genes in the Clark genotype were identified with a genetic location within known iron efficiency QTL and 21 in the IsoClark genotype. The arrays also identified 170 single feature polymorphisms (SFPs) specific to either Clark or IsoClark. A sliding window analysis of the microarray data and the 7X genome assembly coupled with an iterative model of the data showed the candidate genes are clustered in the genome. An analysis of 5' untranslated regions in the promoter of candidate genes identified 11 conserved motifs in 248 differentially expressed genes, all from the Clark genotype, representing 129 clusters identified earlier, confirming the cluster analysis results. Conclusion These analyses have identified the first genes with expression patterns that are affected by iron stress and are located within QTL specific to iron deficiency stress. The genetic location and promoter motif analysis results support the hypothesis that the differentially expressed genes are co-regulated. 
The combined results of all analyses lead us to postulate iron inefficiency in soybean is a result of a mutation in a transcription factor(s), which controls the expression of genes required in inducing an iron stress response. PMID:19678937
Roca, Josep; Vargas, Claudia; Cano, Isaac; Selivanov, Vitaly; Barreiro, Esther; Maier, Dieter; Falciani, Francesco; Wagner, Peter; Cascante, Marta; Garcia-Aymerich, Judith; Kalko, Susana; De Mas, Igor; Tegnér, Jesper; Escarrabill, Joan; Agustí, Alvar; Gomez-Cabrero, David
2014-11-28
Heterogeneity in clinical manifestations and disease progression in Chronic Obstructive Pulmonary Disease (COPD) has consequences for patient health risk assessment, stratification and management. Implicit in the classical "spill over" hypothesis is that COPD heterogeneity is driven by the pulmonary events of the disease. Alternatively, we hypothesized that COPD heterogeneities result from the interplay of mechanisms governing three conceptually different phenomena: 1) pulmonary disease, 2) systemic effects of COPD and 3) co-morbidity clustering, each of them with its own dynamics. We aimed to explore the potential of a systems analysis of COPD heterogeneity, focused on skeletal muscle dysfunction and on co-morbidity clustering, to generate predictive modeling with impact on patient management. To this end, strategies combining deterministic modeling and network medicine analyses of the Biobridge dataset were used to investigate the mechanisms of skeletal muscle dysfunction. An independent data-driven analysis of co-morbidity clustering examining associated genes and pathways was performed using a large dataset (ICD9-CM data from Medicare, 13 million people). Finally, a targeted network analysis using the outcomes of the two approaches (skeletal muscle dysfunction and co-morbidity clustering) explored shared pathways between these phenomena. (1) Evidence of abnormal regulation of skeletal muscle bioenergetics and skeletal muscle remodeling showing a significant association with nitroso-redox disequilibrium was observed in COPD; (2) COPD patients presented a higher risk for co-morbidity clustering than non-COPD patients, increasing with ageing; and (3) the ongoing targeted network analyses suggest shared pathways between skeletal muscle dysfunction and co-morbidity clustering. The results indicate the high potential of a systems approach to address COPD heterogeneity. 
Significant knowledge gaps were identified that are relevant to shape strategies aiming at fostering 4P Medicine for patients with COPD.
Sethi, Suresh; Linden, Daniel; Wenburg, John; Lewis, Cara; Lemons, Patrick R.; Fuller, Angela K.; Hare, Matthew P.
2016-01-01
Error-tolerant likelihood-based match calling presents a promising technique to accurately identify recapture events in genetic mark–recapture studies by combining probabilities of latent genotypes and probabilities of observed genotypes, which may contain genotyping errors. Combined with clustering algorithms to group samples into sets of recaptures based upon pairwise match calls, these tools can be used to reconstruct accurate capture histories for mark–recapture modelling. Here, we assess the performance of a recently introduced error-tolerant likelihood-based match-calling model and sample clustering algorithm for genetic mark–recapture studies. We assessed both biallelic (i.e. single nucleotide polymorphisms; SNP) and multiallelic (i.e. microsatellite; MSAT) markers using a combination of simulation analyses and case study data on Pacific walrus (Odobenus rosmarus divergens) and fishers (Pekania pennanti). A novel two-stage clustering approach is demonstrated for genetic mark–recapture applications. First, repeat captures within a sampling occasion are identified. Subsequently, recaptures across sampling occasions are identified. The likelihood-based matching protocol performed well in simulation trials, demonstrating utility for use in a wide range of genetic mark–recapture studies. Moderately sized SNP (64+) and MSAT (10–15) panels produced accurate match calls for recaptures and accurate non-match calls for samples from closely related individuals in the face of low to moderate genotyping error. Furthermore, matching performance remained stable or increased as the number of genetic markers increased, genotyping error notwithstanding.
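The two-stage idea (first group repeat captures within a sampling occasion, then link groups across occasions) can be sketched with a union-find grouping over pairwise match calls. This is not the authors' likelihood-based model: the match rule below is a deliberately simple stand-in that tolerates a fixed number of mismatching loci, and the sample tuples, the `max_mismatch` threshold and the use of the first sample of each group as its representative are all assumptions made for illustration.

```python
from itertools import combinations


def group_by_matches(items, is_match):
    """Union-find grouping: items linked by any chain of match calls share a group."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if is_match(items[i], items[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


def mismatches(g1, g2):
    """Number of loci at which two genotype tuples differ."""
    return sum(a != b for a, b in zip(g1, g2))


def two_stage_clusters(samples, max_mismatch=1):
    """samples: list of (sample_id, occasion, genotype); returns lists of sample ids."""

    def same_animal(a, b):
        # Stand-in for an error-tolerant match call (assumption, not the paper's model).
        return mismatches(a[2], b[2]) <= max_mismatch

    # Stage 1: repeat captures within each sampling occasion.
    within = []
    for occasion in sorted({s[1] for s in samples}):
        subset = [s for s in samples if s[1] == occasion]
        within.extend(group_by_matches(subset, same_animal))

    # Stage 2: recaptures across occasions, comparing one representative per group.
    merged = group_by_matches(within, lambda g, h: same_animal(g[0], h[0]))
    return [sorted(s[0] for group in groups for s in group) for groups in merged]


toy_samples = [
    ("a", 1, (0, 1, 2, 1)),
    ("b", 1, (0, 1, 2, 1)),
    ("c", 2, (0, 1, 2, 0)),  # differs from "a" at one locus, treated as genotyping error
    ("d", 2, (2, 2, 0, 0)),
]
histories = two_stage_clusters(toy_samples)
print(sorted(histories))  # [['a', 'b', 'c'], ['d']]
```

Each returned group corresponds to one putative individual, and the occasions of its members give that individual's capture history.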
Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali
2016-01-01
Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Results Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Conclusion Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster. PMID:26938641
Rückert, Christian; Nübel, Ulrich; Blom, Jochen; Wirth, Thierry; Jaenicke, Sebastian; Schuback, Sieglinde; Rüsch-Gerdes, Sabine; Supply, Philip; Kalinowski, Jörn; Niemann, Stefan
2013-01-01
Background Understanding Mycobacterium tuberculosis (Mtb) transmission is essential to guide efficient tuberculosis control strategies. Traditional strain typing lacks sufficient discriminatory power to resolve large outbreaks. Here, we tested the potential of using next generation genome sequencing for identification of outbreak-related transmission chains. Methods and Findings During long-term (1997 to 2010) prospective population-based molecular epidemiological surveillance comprising a total of 2,301 patients, we identified a large outbreak caused by an Mtb strain of the Haarlem lineage. The main performance outcome measure of whole genome sequencing (WGS) analyses was the degree of correlation of the WGS analyses with contact tracing data and the spatio-temporal distribution of the outbreak cases. WGS analyses of the 86 isolates revealed 85 single nucleotide polymorphisms (SNPs), subdividing the outbreak into seven genome clusters (two to 24 isolates each), plus 36 unique SNP profiles. WGS results showed that the first outbreak isolates detected in 1997 were falsely clustered by classical genotyping. In 1998, one clone (termed “Hamburg clone”) started expanding, apparently independently from differences in the social environment of early cases. Genome-based clustering patterns were in better accordance with contact tracing data and the geographical distribution of the cases than clustering patterns based on classical genotyping. A maximum of three SNPs were identified in eight confirmed human-to-human transmission chains, involving 31 patients. We estimated the Mtb genome evolutionary rate at 0.4 mutations per genome per year. This rate suggests that Mtb grows in its natural host with a doubling time of approximately 22 h (400 generations per year). Based on the genome variation discovered, emergence of the Hamburg clone was dated back to a period between 1993 and 1997, hence shortly before the discovery of the outbreak through epidemiological surveillance. 
Conclusions Our findings suggest that WGS is superior to conventional genotyping for Mtb pathogen tracing and investigating micro-epidemics. WGS provides a measure of Mtb genome evolution over time in its natural host context. PMID:23424287
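The relation between the quoted doubling time of about 22 h and roughly 400 generations per year can be checked with a few lines of arithmetic. The sketch below uses only the two figures stated in the abstract; the per-generation value it prints is merely what those two numbers imply together, not a quantity reported by the authors.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

doubling_time_h = 22.0               # doubling time quoted in the abstract
mutations_per_genome_per_year = 0.4  # evolutionary rate quoted in the abstract

generations_per_year = HOURS_PER_YEAR / doubling_time_h
mutations_per_generation = mutations_per_genome_per_year / generations_per_year

print(round(generations_per_year))        # 398, i.e. approximately 400
print(f"{mutations_per_generation:.4f}")  # 0.0010 mutations per genome per generation
```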
Parenting Styles and Home Obesogenic Environments
Johnson, Rachel; Welk, Greg; Saint-Maurice, Pedro F.; Ihmels, Michelle
2012-01-01
Parenting behaviors are known to have a major impact on childhood obesity but it has proven difficult to isolate the specific mechanism of influence. The present study uses Baumrind’s parenting typologies (authoritative, authoritarian, and permissive) to examine associations between parenting styles and parenting practices associated with childhood obesity. Data were collected from a diverse sample of children (n = 182, ages 7–10) in an urban school district in the United States. Parenting behaviors were assessed with the Parenting Styles and Dimension Questionnaire (PSDQ), a 58-item survey that categorizes parenting practices into three styles: authoritative, authoritarian, and permissive. Parent perceptions of the home obesogenic environment were assessed with the Family Nutrition and Physical Activity (FNPA) instrument, a simple 10 item instrument that has been shown in previous research to predict risk for overweight. Cluster analyses were used to identify patterns in the PSDQ data and these clusters were related to FNPA scores and measured BMI values in children (using ANCOVA analyses that controlled for parent income and education) to examine the impact of parenting styles on risk of overweight/obesity. The FNPA score was positively (and significantly) associated with scores on the authoritative parenting scale (r = 0.29) but negatively (and significantly) associated with scores on the authoritarian scale (r = −0.22) and permissive scale (r = −0.20). Permissive parenting was significantly associated with BMIz score but this is the only dimension that exhibited a relationship with BMI. A three-cluster solution explained 40.5% of the total variance and clusters were distinguishable by low and high z-scores on different PSDQ sub-dimensions. 
A cluster characterized as Permissive/Authoritarian (Cluster 2) had significantly lower FNPA scores (more obesogenic) than clusters characterized as Authoritative (Cluster 1) or Authoritarian/Authoritative (Cluster 3) after controlling for family income and parent education. No direct effects of cluster were evident on the BMI outcomes but the patterns were consistent with the FNPA outcomes. The results suggest that a permissive parenting style is associated with more obesogenic environments while an authoritative parenting style is associated with less obesogenic environments. PMID:22690202
Parenting styles and home obesogenic environments.
Johnson, Rachel; Welk, Greg; Saint-Maurice, Pedro F; Ihmels, Michelle
2012-04-01
Parenting behaviors are known to have a major impact on childhood obesity but it has proven difficult to isolate the specific mechanism of influence. The present study uses Baumrind's parenting typologies (authoritative, authoritarian, and permissive) to examine associations between parenting styles and parenting practices associated with childhood obesity. Data were collected from a diverse sample of children (n = 182, ages 7-10) in an urban school district in the United States. Parenting behaviors were assessed with the Parenting Styles and Dimension Questionnaire (PSDQ), a 58-item survey that categorizes parenting practices into three styles: authoritative, authoritarian, and permissive. Parent perceptions of the home obesogenic environment were assessed with the Family Nutrition and Physical Activity (FNPA) instrument, a simple 10 item instrument that has been shown in previous research to predict risk for overweight. Cluster analyses were used to identify patterns in the PSDQ data and these clusters were related to FNPA scores and measured BMI values in children (using ANCOVA analyses that controlled for parent income and education) to examine the impact of parenting styles on risk of overweight/obesity. The FNPA score was positively (and significantly) associated with scores on the authoritative parenting scale (r = 0.29) but negatively (and significantly) associated with scores on the authoritarian scale (r = -0.22) and permissive scale (r = -0.20). Permissive parenting was significantly associated with BMIz score but this is the only dimension that exhibited a relationship with BMI. A three-cluster solution explained 40.5% of the total variance and clusters were distinguishable by low and high z-scores on different PSDQ sub-dimensions. 
A cluster characterized as Permissive/Authoritarian (Cluster 2) had significantly lower FNPA scores (more obesogenic) than clusters characterized as Authoritative (Cluster 1) or Authoritarian/Authoritative (Cluster 3) after controlling for family income and parent education. No direct effects of cluster were evident on the BMI outcomes but the patterns were consistent with the FNPA outcomes. The results suggest that a permissive parenting style is associated with more obesogenic environments while an authoritative parenting style is associated with less obesogenic environments.
Borri, Marco; Schmidt, Maria A.; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M.; Partridge, Mike; Bhide, Shreerang A.; Nutting, Christopher M.; Harrington, Kevin J.; Newbold, Katie L.; Leach, Martin O.
2015-01-01
Purpose To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. Material and Methods The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. Results The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. Conclusion The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes. PMID:26398888
Versteeg, Bart; Bruisten, Sylvia M; van der Ende, Arie; Pannekoek, Yvonne
2016-04-18
Chlamydia trachomatis infections remain the most common bacterial sexually transmitted infection worldwide. To gain more insight into the epidemiology and transmission of C. trachomatis, several schemes of multilocus sequence typing (MLST) have been developed. We investigated the clustering of C. trachomatis strains derived from men who have sex with men (MSM) and heterosexuals using the MLST scheme based on 7 housekeeping genes (MLST-7) adapted for clinical specimens and a high-resolution MLST scheme based on 6 polymorphic genes, including ompA (hr-MLST-6). Specimens from 100 C. trachomatis infected MSM and 100 heterosexual women were randomly selected from previous studies and sequenced. We adapted the MLST-7 scheme to a nested assay to be suitable for direct typing of clinical specimens. All selected specimens were typed using both the adapted MLST-7 scheme and the hr-MLST-6 scheme. Clustering of C. trachomatis strains derived from MSM and heterosexuals was assessed using minimum spanning tree analysis. Sufficient chlamydial DNA was present in 188 of the 200 (94%) selected samples. Using the adapted MLST-7 scheme, full MLST profiles were obtained for 187 of 188 tested specimens, resulting in a high success rate of 99.5%. Of these 187 specimens, 91 (48.7%) were from MSM and 96 (51.3%) from heterosexuals. We detected 21 sequence types (STs) using the adapted MLST-7 and 79 STs using the hr-MLST-6 scheme. Minimum spanning tree analysis was used to examine the clustering of MLST-7 data, which showed no reflection of separate transmission in MSM and heterosexual hosts. Moreover, typing using the hr-MLST-6 scheme identified genetically related clusters within each of the clusters that were identified by using the MLST-7 scheme. No distinct transmission of C. trachomatis could be observed in MSM and heterosexuals using the adapted MLST-7 scheme, in contrast to using the hr-MLST-6. 
In addition, we compared clustering of both MLST schemes and demonstrated that typing using the hr-MLST-6 scheme is able to identify genetically related clusters of C. trachomatis strains within each of the clusters that were identified by using the MLST-7 scheme.
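As a rough illustration of how a minimum spanning tree can be built from MLST data, the following is a minimal sketch and not the authors' pipeline. It assumes each specimen is represented by a tuple of allele numbers and that the edge weight is the number of differing loci (Hamming distance); the specimen names and profiles are hypothetical.

```python
def hamming(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of profiles.

    Returns a list of (name_a, name_b, distance) edges.
    """
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that connects the tree to a new specimen.
        dist, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        in_tree.add(v)
        edges.append((u, v, dist))
    return edges


# Hypothetical 7-locus profiles (allele numbers are made up for illustration).
profiles = {
    "S1": (1, 1, 1, 1, 1, 1, 1),
    "S2": (1, 1, 1, 1, 1, 1, 2),
    "S3": (1, 1, 1, 1, 1, 2, 2),
    "S4": (3, 3, 3, 3, 1, 1, 1),
}
tree = minimum_spanning_tree(profiles)
total_weight = sum(d for _, _, d in tree)
```

In this toy example, specimens differing at a single locus are linked directly, while the more distant profile attaches through its closest neighbour. A higher-resolution scheme would simply supply different profiles to the same routine.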
Defining objective clusters for rabies virus sequences using affinity propagation clustering
Fischer, Susanne; Freuling, Conrad M.; Pfaff, Florian; Bodenhofer, Ulrich; Höper, Dirk; Fischer, Mareike; Marston, Denise A.; Fooks, Anthony R.; Mettenleiter, Thomas C.; Conraths, Franz J.; Homeier-Bachmann, Timo
2018-01-01
Rabies is caused by lyssaviruses, and is one of the oldest known zoonoses. In recent years, more than 21,000 nucleotide sequences of rabies viruses (RABV), from the prototype species rabies lyssavirus, have been deposited in public databases. Subsequent phylogenetic analyses in combination with metadata suggest geographic distributions of RABV. However, these analyses face technical difficulties in defining verifiable criteria for cluster allocations in phylogenetic trees, which invites a more rational approach. Therefore, we applied a relatively new mathematical clustering algorithm named ‘affinity propagation clustering’ (AP) to propose a standardized sub-species classification utilizing full-genome RABV sequences. Because AP has the advantage that it is computationally fast and works for any meaningful measure of similarity between data samples, it has previously been applied successfully in bioinformatics for the analysis of microarray and gene expression data. However, cluster analysis of sequences is still in its infancy. Existing (516) and original (46) full genome RABV sequences were used to demonstrate the application of AP for RABV clustering. On a global scale, AP proposed four clusters, i.e. New World cluster, Arctic/Arctic-like, Cosmopolitan, and Asian, as previously assigned by phylogenetic studies. By combining AP with established phylogenetic analyses, it is possible to resolve phylogenetic relationships between verifiably determined clusters and sequences. This workflow will be useful in confirming cluster distributions in a uniform transparent manner, not only for RABV, but also for other comparative sequence analyses. PMID:29357361
Eckert, Andrew J.; van Heerwaarden, Joost; Wegrzyn, Jill L.; Nelson, C. Dana; Ross-Ibarra, Jeffrey; González-Martínez, Santíago C.; Neale, David. B.
2010-01-01
Natural populations of forest trees exhibit striking phenotypic adaptations to diverse environmental gradients, thereby making them appealing subjects for the study of genes underlying ecologically relevant phenotypes. Here, we use a genome-wide data set of single nucleotide polymorphisms genotyped across 3059 functional genes to study patterns of population structure and identify loci associated with aridity across the natural range of loblolly pine (Pinus taeda L.). Overall patterns of population structure, as inferred using principal components and Bayesian cluster analyses, were consistent with three genetic clusters likely resulting from expansions out of Pleistocene refugia located in Mexico and Florida. A novel application of association analysis, which removes the confounding effects of shared ancestry on correlations between genetic and environmental variation, identified five loci correlated with aridity. These loci were primarily involved with abiotic stress response to temperature and drought. A unique set of 24 loci was identified as FST outliers on the basis of the genetic clusters identified previously and after accounting for expansions out of Pleistocene refugia. These loci were involved with a diversity of physiological processes. Identification of nonoverlapping sets of loci highlights the fundamental differences implicit in the use of either method and suggests a pluralistic, yet complementary, approach to the identification of genes underlying ecologically relevant phenotypes. PMID:20439779
Rolling epidemic of Legionnaires' disease outbreaks in small geographic areas.
MacIntyre, C Raina; Dyda, Amalie; Bui, Chau Minh; Chughtai, Abrar Ahmad
2018-03-21
Legionnaires' disease (LD) is reported from many parts of the world, mostly linked to drinking water sources or cooling towers. We reviewed two unusual rolling outbreaks in Sydney and New York, each clustered in time and space. Data on these outbreaks were collected from public sources and compared to previous outbreaks in Australia and the US. While recurrent outbreaks of LD over time linked to an identified single source have been described, multiple unrelated outbreaks clustered in time and geography have not been previously described. We describe unusual geographic and temporal clustering of Legionella outbreaks in two cities, each of which experienced multiple different outbreaks within a small geographic area and within a short timeframe. The explanation for this temporal and spatial clustering of LD outbreaks in two cities is not clear, but climate variation and deteriorating water sanitation are two possible explanations. There is a need to critically analyse LD outbreaks and better understand changing trends to effectively prevent disease.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersson, Gunther G., E-mail: gunther.andersson@flinders.edu.au, E-mail: vladimir.golovko@canterbury.ac.nz, E-mail: greg.metha@adelaide.edu.au; Al Qahtani, Hassan S.; Golovko, Vladimir B.
Chemically made, atomically precise phosphine-stabilized clusters Au9(PPh3)8(NO3)3 were deposited on titania and silica from solutions at various concentrations and the samples heated under vacuum to remove the ligands. Metastable induced electron spectroscopy was used to determine the density of states at the surface, and X-ray photoelectron spectroscopy for analysing the composition of the surface. It was found for the Au9 cluster deposited on titania that the ligands react with the titania substrate. Based on analysis using the singular value decomposition algorithm, the series of MIE spectra can be described as a linear combination of 3 base spectra that are assigned to the spectra of the substrate, the phosphine ligands on the substrate, and the Au clusters anchored to titania after removal of the ligands. On silica, the Au clusters show significant agglomeration after heat treatment and no interaction of the ligands with the substrate can be identified.
Anaplasma phagocytophilum in sheep and goats in central and southeastern China.
Yang, Jifei; Liu, Zhijie; Niu, Qingli; Liu, Junlong; Han, Rong; Guan, Guiquan; Li, Youquan; Liu, Guangyuan; Luo, Jianxun; Yin, Hong
2016-11-21
Anaplasma phagocytophilum is widespread throughout the world and impacts both human and animal health. Several distinct ecological clusters and ecotypes of the agent have been established on the basis of various genetic loci. However, information on the genetic variability of A. phagocytophilum isolates in China represents a gap in knowledge. The objective of this study was to determine the prevalence and genetic characterization of A. phagocytophilum in small ruminants in central and southeastern China. The presence of A. phagocytophilum was determined in 421 blood samples collected from small ruminants by PCR. Positive samples were genetically characterized based on 16S rRNA and groEL genes. Statistical analyses were conducted to identify ecotypes of A. phagocytophilum strains, to assess their host range and zoonotic potential. Out of 421 sampled small ruminants, 106 (25.2%) were positive for A. phagocytophilum. The positive rate was higher in sheep (35.1%, 40/114) than in goats (21.5%, 66/307) (P < 0.05). Sequence analyses revealed that the isolates identified in this study were placed in two separate clades, indicating that two 16S rRNA variants of A. phagocytophilum were circulating in small ruminants in China. However, analysis using the groEL sequences obtained in this study formed one cluster, which was separate from other known ecotypes reported in Europe. In addition, a novel Anaplasma sp. was identified and closely related to an isolate previously reported in Hyalomma asiaticum, which clustered independently from all recognized Anaplasma species. A molecular survey of A. phagocytophilum was conducted in sheep and goats from ten provinces in central and southeastern China. Two 16S rRNA variants and a new ecotype of A. phagocytophilum were identified in small ruminants in China. Moreover, a potential novel Anaplasma species was reported in goats. Our findings provide additional information on the complexity of A. phagocytophilum in terms of genetic diversity in China.
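The prevalence comparison can be checked from the reported counts alone. The sketch below recomputes the positive rates and a Pearson chi-square statistic for the sheep versus goat comparison. The abstract does not say which test the authors used, so the choice of an uncorrected 2x2 chi-square test is an assumption made for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts as reported in the abstract.
sheep_pos, sheep_total = 40, 114
goat_pos, goat_total = 66, 307

sheep_rate = 100 * sheep_pos / sheep_total
goat_rate = 100 * goat_pos / goat_total
overall_rate = 100 * (sheep_pos + goat_pos) / (sheep_total + goat_total)

chi2 = chi_square_2x2(
    sheep_pos, sheep_total - sheep_pos,
    goat_pos, goat_total - goat_pos,
)
# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
significant = chi2 > 3.841

print(f"sheep {sheep_rate:.1f}%, goats {goat_rate:.1f}%, overall {overall_rate:.1f}%")
print(f"chi-square = {chi2:.2f}, significant at 0.05: {significant}")
```

Under this assumed test, the statistic exceeds the 5% critical value, which is consistent with the reported P < 0.05.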
Haynos, Ann F.; Pearson, Carolyn M.; Utzinger, Linsey M.; Wonderlich, Stephen A.; Crosby, Ross D.; Mitchell, James E.; Crow, Scott J.; Peterson, Carol B.
2016-01-01
Objective: Evidence suggests that eating disorder subtypes reflecting under-controlled, over-controlled, and low psychopathology personality traits constitute reliable phenotypes that differentiate treatment response. This study is the first to use statistical analyses to identify these subtypes within treatment-seeking individuals with bulimia nervosa (BN) and to use these statistically derived clusters to predict clinical outcomes. Methods: Using variables from the Dimensional Assessment of Personality Pathology–Basic Questionnaire, K-means cluster analyses identified under-controlled, over-controlled, and low psychopathology subtypes within BN patients (n = 80) enrolled in a treatment trial. Generalized linear models examined the impact of personality subtypes on Eating Disorder Examination global score, binge eating frequency, and purging frequency cross-sectionally at baseline and longitudinally at end of treatment (EOT) and follow-up. In the longitudinal models, secondary analyses were conducted to examine personality subtype as a potential moderator of response to Cognitive Behavioral Therapy-Enhanced (CBT-E) or Integrative Cognitive-Affective Therapy for BN (ICAT-BN). Results: There were no baseline clinical differences between groups. In the longitudinal models, personality subtype predicted binge eating (p = .03) and purging (p = .01) frequency at EOT and binge eating frequency at follow-up (p = .045). The over-controlled group demonstrated the best outcomes on these variables. In secondary analyses, there was a treatment by subtype interaction for purging at follow-up (p = .04), which indicated a superiority of CBT-E over ICAT-BN for reducing purging among the over-controlled group. Discussion: Empirically derived personality subtyping appears to be a valid classification system with potential to guide eating disorder treatment decisions. PMID:27611235
Changing the paradigm: messages for hand hygiene education and audit from cluster analysis.
Gould, D J; Navaie, D; Purssell, E; Drey, N S; Creedon, S
2018-04-01
Hand hygiene is considered to be the foremost infection prevention measure. How healthcare workers accept and make sense of the hand hygiene message is likely to contribute to the success and sustainability of initiatives to improve performance, which is often poor. A survey of nurses in critical care units in three National Health Service trusts in England was undertaken to explore opinions about hand hygiene, use of alcohol hand rubs, audit with performance feedback, and other key hand-hygiene-related issues. Data were analysed descriptively and subjected to cluster analysis. Three main clusters of opinion were visualized, each forming a significant group: positive attitudes, pragmatism and scepticism. A smaller cluster suggested possible guilt about ability to perform hand hygiene. Cluster analysis identified previously unsuspected constellations of beliefs about hand hygiene that offer a plausible explanation for behaviour. Healthcare workers might respond to education and audit differently according to these beliefs. Those holding predominantly positive opinions might comply with hand hygiene policy and perform well as infection prevention link nurses and champions. Those holding pragmatic attitudes are likely to respond favourably to the need for professional behaviour and need to protect themselves from infection. Greater persuasion may be needed to encourage those who are sceptical about the importance of hand hygiene to comply with guidelines. Interventions to increase compliance should be sufficiently broad in scope to tackle different beliefs. Alternatively, cluster analysis of hand hygiene beliefs could be used to identify the most effective educational and monitoring strategies for a particular clinical setting. Copyright © 2017 The Healthcare Infection Society. Published by Elsevier Ltd. All rights reserved.
Huang, Hung-Jin; Mandelbaum, Rachel; Freeman, Peter E.; ...
2017-11-23
We study the orientations of satellite galaxies in redMaPPer clusters constructed from the Sloan Digital Sky Survey at 0.1 < z < 0.35 to determine whether there is any preferential tendency for satellites to point radially towards cluster centres. Here, we analyse the satellite alignment (SA) signal based on three shape measurement methods (re-Gaussianization, de Vaucouleurs, and isophotal shapes), which trace galaxy light profiles at different radii. The measured SA signal depends on these shape measurement methods. We detect the strongest SA signal in isophotal shapes, followed by de Vaucouleurs shapes. While no net SA signal is detected using re-Gaussianization shapes across the entire sample, the observed SA signal reaches a statistically significant level when limiting to a subsample of higher luminosity satellites. We further investigate the impact of noise, systematics, and real physical isophotal twisting effects in the comparison between the SA signal detected via different shape measurement methods. Unlike previous studies, which only consider the dependence of SA on a few parameters, here we explore a total of 17 galaxy and cluster properties, using a statistical model averaging technique to naturally account for parameter correlations and identify significant SA predictors. We find that the measured SA signal is strongest for satellites with the following characteristics: higher luminosity, smaller distance to the cluster centre, rounder in shape, higher bulge fraction, and distributed preferentially along the major axis directions of their centrals. Finally, we provide physical explanations for the identified dependences and discuss the connection to theories of SA.
Typologies of Social Support and Associations with Mental Health Outcomes Among LGBT Youth
Birkett, Michelle A.; Mustanski, Brian
2015-01-01
Abstract Purpose: Lesbian, gay, bisexual, and transgender (LGBT) youth show increased risk for a number of negative mental health outcomes, which research has linked to minority stressors such as victimization. Further, social support promotes positive mental health outcomes for LGBT youth, and different sources of social support show differential relationships with mental health outcomes. However, little is known about how combinations of different sources of support impact mental health. Methods: In the present study, we identify clusters of family, peer, and significant other social support and then examine demographic and mental health differences by cluster in an analytic sample of 232 LGBT youth between the ages of 16 and 20 years. Results: Using k-means cluster analysis, three social support cluster types were identified: high support (44.0% of participants), low support (21.6%), and non-family support (34.5%). A series of chi-square tests were used to examine demographic differences between these clusters, which were found for socio-economic status (SES). Regression analyses indicated that, while controlling for victimization, individuals within the three clusters showed different relationships with multiple mental health outcomes: loneliness, hopelessness, depression, anxiety, somatization, general symptom severity, and symptoms of major depressive disorder (MDD). Conclusion: Findings suggest the combinations of sources of support LGBT youth receive are related to their mental health. Higher SES youth are more likely to receive support from family, peers, and significant others. For most mental health outcomes, family support appears to be an especially relevant and important source of support to target for LGBT youth. PMID:26790019
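To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) applied to three-dimensional support scores. The data points, the score scale, and the initial centroids are hypothetical and chosen only to mimic high, low, and non-family support profiles; the study's actual measures and software are not described in the abstract.

```python
def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical (family, peer, significant other) support scores.
points = [
    (6.5, 6.0, 6.2), (6.0, 6.4, 6.6),   # high support from all sources
    (2.0, 2.2, 1.8), (1.5, 2.5, 2.0),   # low support from all sources
    (2.0, 6.0, 6.3), (2.4, 6.5, 5.9),   # low family, high peer/partner support
]
initial = [points[0], points[2], points[4]]  # hand-picked starting centroids
labels, centroids = kmeans(points, initial)
```

In practice the number of clusters and the starting centroids strongly affect the solution, so analyses of this kind typically compare several values of k and multiple random starts.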
Bignell, Dawn R D; Seipke, Ryan F; Huguet-Tapia, José C; Chambers, Alan H; Parry, Ronald J; Loria, Rosemary
2010-02-01
Plant-pathogenic Streptomyces spp. cause scab disease on economically important root and tuber crops, the most important of which is potato. Key virulence determinants produced by these species include the cellulose synthesis inhibitor, thaxtomin A, and the secreted Nec1 protein that is required for colonization of the plant host. Recently, the genome sequence of Streptomyces scabies 87-22 was completed, and a biosynthetic cluster was identified that is predicted to synthesize a novel compound similar to coronafacic acid (CFA), a component of the virulence-associated coronatine phytotoxin produced by the plant-pathogenic bacterium Pseudomonas syringae. Southern analysis indicated that the cfa-like cluster in S. scabies 87-22 is likely conserved in other strains of S. scabies but is absent from two other pathogenic streptomycetes, S. turgidiscabies and S. acidiscabies. Transcriptional analyses demonstrated that the cluster is expressed during plant-microbe interactions and that expression requires a transcriptional regulator embedded in the cluster as well as the bldA tRNA. A knockout strain of the biosynthetic cluster displayed a reduced virulence phenotype on tobacco seedlings compared with the wild-type strain. Thus, the cfa-like biosynthetic cluster is a newly discovered locus in S. scabies that contributes to host-pathogen interactions.
Geotemporal Analysis of Neisseria meningitidis Clones in the United States: 2000–2005
Wiringa, Ann E.; Shutt, Kathleen A.; Marsh, Jane W.; Cohn, Amanda C.; Messonnier, Nancy E.; Zansky, Shelley M.; Petit, Susan; Farley, Monica M.; Gershman, Ken; Lynfield, Ruth; Reingold, Arthur; Schaffner, William; Thompson, Jamie; Brown, Shawn T.; Lee, Bruce Y.; Harrison, Lee H.
2013-01-01
Background: The detection of meningococcal outbreaks relies on serogrouping and epidemiologic definitions. Advances in molecular epidemiology have improved the ability to distinguish unique Neisseria meningitidis strains, enabling the classification of isolates into clones. Around 98% of meningococcal cases in the United States are believed to be sporadic. Methods: Meningococcal isolates from 9 Active Bacterial Core surveillance sites throughout the United States from 2000 through 2005 were classified according to serogroup, multilocus sequence typing, and outer membrane protein (porA, porB, and fetA) genotyping. Clones were defined as isolates that were indistinguishable according to this characterization. Case data were aggregated to the census tract level and all non-singleton clones were assessed for non-random spatial and temporal clustering using retrospective space-time analyses with a discrete Poisson probability model. Results: Among 1,062 geocoded cases with available isolates, 438 unique clones were identified, 78 of which had ≥2 isolates. 702 cases were attributable to non-singleton clones, accounting for 66.0% of all geocoded cases. 32 statistically significant clusters comprised of 107 cases (10.1% of all geocoded cases) were identified. Clusters had the following attributes: included 2 to 11 cases; 1 day to 33 months duration; radius of 0 to 61.7 km; and attack rate of 0.7 to 57.8 cases per 100,000 population. Serogroups represented among the clusters were: B (n = 12 clusters, 45 cases), C (n = 11 clusters, 27 cases), and Y (n = 9 clusters, 35 cases); 20 clusters (62.5%) were caused by serogroups represented in meningococcal vaccines that are commercially available in the United States. Conclusions: Around 10% of meningococcal disease cases in the U.S. could be assigned to a geotemporal cluster. Molecular characterization of isolates, combined with geotemporal analysis, is a useful tool for understanding the spread of virulent meningococcal clones and patterns of transmission in populations. PMID:24349182
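The abstract does not give the formula behind its space-time analyses. A commonly used scan statistic for a discrete Poisson model scores each candidate space-time window with a log-likelihood ratio, and the sketch below shows that score under the assumption that this standard form applies here. The function name and the example numbers are hypothetical.

```python
import math


def poisson_llr(c, e, total):
    """Log-likelihood ratio for a candidate cluster under a discrete Poisson model.

    c: observed cases inside the candidate window
    e: expected cases inside the window under the null hypothesis
    total: total observed cases in the study region and period
    """
    if c <= e:
        return 0.0  # only windows with an excess of cases are scored
    inside = c * math.log(c / e)
    outside = 0.0
    if total > c:
        outside = (total - c) * math.log((total - c) / (total - e))
    return inside + outside


# Hypothetical window: 5 observed cases where 0.5 were expected, out of 100 cases.
score = poisson_llr(5, 0.5, 100)
print(f"LLR = {score:.2f}")
```

The window with the largest score is the most likely cluster. Its statistical significance is usually judged by comparing that score with scores obtained from many randomly simulated data sets rather than from a closed-form distribution.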
Genomic analyses of bacterial porin-cytochrome gene clusters
Shi, Liang; Fredrickson, James K.; Zachara, John M.
2014-11-26
The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.
Nezametdinova, V Z; Mavletova, D A; Alekseeva, M G; Chekalina, M S; Zakharevich, N V; Danilenko, V N
2018-06-01
The objective of this study was to identify phosphorylated substrates of the species-specific serine-threonine protein kinase (STPK) Pkb2 from Bifidobacterium longum subsp. longum GT15. Two approaches were employed: analyses of phosphorylated membrane vesicle protein spectra following kinase reactions and analyses of the genes surrounding pkb2. A bioinformatics analysis of the genes surrounding pkb2 found a species-specific gene cluster PFNA in the genomes of 34 different bifidobacterial species. The identified cluster consisted of 5-8 genes depending on the species. The first five genes are characteristic for all considered species. These are the genes encoding serine-threonine protein kinase (pkb2), fibronectin type III domain-containing protein (fn3), AAA-ATPase (aaa-atp), hypothetical protein with DUF58 domain (duf58) and transglutaminase (tgm). The sixth (protein phosphatase, prpC), seventh (hypothetical protein, BLGT_RS02790), and eighth (FHA domain-containing protein, fha) genes are included in this cluster, but they are not found in all species. The operon organization of the PFNA gene cluster was confirmed with transcriptional analysis. AAA-ATPase, which is encoded by a gene of the PFNA gene cluster, was found to be a substrate of the STPK Pkb2. Fourteen AAA-ATPase sites (seven serine, six threonine, and one tyrosine) phosphorylated by STPK Pkb2 were revealed. Analysis of the spectra of phosphorylated membrane vesicle proteins allowed us to identify eleven proteins that were considered as possible Pkb2 substrates. They belong to several functional classes: proteins involved in transcription and translation; proteins of the F1-domain of the FoF1-ATPase; ABC-transporters; molecular chaperone GroEL; and glutamine synthetase GlnA1. All identified proteins were considered moonlighting proteins. Three out of 11 proteins (glutamine synthetase GlnA1 and FoF1-ATPase alpha and beta subunits) were selected for further in vitro phosphorylation assays and were shown to be phosphorylated by Pkb2. Four phosphorylated substrates of the species-specific STPK Pkb2 from B. longum subsp. longum GT15 were identified for the first time. They included the moonlighting protein glutamine synthetase GlnA, FoF1-ATPase alpha and beta subunits, and the chaperone MoxR family of AAA-ATPase. The ability of bifidobacterial STPK to phosphorylate the substrate on serine, threonine, and tyrosine residues was shown for the first time. Copyright © 2018 Elsevier Ltd. All rights reserved.
Organic Food Market Segmentation in Lebanon
NASA Astrophysics Data System (ADS)
Tleis, Malak; Roma, Rocco; Callieris, Roberta
2015-04-01
Organic farming in Lebanon is not a new concept. It started with the efforts of the private sector more than a decade ago and is still present even with the limited agricultural production. The local market is quite developed in comparison to neighboring countries, depending mainly on imports. Few studies have addressed organic consumption in Lebanon, and none of them dealt with the analysis of organic consumers. Therefore, our objectives were to identify the profiles of Lebanese organic and non-organic consumers and to propose appropriate marketing strategies for each consumer segment, with the final aim of developing the Lebanese organic market. A survey, based on a closed-ended questionnaire, was administered to 400 consumers in the capital, Beirut, from the end of February to the end of March 2014. Data underwent descriptive analyses, principal component analyses (PCA) and cluster analyses (k-means method) using the statistical software SPSS. Four clusters were obtained based on psychographic characteristics and willingness to pay (WTP) for the principal organic products purchased. The "Localists" and "Health conscious" clusters constituted the largest proportion of the selected sample and were thus the most critical to address with specific marketing strategies emphasizing the combination of local and organic food and the healthy properties of organic products. The "Rational" and "Irregular" clusters were relatively small groups, addressed by pricing and promotional strategies. This study showed a positive attitude among Lebanese consumers towards organic food, with egoistic motives prevailing over altruistic motives. High prices of organic commodities and low trust in organic farming remain constraints on raising organic consumption. The combined efforts of the public and private sectors are required to spread knowledge about the positive environmental payback of organic agriculture and to promote locally produced organic goods.
High ozone levels in the northeast of Portugal: Analysis and characterization
NASA Astrophysics Data System (ADS)
Carvalho, A.; Monteiro, A.; Ribeiro, I.; Tchepel, O.; Miranda, A. I.; Borrego, C.; Saavedra, S.; Souto, J. A.; Casares, J. J.
2010-03-01
Each summer, extremely high ozone levels are registered at the rural background station of Lamas d'Olo, located in the Northeast of Portugal. On average, 30% of the alert threshold exceedances registered in Portugal are detected at this site. The main purpose of this study is to characterize the atmospheric conditions that lead to the ozone-rich episodes at this site. Synoptic pattern anomaly and back-trajectory cluster analyses were performed for the period between 2004 and 2007, considering 76 days when ozone maximum hourly concentrations were above 200 μg m⁻³. The obtained atmospheric anomaly fields showed a positive temperature anomaly above the Iberian Peninsula. A strong wind flow pattern from NE is observable in the North of Portugal and Galicia, in Spain. These two features may lead to an enhancement of the photochemical production and to the transport of pollutants from Spain to Portugal. In addition, the 3D mean back trajectories associated with the ozone episode days were analysed. A clustering method was applied to the obtained back trajectories. Four main clusters of ozone-rich episodes were identified, with different frequencies of occurrence: north-westerly flows (11%); north-easterly flows (45%), southern flow (4%) and westerly flows (40%). Both analyses highlight the NE flow as a dominant pattern over the North of Portugal during summer. The analysis of the ozone concentrations for each selected cluster indicates that this northeast circulation pattern, together with the southern flow, is responsible for the highest ozone peak episodes. This also suggests that long-range transport of atmospheric pollutants is the main contributor to the ozone levels registered at Lamas d'Olo. This is also highlighted by the correlation of the ozone time series with the meteorological parameters analysed in the frequency domain.
ANALYSIS AND CHARACTERIZATION OF OZONE-RICH EPISODES IN NORTHEAST PORTUGAL
NASA Astrophysics Data System (ADS)
Carvalho, A.; Monteiro, A.; Ribeiro, I.; Tchepel, O.; Miranda, A.; Borrego, C.; Saavedra, S.; Souto, J. A.; Casares, J. J.
2009-12-01
Each summer, extremely high ozone levels are registered at the rural background station of Lamas d’Olo, located in the Northeast of Portugal. On average, 30% of the alert threshold exceedances registered in Portugal are detected at this site. The main purpose of this study is to characterize the atmospheric conditions that lead to the ozone-rich episodes. Synoptic pattern anomaly and back-trajectory cluster analyses were performed for 76 days on which ozone maximum concentrations were above 200 µg m⁻³. This analysis was performed for the period between 2004 and 2007. The obtained anomaly fields showed a positive temperature anomaly above the Iberian Peninsula. In addition, a strong wind flow pattern from NE is visible in the North of Portugal and Galicia, in Spain. These two features may lead to an enhancement of the photochemical production and to the transport of pollutants from Spain to Portugal. In addition, the 3D mean back trajectories associated with the ozone episode days were analysed. A clustering method was applied to the obtained back trajectories. Four main clusters of ozone-rich episodes were identified, with different frequencies of occurrence: north-westerly flows (11%); north-easterly flows (45%), southern flow (4%) and westerly flows (40%). Both analyses highlight the NE flow as a dominant pattern over the North of Portugal. The analysis of the ozone concentrations for each selected cluster indicates that this northeast circulation pattern, together with the southern flow, is responsible for the highest ozone peak episodes. This also suggests that long-range transport of atmospheric pollutants may be the main contributor to the ozone levels registered at Lamas d’Olo. This is also highlighted by the correlation of the ozone time series with the meteorological parameters analysed in the frequency domain.
A Locus Encoding Variable Defense Systems against Invading DNA Identified in Streptococcus suis
Okura, Masatoshi; Nozawa, Takashi; Watanabe, Takayasu; Murase, Kazunori; Nakagawa, Ichiro; Takamatsu, Daisuke; Osaki, Makoto; Sekizaki, Tsutomu; Gottschalk, Marcelo; Hamada, Shigeyuki
2017-01-01
Streptococcus suis, an important zoonotic pathogen, is known to have an open pan-genome and to develop a competent state. In S. suis, limited genetic lineages are suggested to be associated with zoonosis. However, little is known about the evolution of diversified lineages and their respective phenotypic or ecological characteristics. In this study, we performed comparative genome analyses of S. suis, with a focus on the competence genes, mobile genetic elements, and genetic elements related to various defense systems against exogenous DNAs (defense elements) that are associated with gene gain/loss/exchange mediated by horizontal DNA movements and their restrictions. Our genome analyses revealed a conserved competence-inducing peptide type (pherotype) of the competence system and large-scale genome rearrangements in certain clusters based on the genome phylogeny of 58 S. suis strains. Moreover, the profiles of the defense elements were similar or identical to each other among the strains belonging to the same genomic clusters. Our findings suggest that these genetic characteristics of each cluster might exert specific effects on the phenotypic or ecological differences between the clusters. We also found certain loci that shift several types of defense elements in S. suis. Of note, one of these loci is a previously unrecognized variable region in bacteria, at which strains of distinct clusters code for different and various defense elements. This locus might represent a novel defense mechanism that has evolved through an arms race between bacteria and invading DNAs, mediated by mobile genetic elements and genetic competence. PMID:28379509
Park, Soo-Yun; Lim, Sun-Hyung; Ha, Sun-Hwa; Yeo, Yunsoo; Park, Woo Tae; Kwon, Do Yeon; Park, Sang Un; Kim, Jae Kwang
2013-07-17
In the present study, carotenoids, anthocyanins, and phenolic acids of cauliflowers (Brassica oleracea L. ssp. botrytis) with various colored florets (white, yellow, green, and purple) were characterized to determine their phytochemical diversity. Additionally, 48 metabolites comprising amino acids, organic acids, sugars, and sugar alcohols were identified using gas chromatography-time-of-flight mass spectrometry (GC-TOFMS). Carotenoid content was considerably higher in green cauliflower; anthocyanins were detected only in purple cauliflower. Phenolic acids were higher in both green and purple cauliflower. Results of partial least-squares discriminant, Pearson correlation, and hierarchical clustering analyses showed that green cauliflower is distinct on the basis of the high levels of amino acids and clusters derived from common or closely related biochemical pathways. These results suggest that GC-TOFMS-based metabolite profiling, combined with chemometrics, is a useful tool for determining phenotypic variation and identifying metabolic networks connecting primary and secondary metabolism.
Santangelo, G M; Tornow, J
1997-12-01
As part of an effort to identify random carbon-source-regulated promoters in the Saccharomyces cerevisiae genome, we discovered that a mitochondrial DNA fragment is capable of directing glucose-repressible expression of a reporter gene. This fragment (CR24) originated from the mitochondrial genome adjacent to a transcription initiation site. Mutational analyses identified a GC cluster within the fragment that is required for transcriptional induction. Repression of nuclear CR24-driven transcription required Reg1p, indicating that this mitochondrially derived promoter is a member of a large group of glucose-repressible nuclear promoters that are similarly regulated by Reg1p. In vivo and in vitro binding assays indicated the presence of factors, located within the nucleus and the mitochondria, that bind to the GC cluster. One or more of these factors may provide a regulatory link between the nucleus and mitochondria.
Keitel, Anne; Gross, Joachim
2016-06-01
The human brain can be parcellated into diverse anatomical areas. We investigated whether rhythmic brain activity in these areas is characteristic and can be used for automatic classification. To this end, resting-state MEG data of 22 healthy adults was analysed. Power spectra of 1-s long data segments for atlas-defined brain areas were clustered into spectral profiles ("fingerprints"), using k-means and Gaussian mixture (GM) modelling. We demonstrate that individual areas can be identified from these spectral profiles with high accuracy. Our results suggest that each brain area engages in different spectral modes that are characteristic for individual areas. Clustering of brain areas according to similarity of spectral profiles reveals well-known brain networks. Furthermore, we demonstrate task-specific modulations of auditory spectral profiles during auditory processing. These findings have important implications for the classification of regional spectral activity and allow for novel approaches in neuroimaging and neurostimulation in health and disease.
Hyde, Jonathan M; DaCosta, Gérald; Hatzoglou, Constantinos; Weekes, Hannah; Radiguet, Bertrand; Styman, Paul D; Vurpillot, Francois; Pareige, Cristelle; Etienne, Auriane; Bonny, Giovanni; Castin, Nicolas; Malerba, Lorenzo; Pareige, Philippe
2017-04-01
Irradiation of reactor pressure vessel (RPV) steels causes the formation of nanoscale microstructural features (termed radiation damage), which affect the mechanical properties of the vessel. A key tool for characterizing these nanoscale features is atom probe tomography (APT), due to its high spatial resolution and the ability to identify different chemical species in three dimensions. Microstructural observations using APT can underpin development of a mechanistic understanding of defect formation. However, with atom probe analyses there are currently multiple methods for analyzing the data. This can result in inconsistencies between results obtained from different researchers and unnecessary scatter when combining data from multiple sources. This makes interpretation of results more complex and calibration of radiation damage models challenging. In this work simulations of a range of different microstructures are used to directly compare different cluster analysis algorithms and identify their strengths and weaknesses.
Shamseldin, Abdelaal; Carro, Lorena; Peix, Alvaro; Velázquez, Encarna; Moawad, Hassan; Sadowsky, Michael J
2016-06-01
In the present work we analyzed the taxonomic status of several Rhizobium strains isolated from Trifolium alexandrinum L. nodules in Egypt. The 16S rRNA genes of these strains were identical to those of Rhizobium bangladeshense BLR175(T) and Rhizobium binae BLR195(T). However, the analyses of recA and atpD genes split the strains into two clusters. Cluster II strains are identified as R. bangladeshense with >98% similarity values in both genes. The cluster I strains are phylogenetically related to Rhizobium etli CFN42(T) and R. bangladeshense BLR175(T), but with less than 94% similarity values in recA and atpD genes. DNA-DNA hybridization analysis showed 42% and 48% average relatedness between the strain 1010(T) from cluster I with respect to R. bangladeshense BLR175(T) and R. etli CFN42(T), respectively. Phenotypic characteristics of cluster I strains also differed from those of their closest related Rhizobium species. Analysis of the nodC gene showed that the strains belong to two groups within the symbiovar trifolii which was identified in Egypt linked to the species R. bangladeshense. Based on the genotypic and phenotypic characteristics, the group I strains belong to a new species for which the name Rhizobium aegyptiacum sp. nov. (sv. trifolii) is proposed, with strain 1010(T) being designated as the type strain (= USDA 7124(T)=LMG 29296(T)=CECT 9098(T)). Copyright © 2016 Elsevier GmbH. All rights reserved.
Deschamps, Kevin; Matricali, Giovanni Arnoldo; Roosen, Philip; Desloovere, Kaat; Bruyninckx, Herman; Spaepen, Pieter; Nobels, Frank; Tits, Jos; Flour, Mieke; Staes, Filip
2013-01-01
Background: The aim of this study was to identify groups of subjects with similar patterns of forefoot loading and verify if specific groups of patients with diabetes could be isolated from non-diabetics. Methodology/Principal Findings: Ninety-seven patients with diabetes and 33 control participants between 45 and 70 years were prospectively recruited in two Belgian Diabetic Foot Clinics. Barefoot plantar pressure measurements were recorded and subsequently analysed using a semi-automatic total mapping technique. K-means cluster analysis was applied to the relative regional impulses of six forefoot segments in order to pursue a classification for the control group separately, the diabetic group separately and both groups together. Cluster analysis led to identification of three distinct groups when considering only the control group. For the diabetic group, and the computation considering both groups together, four distinct groups were isolated. Compared to the cluster analysis of the control group an additional forefoot loading pattern was identified. This group comprised diabetic feet only. The relevance of the reported clusters was supported by ANOVA statistics indicating significant differences between different regions of interest and different clusters. Conclusions/Significance: A new era seems to be emerging in diabetic foot medicine, one which embraces the classification of diabetic patients according to their biomechanical profile. Classification of the plantar pressure distribution has the potential to provide a means to determine mechanical interventions for the prevention and/or treatment of the diabetic foot. PMID:24278219
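As an illustration of the kind of K-means grouping described in the abstract above, the following is a minimal sketch of Lloyd's algorithm using only the Python standard library. It is not the authors' implementation: the function name `kmeans`, the seed, and the six-segment impulse values in `feet` are hypothetical and chosen only so the example runs.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct observations.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical relative regional impulses for six forefoot segments
# (each row sums to 1); these numbers are invented for illustration.
feet = [
    [0.30, 0.25, 0.20, 0.10, 0.10, 0.05],
    [0.28, 0.27, 0.20, 0.10, 0.10, 0.05],
    [0.32, 0.24, 0.19, 0.10, 0.10, 0.05],
    [0.05, 0.10, 0.10, 0.20, 0.25, 0.30],
    [0.05, 0.10, 0.10, 0.20, 0.27, 0.28],
    [0.05, 0.10, 0.10, 0.19, 0.24, 0.32],
]
centroids, labels = kmeans(feet, k=2)
print(labels)
```

On data this well separated the first three rows and the last three rows end up in different clusters; which cluster receives index 0 depends on the random initialisation. Real plantar pressure data would also need a principled choice of k, which the sketch does not address.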
Longitudinal patterns of gambling activities and associated risk factors in college students
Goudriaan, Anna E.; Slutske, Wendy S.; Krull, Jennifer L.; Sher, Kenneth J.
2009-01-01
Aims: To investigate which clusters of gambling activities exist within a longitudinal study of college health, how membership in gambling clusters changes over time and whether particular clusters of gambling are associated with unhealthy risk behaviour. Design: Four-year longitudinal study (2002–2006). Setting: Large, public university. Participants: Undergraduate college students. Measurements: Ten common gambling activities were measured during 4 consecutive college years (years 1–4). Clusters of gambling activities were examined using latent class analyses. Relations between gambling clusters and gender, Greek membership, alcohol use, drug use, personality indicators of behavioural undercontrol and psychological distress were examined. Findings: Four latent gambling classes were identified: (1) a low-gambling class, (2) a card gambling class, (3) a casino/slots gambling class and (4) an extensive gambling class. Over the first college years a high probability of transitioning from the low-gambling class and the card gambling class into the casino/slots gambling class was present. Membership in the card, casino/slots and extensive gambling classes was associated with higher scores on alcohol/drug use, novelty seeking and self-identified gambling problems compared to the low-gambling class. The extensive gambling class scored higher than the other gambling classes on risk factors. Conclusions: Extensive gamblers and card gamblers are at higher risk for problem gambling and other risky health behaviours. Prospective examinations of class membership suggested that being in the extensive and the low gambling classes was highly stable across the 4 years of college. PMID:19438422
[Genome-wide identification and analysis of WRKY transcription factors in Medicago truncatula].
Song, Hui; Nan, Zhibiao
2014-02-01
The WRKY gene family plays important roles in plants through its involvement in transcriptional regulation during various physiological processes such as development, metabolism and responses to biotic and abiotic stresses. WRKY genes have been identified in various plants. However, only a few WRKY genes in Medicago truncatula have been identified through systematic analysis and comparison. In this study, we identified 93 WRKY genes through analyses of the M. truncatula genome. These genes include 19 type I genes, 49 type II genes, 13 type III genes, and 12 non-regular type genes. All of these genes were characterized through analyses of gene duplication, chromosomal locations, structural diversity, conserved protein motifs and phylogenetic relations. The results showed that 11 gene duplication events occurred in the WRKY gene family, involving 24 genes. WRKY genes, which include 6 gene clusters, are unevenly distributed across chromosomes 1 to 6, and WRKY group III genes are under purifying selection pressure.
A scoping review of spatial cluster analysis techniques for point-event data.
Fritz, Charles E; Schuurman, Nadine; Robertson, Colin; Lear, Scott
2013-05-01
Spatial cluster analysis is a uniquely interdisciplinary endeavour, and so it is important to communicate and disseminate ideas, innovations, best practices and challenges across practitioners, applied epidemiology researchers and spatial statisticians. In this research we conducted a scoping review to systematically search peer-reviewed journal databases for research that has employed spatial cluster analysis methods on individual-level, address location, or x and y coordinate derived data. To illustrate the thematic issues raised by our results, methods were tested using a dataset where known clusters existed. Point pattern methods, spatial clustering and cluster detection tests, and a locally weighted spatial regression model were most commonly used for individual-level, address location data (n = 29). The spatial scan statistic was the most popular method for address location data (n = 19). Six themes were identified relating to the application of spatial cluster analysis methods and subsequent analyses, which we recommend researchers to consider; exploratory analysis, visualization, spatial resolution, aetiology, scale and spatial weights. It is our intention that researchers seeking direction for using spatial cluster analysis methods, consider the caveats and strengths of each approach, but also explore the numerous other methods available for this type of analysis. Applied spatial epidemiology researchers and practitioners should give special consideration to applying multiple tests to a dataset. Future research should focus on developing frameworks for selecting appropriate methods and the corresponding spatial weighting schemes.
Regional heatwaves in China: a cluster analysis
NASA Astrophysics Data System (ADS)
Wang, Pinya; Tang, Jianping; Wang, Shuyu; Dong, Xinning; Fang, Juan
2018-03-01
With the consideration of spatial extension of heatwave events, two kinds of regional heatwaves defined using absolute and relative thresholds, namely RHWs-A and RHWs-R, are investigated for 1959-2013. The temperature data are derived from the daily maximum temperatures (DMTs) of 587 stations in China. In total, 298 RHWs-A and 374 RHWs-R are identified during the past 55 years, and both have become more frequent since the mid-1980s. Using cluster analysis, several typical spatial distributions of RHWs-A/RHWs-R are obtained. For RHWs-A, there are three clusters covering southeastern China, northwestern China and the lower reaches of the Yangtze River, of which the southeastern cluster groups the most heatwaves. For RHWs-R, there are seven clusters distributed throughout China. The clusters in northwestern and northeastern China are more stable than the others for both RHWs-A and RHWs-R, and the northern clusters are of larger intensity than the southern ones. All RHWs-A/RHWs-R are accompanied by anomalous high systems along with reduced soil moisture. The southern clusters are controlled by the Northwestern Pacific subtropical high (WPSH), and the northern ones are influenced by the mid-latitude high systems. The influences of atmospheric circulations and soil moisture on regional heatwaves are further demonstrated by two case analyses of the severe RHW-A in 2003 and the RHW-R in 2013.
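The abstract above does not give the threshold definitions, so the following is only a hedged sketch of how an absolute and a relative threshold can flag different hot spells in one station's daily maximum temperature series. The 35 °C absolute threshold, the 70th-percentile relative threshold, the three-day minimum duration, and the temperature values are all assumptions made for illustration, not values taken from the study.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def hot_spells(dmt, threshold, min_days=3):
    """Return (start, end) index pairs of runs with dmt >= threshold
    that last at least min_days consecutive days."""
    spells, start = [], None
    # A trailing -inf sentinel closes any run that reaches the last day.
    for i, t in enumerate(list(dmt) + [float("-inf")]):
        if t >= threshold and start is None:
            start = i
        elif t < threshold and start is not None:
            if i - start >= min_days:
                spells.append((start, i - 1))
            start = None
    return spells


# Hypothetical daily maximum temperatures (deg C) at a relatively cool station.
cool_station = [24, 29, 30, 29, 25, 23, 26, 27, 26, 22]

absolute = hot_spells(cool_station, threshold=35.0)  # assumed fixed threshold
relative = hot_spells(cool_station, threshold=percentile(cool_station, 70))
print(absolute, relative)
```

In this toy series the fixed threshold finds no event, while the station-specific percentile flags days 1 to 3. That is one plausible reason relative-threshold events can appear in regions where absolute-threshold events are rare. A regional definition would additionally require spatial contiguity across stations, which is outside this sketch.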
Spectroscopic Confirmation of Five Galaxy Clusters at z > 1.25 in the 2500 deg^2 SPT-SZ Survey
NASA Astrophysics Data System (ADS)
Khullar, Gourav; Bleem, Lindsey; Bayliss, Matthew; Gladders, Michael; South Pole Telescope (SPT) Collaboration
2018-06-01
We present spectroscopic confirmation of 5 galaxy clusters at 1.25 < z < 1.5, discovered in the 2500 deg^2 South Pole Telescope Sunyaev-Zel’dovich (SPT-SZ) survey. These clusters, taken from a nearly redshift-independent mass-limited sample of clusters, have multi-wavelength follow-up imaging data from the X-ray to the near-IR, and currently form the most homogeneous massive high-redshift cluster sample in existence. We briefly describe the analysis pipeline used on the low S/N spectra of these faint galaxies, and describe the multiple techniques used to extract robust redshifts from a combination of absorption-line (Ca II H&K doublet - λλ3934,3968Å) and emission-line ([OII] λλ3727,3729Å) spectral features. We present several ensemble analyses of cluster member galaxies that demonstrate the reliability of the measured redshifts. We also identify modest [OII] emission and pronounced CN and Hδ absorption in a composite stacked spectrum of 28 low S/N passive galaxy spectra with redshifts derived primarily from Ca II H&K features. This work increases the number of spectroscopically-confirmed SPT-SZ galaxy clusters at z > 1.25 from 2 to 7, further demonstrating the efficacy of SZ selection for the highest redshift massive clusters, and enabling further detailed study of these confirmed systems.
de Hoop, Esther; van der Tweel, Ingeborg; van der Graaf, Rieke; Moons, Karel G M; van Delden, Johannes J M; Reitsma, Johannes B; Koffijberg, Hendrik
2015-10-30
Various papers have addressed pros and cons of the stepped wedge cluster randomized trial design (SWD). However, some issues have not been addressed, or only to a limited extent. Our aim was to provide a comprehensive overview of all merits and limitations of the SWD to assist researchers, reviewers and medical ethics committees when deciding on the appropriateness of the SWD for a particular study. We performed an initial search to identify articles with a methodological focus on the SWD, and categorized and discussed all reported advantages and disadvantages of the SWD. Additional aspects were identified during multidisciplinary meetings in which ethicists, biostatisticians, clinical epidemiologists and health economists participated. All aspects of the SWD were compared to the parallel group cluster randomized design. We categorized the merits and limitations of the SWD to distinct phases in the design and conduct of such studies, highlighting that their impact may vary depending on the context of the study or that benefits may be offset by drawbacks across study phases. Furthermore, a real-life illustration is provided. New aspects are identified within all disciplines. Examples of newly identified aspects of an SWD are: the possibility to measure a treatment effect in each cluster to examine the (in)consistency in effects across clusters, the detrimental effect of lower than expected inclusion rates, deviation from the ordinary informed consent process and the question whether studies using the SWD are likely to have sufficient social value. Discussions are provided on e.g. clinical equipoise, social value, health economical decision making, number of study arms, and interim analyses. Deciding on the use of the SWD involves aspects and considerations from different disciplines, not all of which have been discussed before.
Pros and cons of this design should be balanced in comparison to other feasible design options so as to choose the optimal design for a particular intervention study.
Zeh, Clement; Inzaule, Seth C; Ondoa, Pascale; Nafisa, Lillian G; Kasembeli, Alex; Otieno, Fredrick; Vandenhoudt, Hilde; Amornkul, Pauli N; Mills, Lisa A; Nkengasong, John N
2016-01-01
To identify unique characteristics of recent versus established HIV infections and describe sexual transmission networks, we characterized circulating HIV-1 strains from two randomly selected populations of ART-naïve participants in rural western Kenya. Recent HIV infections were identified by the HIV-1 subtype B, E and D, immunoglobulin G capture immunoassay (IgG BED-CEIA) and BioRad avidity assays. Genotypic and phylogenetic analyses were performed on the pol gene to identify transmitted drug resistance (TDR) mutations and to characterize HIV subtypes and potential transmission clusters. Factors associated with recent infection and clustering were assessed by logistic regression. Of the 320 specimens, 40 (12.5%) were concordantly identified by the two assays as recent infections. Factors independently associated with being recently infected were age ≤19 years (P = 0.001) and history of sexually transmitted infections (STIs) in the past six months (P = 0.004). HIV subtype distribution differed in recently versus chronically infected participants, with subtype A observed among 53% recent vs. 68% chronic infections (P = 0.04) and subtype D among 26% recent vs. 12% chronic infections (P = 0.012). Overall, the prevalence of primary drug resistance was 1.16%. Of the 258 sequences, 11.2% were in monophyletic clusters of 2-4 individuals. In multivariate analysis, factors associated with clustering included having recent HIV infection (P = 0.043) and being from Gem region (P = 0.002). Recent HIV-1 infection was more frequent among 13-19 year olds compared with older age groups, underscoring the ongoing risk and susceptibility of younger persons for acquiring HIV infection. Our findings also provide evidence of sexual networks. The association of recent infections with clustering suggests that early infections may be contributing significant proportions of onward transmission, highlighting the need for early diagnosis and treatment as prevention.
Larger studies are needed to better understand the structure of these networks and subsequently implement and evaluate targeted interventions.
Mapping patient safety: a large-scale literature review using bibliometric visualisation techniques.
Rodrigues, S P; van Eck, N J; Waltman, L; Jansen, F W
2014-03-13
The amount of scientific literature available is often overwhelming, making it difficult for researchers to have a good overview of the literature and to see relations between different developments. Visualisation techniques based on bibliometric data are helpful in obtaining an overview of the literature on complex research topics, and have been applied here to the topic of patient safety (PS). On the basis of title words and citation relations, publications in the period 2000-2010 related to PS were identified in the Scopus bibliographic database. A visualisation of the most frequently cited PS publications was produced based on direct and indirect citation relations between publications. Terms were extracted from titles and abstracts of the publications, and a visualisation of the most important terms was created. The main PS-related topics studied in the literature were identified using a technique for clustering publications and terms. A total of 8480 publications were identified, of which the 1462 most frequently cited ones were included in the visualisation. The publications were clustered into 19 clusters, which were grouped into three categories: (1) magnitude of PS problems (42% of all included publications); (2) PS risk factors (31%) and (3) implementation of solutions (19%). In the visualisation of PS-related terms, five clusters were identified: (1) medication; (2) measuring harm; (3) PS culture; (4) physician; (5) training, education and communication. Both analysis at publication and term level indicate an increasing focus on risk factors. A bibliometric visualisation approach makes it possible to analyse large amounts of literature. This approach is very useful for improving one's understanding of a complex research topic such as PS and for suggesting new research directions or alternative research priorities. For PS research, the approach suggests that more research on implementing PS improvement initiatives might be needed.
Do Media Use and Physical Activity Compete in Adolescents? Results of the MoMo Study
Spengler, Sarah; Mess, Filip; Woll, Alexander
2015-01-01
Purpose: The displacement hypothesis predicts that physical activity and media use compete in adolescents; however, findings are inconsistent. A more differentiated approach to determining the co-occurrence of physical activity and media use behaviors within subjects may be warranted. The aim of this study was to determine the co-occurrence of physical activity and media use by identifying clusters of adolescents with specific behavior patterns including physical activity in various settings (school, sports club, leisure time) and different types of media use (watching TV, playing console games, using PC / Internet). Methods: Cross-sectional data of 2,083 adolescents (11–17 years) from all over Germany were collected between 2009 and 2012 in the Motorik-Modul Study. Physical activity and media use were self-reported. Cluster analyses (Ward’s method and K-means analysis) were used to identify behavior patterns of boys and girls separately. Results: Eight clusters were identified for boys and seven for girls. The clusters demonstrated that a high proportion of boys (33%) as well as girls (42%) show low engagement in both physical activity and media use, irrespective of setting or type of media. Other adolescents are engaged in both behaviors, but either physical activity (35% of boys, 27% of girls) or media use (31% of boys and girls) predominates. These adolescents belong to different clusters, and in most clusters either one specific setting of physical activity or a specific combination of different types of media predominates. Conclusion: The results of this study support to some extent the hypothesis that media use and physical activity compete: Very high media use occurred with low physical activity behavior, but very high activity levels co-occurred with considerable amounts of time using any media.
There was no evidence that the type of media used was related to physical activity levels, nor was the setting of physical activity related to the amount of media use in any pattern. PMID:26629688
Gupta, Mayetri; Cheung, Ching-Lung; Hsu, Yi-Hsiang; Demissie, Serkalem; Cupples, L Adrienne; Kiel, Douglas P; Karasik, David
2011-06-01
Genome-wide association studies (GWAS) using high-density genotyping platforms offer an unbiased strategy to identify new candidate genes for osteoporosis. It is imperative to be able to clearly distinguish signal from noise by focusing on the best phenotype in a genetic study. We performed GWAS of multiple phenotypes associated with fractures [bone mineral density (BMD), bone quantitative ultrasound (QUS), bone geometry, and muscle mass] with approximately 433,000 single-nucleotide polymorphisms (SNPs) and created a database of resulting associations. We performed analysis of GWAS data from 23 phenotypes by a novel modification of a block clustering algorithm followed by gene-set enrichment analysis. A data matrix of standardized regression coefficients was partitioned along both axes--SNPs and phenotypes. Each partition represents a distinct cluster of SNPs that have similar effects over a particular set of phenotypes. Application of this method to our data shows several SNP-phenotype connections. We found a strong cluster of association coefficients of high magnitude for 10 traits (BMD at several skeletal sites, ultrasound measures, cross-sectional bone area, and section modulus of femoral neck and shaft). These clustered traits were highly genetically correlated. Gene-set enrichment analyses indicated the augmentation of genes that cluster with the 10 osteoporosis-related traits in pathways such as aldosterone signaling in epithelial cells, role of osteoblasts, osteoclasts, and chondrocytes in rheumatoid arthritis, and Parkinson signaling. In addition to several known candidate genes, we also identified PRKCH and SCNN1B as potential candidate genes for multiple bone traits. In conclusion, our mining of GWAS results revealed the similarity of association results between bone strength phenotypes that may be attributed to pleiotropic effects of genes. 
This knowledge may prove helpful in identifying novel genes and pathways that underlie several correlated phenotypes, as well as in deciphering genetic and phenotypic modularity underlying osteoporosis risk. Copyright © 2011 American Society for Bone and Mineral Research.
Biomarker clusters are differentially associated with longitudinal cognitive decline in late midlife
Racine, Annie M.; Koscik, Rebecca L.; Berman, Sara E.; Nicholas, Christopher R.; Clark, Lindsay R.; Okonkwo, Ozioma C.; Rowley, Howard A.; Asthana, Sanjay; Bendlin, Barbara B.; Blennow, Kaj; Zetterberg, Henrik; Gleason, Carey E.; Carlsson, Cynthia M.
2016-01-01
The ability to detect preclinical Alzheimer’s disease is of great importance, as this stage of the Alzheimer’s continuum is believed to provide a key window for intervention and prevention. As Alzheimer’s disease is characterized by multiple pathological changes, a biomarker panel reflecting co-occurring pathology will likely be most useful for early detection. Towards this end, 175 late middle-aged participants (mean age 55.9 ± 5.7 years at first cognitive assessment, 70% female) were recruited from two longitudinally followed cohorts to undergo magnetic resonance imaging and lumbar puncture. Cluster analysis was used to group individuals based on biomarkers of amyloid pathology (cerebrospinal fluid amyloid-β42/amyloid-β40 assay levels), magnetic resonance imaging-derived measures of neurodegeneration/atrophy (cerebrospinal fluid-to-brain volume ratio, and hippocampal volume), neurofibrillary tangles (cerebrospinal fluid phosphorylated tau181 assay levels), and a brain-based marker of vascular risk (total white matter hyperintensity lesion volume). Four biomarker clusters emerged consistent with preclinical features of (i) Alzheimer’s disease; (ii) mixed Alzheimer’s disease and vascular aetiology; (iii) suspected non-Alzheimer’s disease aetiology; and (iv) healthy ageing. Cognitive decline was then analysed between clusters using longitudinal assessments of episodic memory, semantic memory, executive function, and global cognitive function with linear mixed effects modelling. Cluster 1 exhibited a higher intercept and greater rates of decline on tests of episodic memory. Cluster 2 had a lower intercept on a test of semantic memory and both Cluster 2 and Cluster 3 had steeper rates of decline on a test of global cognition. Additional analyses on Cluster 3, which had the smallest hippocampal volume, suggest that its biomarker profile is more likely due to hippocampal vulnerability and not to detectable specific volume loss exceeding the rate of normal ageing. 
Our results demonstrate that pathology, as indicated by biomarkers, in a preclinical timeframe is related to patterns of longitudinal cognitive decline. Such biomarker patterns may be useful for identifying at-risk populations to recruit for clinical trials. PMID:27324877
Racine, Annie M; Koscik, Rebecca L; Berman, Sara E; Nicholas, Christopher R; Clark, Lindsay R; Okonkwo, Ozioma C; Rowley, Howard A; Asthana, Sanjay; Bendlin, Barbara B; Blennow, Kaj; Zetterberg, Henrik; Gleason, Carey E; Carlsson, Cynthia M; Johnson, Sterling C
2016-08-01
The ability to detect preclinical Alzheimer's disease is of great importance, as this stage of the Alzheimer's continuum is believed to provide a key window for intervention and prevention. As Alzheimer's disease is characterized by multiple pathological changes, a biomarker panel reflecting co-occurring pathology will likely be most useful for early detection. Towards this end, 175 late middle-aged participants (mean age 55.9 ± 5.7 years at first cognitive assessment, 70% female) were recruited from two longitudinally followed cohorts to undergo magnetic resonance imaging and lumbar puncture. Cluster analysis was used to group individuals based on biomarkers of amyloid pathology (cerebrospinal fluid amyloid-β42/amyloid-β40 assay levels), magnetic resonance imaging-derived measures of neurodegeneration/atrophy (cerebrospinal fluid-to-brain volume ratio, and hippocampal volume), neurofibrillary tangles (cerebrospinal fluid phosphorylated tau181 assay levels), and a brain-based marker of vascular risk (total white matter hyperintensity lesion volume). Four biomarker clusters emerged consistent with preclinical features of (i) Alzheimer's disease; (ii) mixed Alzheimer's disease and vascular aetiology; (iii) suspected non-Alzheimer's disease aetiology; and (iv) healthy ageing. Cognitive decline was then analysed between clusters using longitudinal assessments of episodic memory, semantic memory, executive function, and global cognitive function with linear mixed effects modelling. Cluster 1 exhibited a higher intercept and greater rates of decline on tests of episodic memory. Cluster 2 had a lower intercept on a test of semantic memory and both Cluster 2 and Cluster 3 had steeper rates of decline on a test of global cognition. Additional analyses on Cluster 3, which had the smallest hippocampal volume, suggest that its biomarker profile is more likely due to hippocampal vulnerability and not to detectable specific volume loss exceeding the rate of normal ageing. 
Our results demonstrate that pathology, as indicated by biomarkers, in a preclinical timeframe is related to patterns of longitudinal cognitive decline. Such biomarker patterns may be useful for identifying at-risk populations to recruit for clinical trials. © The Author (2016). Published by Oxford University Press on behalf of the Guarantors of Brain.
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.
2003-01-01
Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise when clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust of the four methods assessed for analysing noisy expression profile data. PMID:18629292
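The Adjusted Rand Index named above as an agreement measure can be sketched in a few lines. This is a minimal, standard-library illustration of the usual pair-counting formula, not the authors' implementation; the function name and the toy label vectors are assumptions made for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy example: label names differ, but the partition is identical.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy example: the second labeling cuts across the first.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1 means the two partitions agree exactly, values near 0 mean agreement no better than chance, and negative values mean agreement worse than chance.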
Klingenberg, Jennifer M; McFarland, Kevin L; Friedman, Aaron J; Boyce, Steven T; Aronow, Bruce J; Supp, Dorothy M
2010-02-01
Bioengineered skin substitutes can facilitate wound closure in severely burned patients, but deficiencies limit their outcomes compared with native skin autografts. To identify gene programs associated with their in vivo capabilities and limitations, we extended previous gene expression profile analyses to now compare engineered skin after in vivo grafting with both in vitro maturation and normal human skin. Cultured skin substitutes were grafted on full-thickness wounds in athymic mice, and biopsy samples for microarray analyses were collected at multiple in vitro and in vivo time points. Over 10,000 transcripts exhibited large-scale expression pattern differences during in vitro and in vivo maturation. Using hierarchical clustering, 11 different expression profile clusters were partitioned on the basis of differential sample type and temporal stage-specific activation or repression. Analyses show that the wound environment exerts a massive influence on gene expression in skin substitutes. For example, in vivo-healed skin substitutes gained the expression of many native skin-expressed genes, including those associated with epidermal barrier and multiple categories of cell-cell and cell-basement membrane adhesion. In contrast, immunological, trichogenic, and endothelial gene programs were largely lacking. These analyses suggest important areas for guiding further improvement of engineered skin for both increased homology with native skin and enhanced wound healing.
Comprehensive Genomic Characterization of Upper Tract Urothelial Carcinoma.
Moss, Tyler J; Qi, Yuan; Xi, Liu; Peng, Bo; Kim, Tae-Beom; Ezzedine, Nader E; Mosqueda, Maribel E; Guo, Charles C; Czerniak, Bogdan A; Ittmann, Michael; Wheeler, David A; Lerner, Seth P; Matin, Surena F
2017-10-01
Upper urinary tract urothelial cancer (UTUC) may have unique etiologic and genomic factors compared to bladder cancer. To characterize the genomic landscape of UTUC and provide insights into its biology using comprehensive integrated genomic analyses. We collected 31 untreated snap-frozen UTUC samples from two institutions and carried out whole-exome sequencing (WES) of DNA, RNA sequencing (RNAseq), and protein analysis. Adjusting for batch effects, consensus mutation calls from independent pipelines identified DNA mutations, gene expression clusters using unsupervised consensus hierarchical clustering (UCHC), and protein expression levels that were correlated with relevant clinical variables, The Cancer Genome Atlas, and other published data. WES identified mutations in FGFR3 (74.1%; 92% low-grade, 60% high-grade), KMT2D (44.4%), PIK3CA (25.9%), and TP53 (22.2%). APOBEC and CpG were the most common mutational signatures. UCHC of RNAseq data segregated samples into four molecular subtypes with the following characteristics. Cluster 1: no PIK3CA mutations, nonsmokers, high-grade
Wang, Jong-Yi; Liang, Yia-Wen; Yeh, Chun-Chen; Liu, Chiu-Shong; Wang, Chen-Yu
2018-02-21
Spousal clustering of cancer warrants attention. Whether the common environment or high-age vulnerability determines cancer clustering is unclear. The risk of clustering in couples versus non-couples is undetermined. The time to cancer clustering after the first cancer diagnosis is yet to be reported. This study investigated cancer clustering over time among couples by using nationwide data. A cohort of 5643 married couples in the 2002-2013 Taiwan National Health Insurance Research Database was identified and randomly matched with 5643 non-couple pairs through dual propensity score matching. Factors associated with clustering (both spouses with tumours) were analysed by using the Cox proportional hazard model. Propensity-matched analysis revealed that the risk of clustering of all tumours among couples (13.70%) was significantly higher than that among non-couples (11.84%) (OR=1.182, 95% CI 1.058 to 1.321, P=0.0031). The median time to clustering of all tumours and of malignant tumours was 2.92 and 2.32 years, respectively. Risk characteristics associated with clustering included high age and comorbidity. Shared environmental factors among spouses might be linked to a high incidence of cancer clustering. Cancer incidence in one spouse may signal cancer vulnerability in the other spouse. Promoting family-oriented cancer care in vulnerable families and preventing shared lifestyle risk factors for cancer are suggested.
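The reported odds ratio can be approximately reproduced from the two percentages. The sketch below assumes an ordinary unpaired 2x2 table with a Wald confidence interval, and it reconstructs the counts by rounding the reported percentages of 5643 pairs; both are assumptions for illustration, since the abstract does not state the counts or the exact method used.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b)/(c/d) with a Wald interval computed on the log scale.

    a, b: exposed group with and without the outcome
    c, d: comparison group with and without the outcome
    """
    odds_ratio = (a / b) / (c / d)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


pairs = 5643
# Assumed counts, back-calculated from the reported 13.70% and 11.84%.
couples_clustered = round(0.1370 * pairs)
non_couples_clustered = round(0.1184 * pairs)

or_, lo, hi = odds_ratio_wald_ci(
    couples_clustered, pairs - couples_clustered,
    non_couples_clustered, pairs - non_couples_clustered,
)
print(f"OR = {or_:.3f}, 95% CI {lo:.3f} to {hi:.3f}")
```

Under these assumptions the result agrees with the published OR and interval to three decimal places, which suggests, but does not confirm, that a calculation of this kind underlies the reported figures.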
Clinical interpretation of the Spinal Cord Injury Functional Index (SCI-FI).
Fyffe, Denise; Kalpakjian, Claire Z; Slavin, Mary; Kisala, Pamela; Ni, Pengsheng; Kirshblum, Steven C; Tulsky, David S; Jette, Alan M
2016-09-01
Objective: To provide validation of functional ability levels for the Spinal Cord Injury-Functional Index (SCI-FI). Design: Cross-sectional. Setting: Inpatient rehabilitation hospital and community settings. Participants: A sample of 855 individuals with traumatic spinal cord injury enrolled in 6 rehabilitation centers participating in the National Spinal Cord Injury Model Systems Network. Interventions: Not applicable. Main outcome measure: Spinal Cord Injury-Functional Index (SCI-FI). Results: Cluster analyses identified three distinct groups that represent low, mid-range and high SCI-FI functional ability levels. Comparison of clusters on personal and other injury characteristics suggested some significant differences between groups. Conclusions: These results strongly support the use of SCI-FI functional ability levels to document the perceived functional abilities of persons with SCI. Results of the cluster analysis suggest that the SCI-FI functional ability levels capture function by injury characteristics. Clinical implications regarding tracking functional activity trajectories during follow-up visits are discussed.
Liem, David Alexandre; Murali, Sanjana; Sigdel, Dibakar; Shi, Yu; Wang, Xuan; Shen, Jiaming; Choi, Howard; Caufield, J Harry; Wang, Wei; Ping, Peipei; Han, Jiawei
2018-05-18
Extracellular matrix (ECM) proteins have been shown to play important roles regulating multiple biological processes in an array of organ systems, including the cardiovascular system. By using a novel bioinformatics text-mining tool, we studied six categories of cardiovascular disease (CVD), namely ischemic heart disease (IHD), cardiomyopathies (CM), cerebrovascular accident (CVA), congenital heart disease (CHD), arrhythmias (ARR), and valve disease (VD), anticipating novel ECM protein-disease and protein-protein relationships hidden within vast quantities of textual data. We conducted a phrase-mining analysis, delineating the relationships of 709 ECM proteins with the six groups of CVDs reported in 1,099,254 abstracts. The technology pipeline known as Context-aware Semantic Online Analytical Processing (CaseOLAP) was applied to semantically rank the association of proteins to each and all six CVDs, performing analyses to quantify each protein-disease relationship. We performed principal component analysis and hierarchical clustering of the data, where each protein is represented as a six-dimensional vector. We found that ECM proteins display variable degrees of association with the six CVDs; certain CVDs share groups of associated proteins whereas others have divergent protein associations. We identified 82 ECM proteins sharing associations with all six CVDs. Our bioinformatics analysis ascribed distinct ECM pathways (via Reactome) from this subset of proteins, namely insulin-like growth factor regulation and interleukin-4 and interleukin-13 signaling, suggesting their contribution to the pathogenesis of all six CVDs. Finally, we performed hierarchical clustering analysis and identified protein clusters associated with a targeted CVD; analyses revealed unexpected insights underlying ECM-pathogenesis of CVDs.
Msaddak, Abdelhakim; Rejili, Mokhtar; Durán, David; Rey, Luis; Imperial, Juan; Palacios, Jose Manuel; Ruiz-Argüeso, Tomas; Mars, Mohamed
2017-06-01
The genetic diversity of bacterial populations nodulating Lupinus luteus (yellow lupine) in Northern Tunisia was examined. Phylogenetic analyses of 43 isolates based on recA and gyrB partial sequences grouped them in three clusters, two of which belong to genus Bradyrhizobium (41 isolates) and one, remarkably, to Microvirga (2 isolates), a genus never previously described as a microsymbiont of this lupine species. Representatives of the three clusters were analysed in depth by multilocus sequence analysis of five housekeeping genes (rrs, recA, glnII, gyrB and dnaK). Surprisingly, the Bradyrhizobium cluster with the two isolates LluI4 and LluTb2 may constitute a new species defined by a separate position between Bradyrhizobium manausense and B. denitrificans. A nodC-based phylogeny identified only two groups: one formed by Bradyrhizobium strains included in the symbiovar genistearum and the other by the Microvirga strains. Symbiotic behaviour of representative isolates was tested, and among the seven legumes inoculated only one difference was observed: the Bradyrhizobium strains nodulated Ornithopus compressus, unlike the two strains of Microvirga. On the basis of these data, we conclude that L. luteus root nodule symbionts in Northern Tunisia are mostly strains within the B. canariense/B. lupini lineages, and the remaining strains belong to two groups not previously identified as L. luteus endosymbionts: one corresponding to a new clade of Bradyrhizobium and the other to the genus Microvirga. © FEMS 2017.
2012-01-01
Background: Currently, food regulatory authorities consider all Listeria monocytogenes isolates as equally virulent. However, an increasing number of studies demonstrate extensive variations in virulence and pathogenicity of L. monocytogenes strains. Up to now, there is no comprehensive overview of the population genetic structure of L. monocytogenes taking into account virulence level. We have previously demonstrated that different low-virulence strains exhibit the same mutations in virulence genes, suggesting that they could have common evolutionary pathways. New low-virulence strains were identified and assigned to phenotypic and genotypic groups using cluster analysis. Pulsed-field gel electrophoresis, virulence gene sequencing and multi-locus sequence typing analyses were performed to study the genetic relatedness and the population structure between the studied low-virulence isolates and virulent strains. Results: These methods showed that low-virulence strains are widely distributed in the two major lineages, but some are also clustered according to their genetic mutations. These analyses showed that low-virulence strains grouped first according to their lineage and then according to their serotypes, and only afterwards lost their virulence, suggesting a relatively recent emergence. Conclusions: Loss of virulence in lineage II strains was related to point mutations in a few virulence genes (prfA, inlA, inlB, plcA). These strains thus form a tightly clustered, monophyletic group with limited diversity. In contrast, low-virulence strains of lineage I were more dispersed among the virulent strains, and the origin of their loss of virulence has not been identified yet, even if some strains exhibited different mutations in prfA or inlA. PMID:23267677
NASA Astrophysics Data System (ADS)
Poppe, Sam; Barette, Florian; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu
2016-04-01
The Virunga Volcanic Province (VVP) is situated within the western branch of the East-African Rift. The geochemistry and petrology of its volcanic products have been studied extensively but in a fragmented manner. They represent a unique collection of silica-undersaturated, ultra-alkaline and ultra-potassic compositions, displaying marked geochemical variations over the area occupied by the VVP. We present a novel spatially-explicit database of existing whole-rock geochemical analyses of the VVP volcanics, compiled from international publications, (post-)colonial scientific reports and PhD theses. In the database, a total of 703 geochemical analyses of whole-rock samples collected from the 1950s until recently have been characterised with a geographical location, eruption source location, analytical results and uncertainty estimates for each of these categories. Comparative box plots and Kruskal-Wallis H tests on subsets of analyses with contrasting ages or analytical methods suggest that the overall database accuracy is consistent. We demonstrate how statistical techniques such as Principal Component Analysis (PCA) and subsequent cluster analysis allow the identification of clusters of samples with similar major-element compositions. The spatial patterns represented by the contrasting clusters show that both historically active volcanoes represent compositional clusters that can be identified by their contrasting silica and alkali contents. Furthermore, two sample clusters are interpreted to represent the most primitive, deep magma source within the VVP, different from the shallow magma reservoirs that feed the eight dominant large volcanoes. The samples from these two clusters systematically originate from locations that (1) are distal to the eight large volcanoes and (2) mostly coincide with the surface expressions of rift faults or NE-SW-oriented inherited Precambrian structures which were reactivated during rifting.
The lava from the Mugogo eruption of 1957 belongs to these primitive clusters and is the only one known to have erupted outside the current rift valley in historical times. We thus infer there is a distributed hazard of vent opening susceptibility additional to the susceptibility associated with the main Virunga edifices. This study suggests that the statistical analysis of such a geochemical database may help to understand complex volcanic plumbing systems and the spatial distribution of volcanic hazards in active and poorly known volcanic areas such as the Virunga Volcanic Province.
Carpenter, Joanne S; Robillard, Rébecca; Lee, Rico S C; Hermens, Daniel F; Naismith, Sharon L; White, Django; Whitwell, Bradley; Scott, Elizabeth M; Hickie, Ian B
2015-01-01
Although early-stage affective disorders are associated with both cognitive dysfunction and sleep-wake disruptions, relationships between these factors have not been specifically examined in young adults. Sleep and circadian rhythm disturbances in those with affective disorders are considerably heterogeneous, and may not relate to cognitive dysfunction in a simple linear fashion. This study aimed to characterise profiles of sleep and circadian disturbance in young people with affective disorders and examine associations between these profiles and cognitive performance. Actigraphy monitoring was completed in 152 young people (16-30 years; 66% female) with primary diagnoses of affective disorders, and 69 healthy controls (18-30 years; 57% female). Patients also underwent detailed neuropsychological assessment. Actigraphy data were processed to estimate both sleep and circadian parameters. Overall neuropsychological performance in patients was poor on tasks relating to mental flexibility and visual memory. Two hierarchical cluster analyses identified three distinct patient groups based on sleep variables and three based on circadian variables. Sleep clusters included a 'long sleep' cluster, a 'disrupted sleep' cluster, and a 'delayed and disrupted sleep' cluster. Circadian clusters included a 'strong circadian' cluster, a 'weak circadian' cluster, and a 'delayed circadian' cluster. Medication use differed between clusters. The 'long sleep' cluster displayed significantly worse visual memory performance compared to the 'disrupted sleep' cluster. No other cognitive functions differed between clusters. These results highlight the heterogeneity of sleep and circadian profiles in young people with affective disorders, and provide preliminary evidence in support of a relationship between sleep and visual memory, which may be mediated by use of antipsychotic medication. 
These findings have implications for the personalisation of treatments and improvement of functioning in young adults early in the course of affective illness.
[Space-time suicide clustering in the community of Antequera (Spain)].
Pérez-Costillas, Lucía; Blasco-Fontecilla, Hilario; Benítez, Nicolás; Comino, Raquel; Antón, José Miguel; Ramos-Medina, Valentín; Lopez, Amalia; Palomo, José Luis; Madrigal, Lucía; Alcalde, Javier; Perea-Millá, Emilio; Artieda-Urrutia, Paula; de León-Martínez, Victoria; de Diego Otero, Yolanda
2015-01-01
Approximately 3,500 people commit suicide every year in Spain. The main aim of this study is to explore whether spatial and temporal clustering of suicide exists in the region of Antequera (Málaga, Spain). Sample and procedure: All suicides from January 1, 2004 to December 31, 2008 were identified using data from the Forensic Pathology Department of the Institute of Legal Medicine, Málaga (Spain). Geolocalisation: Google Earth was used to calculate the coordinates for each suicide decedent's address. Statistical analysis: A spatiotemporal permutation scan statistic and Ripley's K function were used to explore spatiotemporal clustering. Pearson's chi-squared test was used to determine whether there were differences between suicides inside and outside the spatiotemporal clusters. A total of 120 individuals committed suicide within the region of Antequera, of which 96 (80%) were included in our analyses. Statistically significant evidence for 7 spatiotemporal suicide clusters emerged within critical limits for the 0-2.5 km distance and for the first and second weeks (P<.05 in both cases) after suicide. Not a single subject among the suicides within clusters was diagnosed with a current psychotic disorder, whereas outside clusters 20% had this diagnosis (χ²=4.13; df=1; P<.05). There are spatiotemporal suicide clusters in the area surrounding Antequera. Patients diagnosed with current psychotic disorder are less likely to be influenced by the factors explaining suicide clustering. Copyright © 2013 SEP y SEPB. Published by Elsevier España.
A redshift survey of the strong-lensing cluster ABELL 383
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geller, Margaret J.; Hwang, Ho Seong; Kurtz, Michael J.
2014-03-01
Abell 383 is a famous rich cluster (z = 0.1887) imaged extensively as a basis for intensive strong- and weak-lensing studies. Nonetheless, there are few spectroscopic observations. We enable dynamical analyses by measuring 2360 new redshifts for galaxies with r_Petro ≤ 20.5 and within 50' of the Brightest Cluster Galaxy (BCG; R.A.₂₀₀₀ = 42.014125°, decl.₂₀₀₀ = –03.529228°). We apply the caustic technique to identify 275 cluster members within 7 h⁻¹ Mpc of the hierarchical cluster center. The BCG lies within –11 ± 110 km s⁻¹ and 21 ± 56 h⁻¹ kpc of the hierarchical cluster center; the velocity dispersion profile of the BCG appears to be an extension of the velocity dispersion profile based on cluster members. The distribution of cluster members on the sky corresponds impressively with the weak-lensing contours of Okabe et al., especially when the impact of foreground and background structure is included. The values of R₂₀₀ = 1.22 ± 0.01 h⁻¹ Mpc and M₂₀₀ = (5.07 ± 0.09) × 10¹⁴ h⁻¹ M☉ obtained by application of the caustic technique agree well with recent completely independent lensing measures. The caustic estimate extends direct measurement of the cluster mass profile to a radius of ∼5 h⁻¹ Mpc.
Tuttolomondo, Teresa; Dugo, Giacomo; Ruberto, Giuseppe; Leto, Claudio; Napoli, Edoardo M; Cicero, Nicola; Gervasi, Teresa; Virga, Giuseppe; Leone, Raffaele; Licata, Mario; La Bella, Salvatore
2015-01-01
This study reports the chemical characterisation of the essential oils of 10 Sicilian Rosmarinus officinalis L. biotypes. The main goal of this work was to analyse the relationship between essential oil yield and the geographical distribution of the plants. The essential oils were analysed by GC-FID and GC-MS. Hierarchical cluster analysis and principal component analysis were used to cluster biotypes according to the chemical composition of their essential oils. The essential oil yield ranged from 0.8 to 2.3 (v/w). In total, 82 compounds were identified, representing 96.7-99.9% of the essential oil. The most represented compounds in the essential oils were 1,8-cineole, linalool, α-terpineol, verbenone, α-pinene, limonene, bornyl acetate and terpinolene. The results show that the essential oil yield of the 10 biotypes is affected by the environmental characteristics of the sampling sites, while the chemical composition is linked to the genetic characteristics of the different biotypes.
ERIC Educational Resources Information Center
Spybrook, Jessaca; Hedges, Larry; Borenstein, Michael
2014-01-01
Research designs in which clusters are the unit of randomization are quite common in the social sciences. Given the multilevel nature of these studies, the power analyses for these studies are more complex than in a simple individually randomized trial. Tools are now available to help researchers conduct power analyses for cluster randomized…
X-ray aspects of the DAFT/FADA clusters
NASA Astrophysics Data System (ADS)
Guennou, L.; Durret, F.; Lima Neto, G. B.; Adami, C.
2012-12-01
We have undertaken the DAFT/FADA survey with the aim of applying constraints on dark energy based on weak lensing tomography as well as obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range [0.4,0.9] for which there are HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. We present preliminary results on the coupled X-ray and dynamical analyses of these clusters.
Detection of Functional Change Using Cluster Trend Analysis in Glaucoma.
Gardiner, Stuart K; Mansberger, Steven L; Demirel, Shaban
2017-05-01
Global analyses using mean deviation (MD) assess visual field progression, but can miss localized changes. Pointwise analyses are more sensitive to localized progression, but more variable so require confirmation. This study assessed whether cluster trend analysis, averaging information across subsets of locations, could improve progression detection. A total of 133 test-retest eyes were tested 7 to 10 times. Rates of change and P values were calculated for possible re-orderings of these series to generate global analysis ("MD worsening faster than x dB/y with P < y"), pointwise and cluster analyses ("n locations [or clusters] worsening faster than x dB/y with P < y") with specificity exactly 95%. These criteria were applied to 505 eyes tested over a mean of 10.5 years, to find how soon each detected "deterioration," and compared using survival models. This was repeated including two subsequent visual fields to determine whether "deterioration" was confirmed. The best global criterion detected deterioration in 25% of eyes in 5.0 years (95% confidence interval [CI], 4.7-5.3 years), compared with 4.8 years (95% CI, 4.2-5.1) for the best cluster analysis criterion, and 4.1 years (95% CI, 4.0-4.5) for the best pointwise criterion. However, for pointwise analysis, only 38% of these changes were confirmed, compared with 61% for clusters and 76% for MD. The time until 25% of eyes showed subsequently confirmed deterioration was 6.3 years (95% CI, 6.0-7.2) for global, 6.3 years (95% CI, 6.0-7.0) for pointwise, and 6.0 years (95% CI, 5.3-6.6) for cluster analyses. Although the specificity is still suboptimal, cluster trend analysis detects subsequently confirmed deterioration sooner than either global or pointwise analyses.
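The core idea of cluster trend analysis, averaging sensitivities over a subset of locations at each visit and then fitting a trend over time, can be sketched as follows. This is an illustration only: the cluster names, location indices and sensitivity values are hypothetical, and the sketch omits the P values and specificity-matched criteria that the study relies on.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def cluster_slopes(times, fields, clusters):
    """Rate of change (dB/y) of the mean sensitivity within each cluster.

    times:    visit times in years
    fields:   one list of per-location sensitivities (dB) per visit
    clusters: mapping of cluster name to the location indices it contains
    """
    slopes = {}
    for name, locations in clusters.items():
        cluster_means = [
            sum(visit[i] for i in locations) / len(locations) for visit in fields
        ]
        slopes[name] = ols_slope(times, cluster_means)
    return slopes


# Hypothetical eye with four test locations seen at four annual visits:
# locations 0 and 1 lose 1 dB per year, locations 2 and 3 are stable.
times = [0, 1, 2, 3]
fields = [
    [30, 30, 28, 28],
    [29, 29, 28, 28],
    [28, 28, 28, 28],
    [27, 27, 28, 28],
]
clusters = {"cluster_a": [0, 1], "cluster_b": [2, 3]}  # hypothetical grouping
slopes = cluster_slopes(times, fields, clusters)
```

In this toy case the mean across all four locations changes by only -0.5 dB/y, while the affected cluster changes by -1 dB/y, which is the kind of localized change a global index can dilute.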
Campbell, Matthew A; Takebayashi, Naoki; López, J Andrés
2015-07-19
Pleistocene climatic instability had profound and diverse effects on the distribution and abundance of Arctic organisms revealed by variation in phylogeographic patterns documented in extant Arctic populations. To better understand the effects of geography and paleoclimate on Beringian freshwater fishes, we examined genetic variability in the genus Dallia (blackfish: Esociformes: Esocidae). The genus Dallia groups between one and three nominal species of small, cold- and hypoxia-tolerant freshwater fishes restricted entirely in distribution to Beringia from the Yukon River basin near Fairbanks, Alaska westward including the Kuskokwim River basin and low-lying areas of Western Alaska to the Amguema River on the north side of the Chukotka Peninsula and Mechigmen Bay on the south side of the Chukotka Peninsula. The genus has a non-continuous distribution divided by the Bering Strait and the Brooks Range. We examined the distribution of genetic variation across this range to determine the number and location of potential sub-refugia within the greater Beringian refugium as well as the roles of the Bering land bridge, Brooks Range, and large rivers within Beringia in shaping the current distribution of populations of Dallia. Our analyses were based on DNA sequence data from two nuclear gene introns (S7 and RAG1) and two mitochondrial genome fragments from nineteen sampling locations. These data were examined under genetic clustering and coalescent frameworks to identify sub-refugia within the greater Beringia refugium and to infer the demographic history of different populations of Dallia. We identified up to five distinct genetic clusters of Dallia. 
Four of these genetic clusters are present in Alaska: (1) Arctic Coastal Plain genetic cluster found north of the Brooks Range, (2) interior Alaska genetic cluster placed in upstream locations in the Kuskokwim and Yukon river basins, (3) a genetic cluster found on the Seward Peninsula, and (4) a coastal Alaska genetic cluster encompassing downstream Kuskokwim River and Yukon River basin sample locations and samples from Southwest Alaska not in either of these drainages. The Chukotka samples are assigned to their own genetic cluster (5) similar to the coastal Alaska genetic cluster. The clustering and ordination analyses implemented in Discriminant Analysis of Principal Components (DAPC) and STRUCTURE showed mostly concordant groupings and a high degree of differentiation among groups. The groups of sampling locations identified as genetic clusters correspond to geographic areas divided by likely biogeographic barriers including the Brooks Range and the Bering Strait. Estimates of sequence diversity (θ) are highest in the Yukon River and Kuskokwim River drainages near the Bering Sea. We also infer asymmetric migration rates between genetic clusters. The isolation of Dallia on the Arctic Coastal Plain of Alaska is associated with very low estimated migration rates between the coastal Alaska genetic cluster and the Arctic Coastal Plain genetic cluster. Our results support a scenario with multiple aquatic sub-refugia in Beringia during the Pleistocene and the preservation of that structure in extant populations of Dallia. An inferred historical presence of Dallia across the Bering land bridge explains the similarities in the genetic composition of Dallia in West Beringia and western coastal Alaska. In contrast, historic and contemporary isolation across the Brooks Range shaped the distinctiveness of present day Arctic Coastal Plain Dallia. 
Overall this study uncovered a high degree of genetic structuring among populations of Dallia supporting the idea of multiple Beringian sub-refugia during the Pleistocene and which appears to be maintained to the present due to the strictly freshwater nature and low dispersal ability of this genus.
Haack, Frederike S.; Poehlein, Anja; Kröger, Cathrin; Voigt, Christian A.; Piepenbring, Meike; Bode, Helge B.; Daniel, Rolf; Schäfer, Wilhelm; Streit, Wolfgang R.
2016-01-01
Janthinobacterium and Duganella are well-known for their antifungal effects. Surprisingly, almost nothing is known about the molecular aspects involved in the close bacterium-fungus interaction. To better understand this interaction, we established the genomes of 11 Janthinobacterium and Duganella isolates in combination with phylogenetic and functional analyses of all publicly available genomes. Thereby, we identified a core and pan genome of 1058 and 23,628 genes, respectively. All strains encoded secondary metabolite gene clusters and chitinases, both possibly involved in fungal growth suppression. All but one strain carried a single gene cluster involved in the biosynthesis of alpha-hydroxyketone-like autoinducer molecules, designated JAI-1. Genome-wide RNA-seq studies employing the background of two isolates and the corresponding JAI-1 deficient strains identified a set of 45 QS-regulated genes in both isolates. Most regulated genes are characterized by a conserved sequence motif within the promoter region. Among the most strongly regulated genes were secondary metabolite and type VI secretion system gene clusters. Most intriguingly, co-incubation studies of J. sp. HH102 or its corresponding JAI-1 synthase deletion mutant with the plant pathogen Fusarium graminearum provided first evidence of a QS-dependent interaction with this pathogen. PMID:27833590
NASA Astrophysics Data System (ADS)
Howard, Emma; Meehan, Maria; Parnell, Andrew
2018-05-01
In Maths for Business, a mathematics module for non-mathematics specialists, students are given the choice of completing the module content via short online videos, live lectures or a combination of both. In this study, we identify students' specific usage patterns with both of these resources and discuss their reasons for the preferences they exhibit. In 2015-2016, we collected quantitative data on each student's resource usage (attendance at live lectures and access of online videos) for the entire class of 522 students and employed model-based clustering which identified four distinct resource usage patterns with lectures and/or videos. We also collected qualitative data on students' perceptions of resource usage through a survey administered at the end of the semester, to which 161 students responded. The 161 survey responses were linked to each cluster and analysed using thematic analysis. Perceived benefits of videos include flexibility of scheduling and pace, and avoidance of large, long lectures. In contrast, the main perceived advantages of lectures are the ability to engage in group tasks, to ask questions, and to learn 'gradually'. Students in the two clusters with high lecture attendance achieved, on average, higher marks in the module.
A Direct Comparison of Two Densely Sampled HIV Epidemics: The UK and Switzerland
NASA Astrophysics Data System (ADS)
Ragonnet-Cronin, Manon L.; Shilaih, Mohaned; Günthard, Huldrych F.; Hodcroft, Emma B.; Böni, Jürg; Fearnhill, Esther; Dunn, David; Yerly, Sabine; Klimkait, Thomas; Aubert, Vincent; Yang, Wan-Lin; Brown, Alison E.; Lycett, Samantha J.; Kouyos, Roger; Brown, Andrew J. Leigh
2016-09-01
Phylogenetic clustering approaches can elucidate HIV transmission dynamics. Comparisons across countries are essential for evaluating public health policies. Here, we used a standardised approach to compare the UK HIV Drug Resistance Database and the Swiss HIV Cohort Study while maintaining data-protection requirements. Clusters were identified in subtype A1, B and C pol phylogenies. We generated degree distributions for each risk group and compared distributions between countries using Kolmogorov-Smirnov (KS) tests, Degree Distribution Quantification and Comparison (DDQC) and bootstrapping. We used logistic regression to predict cluster membership based on country, sampling date, risk group, ethnicity and sex. We analysed >8,000 Swiss and >30,000 UK subtype B sequences. At 4.5% genetic distance, the UK was more clustered and MSM and heterosexual degree distributions differed significantly by the KS test. The KS test is sensitive to variation in network scale, and jackknifing the UK MSM dataset to the size of the Swiss dataset removed the difference. Only heterosexuals varied based on the DDQC, due to UK male heterosexuals who clustered exclusively with MSM. Their removal eliminated this difference. In conclusion, the UK and Swiss HIV epidemics have similar underlying dynamics and observed differences in clustering are mainly due to different population sizes.
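As a rough illustration of the KS comparison described in this abstract, the sketch below computes the two-sample KS statistic, that is, the largest gap between two empirical cumulative distributions, for two lists of node degrees. The function name and the degree values are hypothetical and are not taken from the study, whose own pipeline is not described at this level of detail.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    # The largest gap can only occur at one of the observed values.
    points = sorted(set(a) | set(b))
    largest_gap = 0.0
    for x in points:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical degrees (links per sequence) in two illustrative networks.
degrees_x = [1, 1, 2, 2, 3]
degrees_y = [1, 2, 3, 4, 5]
print(ks_statistic(degrees_x, degrees_y))
```

The statistic alone does not give significance. The associated p-value also depends on both sample sizes, so larger samples make smaller gaps significant, which may be one reason why downsampling the larger dataset can remove an apparent difference.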
Kim, Jin Hae; Bothe, Jameson R.; Alderson, T. Reid; Markley, John L.
2014-01-01
Proteins containing iron–sulfur (Fe–S) clusters arose early in evolution and are essential to life. Organisms have evolved machinery consisting of specialized proteins that operate together to assemble Fe–S clusters efficiently so as to minimize cellular exposure to their toxic constituents: iron and sulfide ions. To date, the best studied system is the iron sulfur cluster (isc) operon of Escherichia coli, and the eight ISC proteins it encodes. Our investigations over the past five years have identified two functional conformational states for the scaffold protein (IscU) and have shown that the other ISC proteins that interact with IscU prefer to bind one conformational state or the other. From analyses of the NMR spectroscopy-derived network of interactions of ISC proteins and small-angle X-ray scattering (SAXS), chemical crosslinking experiments, and functional assays, we have constructed working models for Fe–S cluster assembly and delivery. Future work is needed to validate and refine what has been learned about the E. coli system and to extend these findings to the homologous Fe–S cluster biosynthetic machinery of yeast and human mitochondria. This article is part of a Special Issue entitled: Fe/S proteins: Analysis, structure, function, biogenesis and diseases. PMID:25450980
Spatial Autocorrelation of Cancer Incidence in Saudi Arabia
Al-Ahmadi, Khalid; Al-Zahrani, Ali
2013-01-01
Little is known about the geographic distribution of common cancers in Saudi Arabia. We explored the spatial incidence patterns of common cancers in Saudi Arabia using spatial autocorrelation analyses, employing the global Moran’s I and Anselin’s local Moran’s I statistics to detect nonrandom incidence patterns. Global ordinary least squares (OLS) regression and local geographically-weighted regression (GWR) were applied to examine the spatial correlation of cancer incidences at the city level. Population-based records of cancers diagnosed between 1998 and 2004 were used. Male lung cancer and female breast cancer exhibited positive statistically significant global Moran’s I index values, indicating a tendency toward clustering. The Anselin’s local Moran’s I analyses revealed small significant clusters of lung cancer, prostate cancer and Hodgkin’s disease among males in the Eastern region and significant clusters of thyroid cancers in females in the Eastern and Riyadh regions. Additionally, both regression methods found significant associations among various cancers. For example, OLS and GWR revealed significant spatial associations among NHL, leukemia and Hodgkin’s disease (r² = 0.49–0.67 using OLS and r² = 0.52–0.68 using GWR) and between breast and prostate cancer (r² = 0.53 OLS and 0.57 GWR) in Saudi Arabian cities. These findings may help to generate etiologic hypotheses of cancer causation and identify spatial anomalies in cancer incidence in Saudi Arabia. Our findings should stimulate further research on the possible causes underlying these clusters and associations. PMID:24351742
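For readers unfamiliar with the global Moran's I statistic used in this abstract, the following minimal sketch computes it from a list of values and a spatial weights matrix. The binary chain-adjacency weights and the incidence values are assumptions made for illustration only; the abstract does not state how the study defined neighbours or weights.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of locations.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum


# Hypothetical example: four cities along a line, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))  # smooth gradient: positive value
print(morans_i([1, 4, 1, 4], chain))  # alternating pattern: negative value
```

Positive values indicate that similar values tend to sit next to each other (clustering), while negative values indicate a checkerboard-like pattern.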
Statistical analyses and characteristics of volcanic tremor on Stromboli Volcano (Italy)
NASA Astrophysics Data System (ADS)
Falsaperla, S.; Langer, H.; Spampinato, S.
A study of volcanic tremor on Stromboli is carried out on the basis of data recorded daily between 1993 and 1995 by a permanent seismic station (STR) located 1.8km away from the active craters. We also consider the signal of a second station (TF1), which operated for a shorter time span. Changes in the spectral tremor characteristics can be related to modifications in volcanic activity, particularly to lava effusions and explosive sequences. Statistical analyses were carried out on a set of spectra calculated daily from seismic signals where explosion quakes were present or excluded. Principal component analysis and cluster analysis were applied to identify different classes of spectra. Three clusters of spectra are associated with two different states of volcanic activity. One cluster corresponds to a state of low to moderate activity, whereas the two other clusters are present during phases with a high magma column as inferred from the occurrence of lava fountains or effusions. We therefore conclude that variations in volcanic activity at Stromboli are usually linked to changes in the spectral characteristics of volcanic tremor. Site effects are evident when comparing the spectra calculated from signals synchronously recorded at STR and TF1. However, some major spectral peaks at both stations may reflect source properties. Statistical considerations and polarization analysis are in favor of a prevailing presence of P-waves in the tremor signal along with a position of the source northwest of the craters and at shallow depth.
Davaalkham, Jagdagsuren; Unenchimeg, Puntsag; Baigalmaa, Chultem; Erdenetuya, Gombo; Nyamkhuu, Dulmaa; Shiino, Teiichiro; Tsuchiya, Kiyoto; Hayashida, Tsunefusa; Gatanaga, Hiroyuki; Oka, Shinichi
2011-10-01
We investigated the current molecular epidemiological status of HIV-1 in Mongolia, a country with very low incidence of HIV-1 though with rapid expansion in recent years. HIV-1 pol (1065 nt) and env (447 nt) genes were sequenced to construct phylogenetic trees. The evolutionary rates, molecular clock phylogenies, and other evolutionary parameters were estimated from heterochronous genomic sequences of HIV-1 subtype B by the Bayesian Markov chain Monte Carlo method. We obtained 41 sera from 56 reported HIV-1-positive cases as of May 2009. The main route of infection was men who have sex with men (MSM). Dominant subtypes were subtype B in 32 cases (78%) followed by subtype CRF02_AG (9.8%). The phylogenetic analysis of the pol gene identified two clusters in subtype B sequences. Cluster 1 consisted of 21 cases including MSM and other routes of infection, and cluster 2 consisted of eight MSM cases. The tree analyses demonstrated very short branch lengths in cluster 1, suggesting a surprisingly active expansion of HIV-1 transmission during a short period with the same ancestor virus. Evolutionary analysis indicated that the outbreak started around the early 2000s. This study identified a current hot spot of HIV-1 transmission and potential seed of the epidemic in Mongolia. Comprehensive preventive measures targeting this group are urgently needed.
PlantTribes: a gene and gene family resource for comparative genomics in plants
Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.
2008-01-01
The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
Stuber, Tod; Quance, Christine; Edwards, William H.; Tiller, Rebekah V.; Linfield, Tom; Rhyan, Jack; Berte, Angela; Harris, Beth
2012-01-01
A variable-number tandem repeat (VNTR) protocol targeting 10 loci in the Brucella abortus genome was used to assess genetic diversity among 366 field isolates recovered from cattle, bison, and elk in the Greater Yellowstone Area (GYA) and Texas during 1998 to 2011. Minimum spanning tree (MST) and unweighted-pair group method with arithmetic mean (UPGMA) analyses of VNTR data identified 237 different VNTR types, among which 14 prominent clusters of isolates could be identified. Cattle isolates from Texas segregated into three clusters: one comprised of field isolates from 1998 to 2005, one comprised of vaccination-associated infections, and one associated with an outbreak in Starr County in January 2011. An isolate obtained from a feral sow trapped on property adjacent to the Starr County herd in May 2011 clustered with the cattle isolates, suggesting a role for feral swine as B. abortus reservoirs in Starr County. Isolates from a 2005 cattle outbreak in Wyoming displayed VNTR-10 profiles matching those of strains recovered from Wyoming and Idaho elk. Additionally, isolates associated with cattle outbreaks in Idaho in 2002, Montana in 2008 and 2011, and Wyoming in 2010 primarily clustered with isolates recovered from GYA elk. This study indicates that elk play a predominant role in the transmission of B. abortus to cattle located in the GYA. PMID:22427502
dndDB: a database focused on phosphorothioation of the DNA backbone.
Ou, Hong-Yu; He, Xinyi; Shao, Yucheng; Tai, Cui; Rajakumar, Kumar; Deng, Zixin
2009-01-01
The Dnd DNA degradation phenotype was first observed during electrophoresis of genomic DNA from Streptomyces lividans more than 20 years ago. It was subsequently shown to be governed by the five-gene dnd cluster. Similar gene clusters have now been found to be widespread among many other distantly related bacteria. Recently the dnd cluster was shown to mediate the incorporation of sulphur into the DNA backbone via a sequence-selective, stereo-specific phosphorothioate modification in Escherichia coli B7A. Intriguingly, to date all identified dnd clusters lie within mobile genetic elements, the vast majority in laterally transferred genomic islands. We organized available data from experimental and bioinformatics analyses about the DNA phosphorothioation phenomenon and associated documentation as a dndDB database. It contains the following detailed information: (i) Dnd phenotype; (ii) dnd gene clusters; (iii) genomic islands harbouring dnd genes; (iv) Dnd proteins and conserved domains. As of 25 December 2008, dndDB contained data corresponding to 24 bacterial species exhibiting the Dnd phenotype reported in the scientific literature. In addition, via in silico analysis, dndDB identified 26 syntenic dnd clusters from 25 species of Eubacteria and Archaea, 25 dnd-bearing genomic islands and one dnd plasmid containing 114 dnd genes. A further 397 other genes coding for proteins with varying levels of similarity to Dnd proteins were also included in dndDB. A broad range of similarity search, sequence alignment and phylogenetic tools are readily accessible to allow for individualized directions of research focused on dnd genes. dndDB can facilitate efficient investigation of a wide range of aspects relating to dnd DNA modification and other island-encoded functions in host organisms. dndDB version 1.0 is freely available at http://mml.sjtu.edu.cn/dndDB/.
Jung, Wi Hoon; Jang, Joon Hwan; Park, Jin Woo; Kim, Euitae; Goo, Eun-Hoe; Im, Oh-Soo; Kwon, Jun Soo
2014-01-01
As the main input hub of the basal ganglia, the striatum receives projections from the cerebral cortex. Many studies have provided evidence for multiple parallel corticostriatal loops based on the structural and functional connectivity profiles of the human striatum. A recent resting-state fMRI study revealed the topography of striatum by assigning each voxel in the striatum to its most strongly correlated cortical network among the cognitive, affective, and motor networks. However, it remains unclear what patterns of striatal parcellation would result from performing the clustering without subsequent assignment to cortical networks. Thus, we applied unsupervised clustering algorithms to parcellate the human striatum based on its functional connectivity patterns to other brain regions without any anatomically or functionally defined cortical targets. Functional connectivity maps of striatal subdivisions, identified through clustering analyses, were also computed. Our findings were consistent with recent accounts of the functional distinctions of the striatum as well as with recent studies about its functional and anatomical connectivity. For example, we found functional connections between dorsal and ventral striatal clusters and the areas involved in cognitive and affective processes, respectively, and between rostral and caudal putamen clusters and the areas involved in cognitive and motor processes, respectively. This study confirms prior findings, showing similar striatal parcellation patterns between the present and prior studies. Given such striking similarity, it is suggested that striatal subregions are functionally linked to cortical networks involving specific functions rather than discrete portions of cortical regions. Our findings also demonstrate that the clustering of functional connectivity patterns is a reliable feature in parcellating the striatum into anatomically and functionally meaningful subdivisions. 
The striatal subdivisions identified here may have important implications for understanding the relationship between corticostriatal dysfunction and various neurodegenerative and psychiatric disorders. PMID:25203441
CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data.
Fidaner, Işık Barış; Cankorur-Cetinkaya, Ayca; Dikicioglu, Duygu; Kirdar, Betul; Cemgil, Ali Taylan; Oliver, Stephen G
2016-02-01
Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. Contact: sgo24@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Lorenz, Nicole; Wilson, Ella V.; Machado, Caroline; Schardl, Christopher L.; Tudzynski, Paul
2007-01-01
The grass parasites Claviceps purpurea and Claviceps fusiformis produce ergot alkaloids (EA) in planta and in submerged culture. Whereas EA synthesis (EAS) in C. purpurea proceeds via clavine intermediates to lysergic acid and the complex ergopeptines, C. fusiformis produces only agroclavine and elymoclavine. In C. purpurea the EAS gene (EAS) cluster includes dmaW (encoding the first pathway step), cloA (elymoclavine oxidation to lysergic acid), and the lpsA/lpsB genes (ergopeptine formation). We analyzed the corresponding C. fusiformis EAS cluster to investigate the evolutionary basis for chemotypic differences between the Claviceps species. Other than three peptide synthetase genes (lpsC and the tandem paralogues lpsA1 and lpsA2), homologues of all C. purpurea EAS genes were identified in C. fusiformis, including homologues of lpsB and cloA, which in C. purpurea encode enzymes for steps after clavine synthesis. Rearrangement of the cluster was evident around lpsB, which is truncated in C. fusiformis. This and several frameshift mutations render CflpsB a pseudogene (CflpsBΨ). No obvious inactivating mutation was identified in CfcloA. All C. fusiformis EAS genes, including CflpsBΨ and CfcloA, were expressed in culture. Cross-complementation analyses demonstrated that CfcloA and CflpsBΨ were expressed in C. purpurea but did not encode functional enzymes. In contrast, CpcloA catalyzed lysergic acid biosynthesis in C. fusiformis, indicating that C. fusiformis terminates its EAS pathway at elymoclavine because the cloA gene product is inactive. We propose that the C. fusiformis EAS cluster evolved from a more complete cluster by loss of some lps genes and by rearrangements and mutations inactivating lpsB and cloA. PMID:17720822
Lorenz, Nicole; Wilson, Ella V; Machado, Caroline; Schardl, Christopher L; Tudzynski, Paul
2007-11-01
The grass parasites Claviceps purpurea and Claviceps fusiformis produce ergot alkaloids (EA) in planta and in submerged culture. Whereas EA synthesis (EAS) in C. purpurea proceeds via clavine intermediates to lysergic acid and the complex ergopeptines, C. fusiformis produces only agroclavine and elymoclavine. In C. purpurea the EAS gene (EAS) cluster includes dmaW (encoding the first pathway step), cloA (elymoclavine oxidation to lysergic acid), and the lpsA/lpsB genes (ergopeptine formation). We analyzed the corresponding C. fusiformis EAS cluster to investigate the evolutionary basis for chemotypic differences between the Claviceps species. Other than three peptide synthetase genes (lpsC and the tandem paralogues lpsA1 and lpsA2), homologues of all C. purpurea EAS genes were identified in C. fusiformis, including homologues of lpsB and cloA, which in C. purpurea encode enzymes for steps after clavine synthesis. Rearrangement of the cluster was evident around lpsB, which is truncated in C. fusiformis. This and several frameshift mutations render CflpsB a pseudogene (CflpsBΨ). No obvious inactivating mutation was identified in CfcloA. All C. fusiformis EAS genes, including CflpsBΨ and CfcloA, were expressed in culture. Cross-complementation analyses demonstrated that CfcloA and CflpsBΨ were expressed in C. purpurea but did not encode functional enzymes. In contrast, CpcloA catalyzed lysergic acid biosynthesis in C. fusiformis, indicating that C. fusiformis terminates its EAS pathway at elymoclavine because the cloA gene product is inactive. We propose that the C. fusiformis EAS cluster evolved from a more complete cluster by loss of some lps genes and by rearrangements and mutations inactivating lpsB and cloA.
Williams, Bronwyn W; Scribner, Kim T
2010-01-01
Reintroductions and translocations are increasingly used to repatriate or increase probabilities of persistence for animal and plant species. Genetic and demographic characteristics of founding individuals and suitability of habitat at release sites are commonly believed to affect the success of these conservation programs. Genetic divergence among multiple source populations of American martens (Martes americana) and well documented introduction histories permitted analyses of post-introduction dispersion from release sites and development of genetic clusters in the Upper Peninsula (UP) of Michigan <50 years following release. Location and size of spatial genetic clusters and measures of individual-based autocorrelation were inferred using 11 microsatellite loci. We identified three genetic clusters in geographic proximity to original release locations. Estimated distances of effective gene flow based on spatial autocorrelation varied greatly among genetic clusters (30-90 km). Spatial contiguity of genetic clusters has been largely maintained with evidence for admixture primarily in localized regions, suggesting recent contact or locally retarded rates of gene flow. Data provide guidance for future studies of the effects of permeabilities of different land-cover and land-use features to dispersal and of other biotic and environmental factors that may contribute to the colonization process and development of spatial genetic associations.
Radial Profile of the 3.5 kev Line Out to R200 in the Perseus Cluster
NASA Technical Reports Server (NTRS)
Franse, Jeroen; Bulbul, Esra; Foster, Adam; Boyarsky, Alexey; Markevitch, Maxim; Bautz, Mark; Iakubovskyi, Dmytro; Loewenstein, Michael; McDonald, Michael; Miller, Eric;
2016-01-01
The recent discovery of the unidentified emission line at 3.5 keV in galaxies and clusters has attracted great interest from the community. As the origin of the line remains uncertain, we study the surface brightness distribution of the line in the Perseus cluster since that information can be used to identify its origin. We examine the flux distribution of the 3.5 keV line in the deep Suzaku observations of the Perseus cluster in detail. The 3.5 keV line is observed in three concentric annuli in the central observations, although the observations of the outskirts of the cluster did not reveal such a signal. We establish that these detections and the upper limits from the non-detections are consistent with a dark matter decay origin. However, absence of positive detection in the outskirts is also consistent with some unknown astrophysical origin of the line in the dense gas of the Perseus core, as well as with a dark matter origin with a steeper dependence on mass than the dark matter decay. We also comment on several recently published analyses of the 3.5 keV line.
RADIAL PROFILE OF THE 3.5 keV LINE OUT TO R200 IN THE PERSEUS CLUSTER
DOE Office of Scientific and Technical Information (OSTI.GOV)
Franse, Jeroen; Bulbul, Esra; Bautz, Mark
2016-10-01
The recent discovery of the unidentified emission line at 3.5 keV in galaxies and clusters has attracted great interest from the community. As the origin of the line remains uncertain, we study the surface brightness distribution of the line in the Perseus cluster since that information can be used to identify its origin. We examine the flux distribution of the 3.5 keV line in the deep Suzaku observations of the Perseus cluster in detail. The 3.5 keV line is observed in three concentric annuli in the central observations, although the observations of the outskirts of the cluster did not reveal such a signal. We establish that these detections and the upper limits from the non-detections are consistent with a dark matter decay origin. However, absence of positive detection in the outskirts is also consistent with some unknown astrophysical origin of the line in the dense gas of the Perseus core, as well as with a dark matter origin with a steeper dependence on mass than the dark matter decay. We also comment on several recently published analyses of the 3.5 keV line.
Spatial and temporal patterns in preterm birth in the United States.
Byrnes, John; Mahoney, Richard; Quaintance, Cele; Gould, Jeffrey B; Carmichael, Suzan; Shaw, Gary M; Showen, Amy; Phibbs, Ciaran; Stevenson, David K; Wise, Paul H
2015-06-01
Despite years of research, the etiologies of preterm birth remain unclear. In order to help generate new research hypotheses, this study explored spatial and temporal patterns of preterm birth in a large, total-population dataset. Data on 145 million US births in 3,000 counties from the Natality Files of the National Center for Health Statistics for 1971-2011 were examined. State trends in early (<34 wk) and late (34-36 wk) preterm birth rates were compared. K-means cluster analyses were conducted to identify gestational age distribution patterns for all US counties over time. A weak association was observed between state trends in <34 wk birth rates and the initial absolute <34 wk birth rate. Significant associations were observed between trends in <34 wk and 34-36 wk birth rates and between white and African American <34 wk births. Periodicity was observed in county-level trends in <34 wk birth rates. Cluster analyses identified periods of significant heterogeneity and homogeneity in gestational age distributional trends for US counties. The observed geographic and temporal patterns suggest periodicity and complex, shared influences among preterm birth rates in the United States. These patterns could provide insight into promising hypotheses for further research.
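The K-means step mentioned in this abstract can be sketched with Lloyd's algorithm, which alternates between assigning each observation to its nearest centroid and moving each centroid to the mean of its members. The county profiles and starting centroids below are invented for illustration, and the abstract does not say which implementation, features, or number of clusters the authors used.

```python
def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm on equal-length numeric tuples."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum(
                    (p - c) ** 2 for p, c in zip(point, centroids[k])
                ),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical county profiles: (<34 wk rate %, 34-36 wk rate %).
counties = [(2.0, 6.0), (2.2, 6.4), (3.8, 9.0), (4.0, 9.4)]
labels, centres = kmeans(counties, centroids=[(2.0, 6.0), (4.0, 9.4)])
print(labels, centres)
```

Results depend on the starting centroids, so analyses of this kind are usually repeated from several initialisations.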
Non-linear clustering in the cold plus hot dark matter model
NASA Astrophysics Data System (ADS)
Bonometto, Silvio A.; Borgani, Stefano; Ghigna, Sebastiano; Klypin, Anatoly; Primack, Joel R.
1995-03-01
The main aim of this work is to find out if hierarchical scaling, observed in galaxy clustering, can be dynamically explained by studying N-body simulations. Previous analyses of dark matter (DM) particle distributions indicated heavy distortions with respect to the hierarchical pattern. Here, we shall describe how such distortions are to be interpreted and why they can be fully reconciled with the observed galaxy clustering. This aim is achieved by using high-resolution (512^3 grid-points) particle-mesh (PM) N-body simulations to follow the development of non-linear clustering in an Omega=1 universe, dominated either by cold dark matter (CDM) or by a mixture of cold+hot dark matter (CHDM) with Omega_cold=0.6, Omega_hot=0.3 and Omega_baryon=0.1; a simulation box of side 100 Mpc (h=0.5) is used. We analyse two CHDM realizations with biasing factor b=1.5 (COBE normalization), starting from different initial random numbers, and compare them with CDM simulations with b=1 (COBE-compatible) and b=1.5. We evaluate high-order correlation functions and the void probability function (VPF). Correlation functions are obtained from both counts in cells and counts of neighbours. The analysis is carried out for DM particles and for galaxies identified as massive haloes of the evolved density field. We confirm that clustering of DM particles systematically exhibits deviations from hierarchical scaling, although the deviation increases somewhat in redshift space. Deviations from the hierarchical scaling of DM particles are found to be related to the spectrum shape, in a way that indicates that such distortions arise from finite sampling effects. We identify galaxy positions in the simulations and show that, quite differently from the DM particle background, galaxies follow hierarchical scaling (S_q = xi_q/xi_2^(q-1) = constant) far more closely, with reduced skewness and kurtosis coefficients S_3~2.5 and S_4~7.5, in general agreement with observational results. 
Unlike DM, the scaling of galaxy clustering is just marginally affected by redshift distortions and is obtained for both CDM and CHDM models. Hierarchical scaling in simulations is confirmed by VPF analysis. Also in this case, we find substantial agreement with observational findings.
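For reference, the hierarchical scaling relation quoted in this abstract can be written out explicitly. The block below assumes that xi_q denotes the q-th order correlation function, with xi_2 the ordinary two-point function; this reading of the notation is inferred from the abstract rather than stated in it.

```latex
% Hierarchical scaling: the reduced coefficients S_q do not depend on scale.
\[
  S_q \equiv \frac{\xi_q}{\xi_2^{\,q-1}} = \text{constant},
  \qquad
  S_3 = \frac{\xi_3}{\xi_2^{2}} \ \text{(reduced skewness)},
  \qquad
  S_4 = \frac{\xi_4}{\xi_2^{3}} \ \text{(reduced kurtosis)}.
\]
```

With the values reported for galaxies, S_3~2.5 and S_4~7.5, the relation would imply xi_3 ~ 2.5 xi_2^2 and xi_4 ~ 7.5 xi_2^3 across the scales where the scaling holds.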
NASA Astrophysics Data System (ADS)
McDermott, Scott D.
This research study uses geographic information retrieval (GIR) to georeference toponyms and points-of-interest (POI) names from a travel journal. Travel journals are an ideal data source with which to conduct this study because they are significant accounts specific to the author's experience, and contain geographic instances based on the experiences made at a specific time and location along a traversed route of a trip. Using a travel journal, toponyms and POI names are georeferenced to locate where the author visited or what the author observed along a travel path. GIR relies on algorithms to maximize the georeferencing of spatially sensitive data while minimizing issues related to semantic ambiguities, which can incorrectly place geographic content due to shared names by other geographic or non-geographic contents. Frequency analysis and proximity clustering are used to minimize semantic ambiguities and georeference the toponyms and POI names to their correct locations. Frequency analysis identifies the primary and adjacent state names for each chapter of the travel journal, which act as containers for the subsequent toponyms and POI names. Proximity clustering groups the toponyms and POI names based on the distance to the cluster group's centroid. A cluster group with a significant number of toponyms and POI names contains the placenames that are more relevant to the travel journal. The use of frequency and proximity clustering analyses narrows the geographic scope to select states and identify the toponyms and POI names that exist along the travel path. The reliability measurements for this dissertation yield a precision rate of 88 percent and a recall rate of 30 percent. The precision rate is comparable to similar peer-reviewed studies and shows that this dissertation can assist in the GIR process. 
Obstacles and issues in this research study include name matching errors between the travel journal, geoparser, and gazetteers; temporal disassociations between the time the journal was written and the time this dissertation was conducted; omissions of POI names from the gazetteers; and incorrect tagging by the geoparser. Future studies are needed to provide better name matching between the travel journal, geoparser, and gazetteers and on managing POI names to become integral to the GIR process.
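The precision and recall rates reported in this abstract follow the usual definitions, sketched below. The counts are invented to show the arithmetic and are not the dissertation's data; here a "true positive" is taken to mean a placename that was georeferenced to its correct location, which is an assumption about how the evaluation was set up.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw match counts."""
    # Precision: share of georeferenced names that were placed correctly.
    precision = true_positives / (true_positives + false_positives)
    # Recall: share of all names in the text that were placed correctly.
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical counts: 8 correct placements, 2 wrong placements, 12 missed.
precision, recall = precision_recall(8, 2, 12)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A high precision paired with a low recall, as reported above, would mean that most placements made were correct but many placenames in the journal were never placed at all.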
Kooyman, Robert M; Rossetto, Maurizio; Sauquet, Hervé; Laffan, Shawn W
2013-01-01
Identify patterns of change in species distributions, diversity, concentrations of evolutionary history, and assembly of Australian rainforests. We used the distribution records of all known rainforest woody species in Australia across their full continental extent. These were analysed using measures of species richness, phylogenetic diversity (PD), phylogenetic endemism (PE) and phylogenetic structure (net relatedness index; NRI). Phylogenetic structure was assessed using both continental and regional species pools. To test the influence of growth-form, freestanding and climbing plants were analysed independently, and in combination. Species richness decreased along two generally orthogonal continental axes, corresponding with wet to seasonally dry and tropical to temperate habitats. The PE analyses identified four main areas of substantially restricted phylogenetic diversity, including parts of Cape York, Wet Tropics, Border Ranges, and Tasmania. The continental pool NRI results showed evenness (species less related than expected by chance) in groups of grid cells in coastally aligned areas of species rich tropical and sub-tropical rainforest, and in low diversity moist forest areas in the south-east of the Great Dividing Range and in Tasmania. Monsoon and drier vine forests, and moist forests inland from upland refugia showed phylogenetic clustering, reflecting lower diversity and more relatedness. Signals for evenness in Tasmania and clustering in northern monsoon forests weakened in analyses using regional species pools. For climbing plants, values for NRI by grid cell showed strong spatial structuring, with high diversity and PE concentrated in moist tropical and subtropical regions. Concentrations of rainforest evolutionary history (phylo-diversity) were patchily distributed within a continuum of species distributions. 
Contrasting with previous concepts of rainforest community distribution, our findings of continuous distributions and continental connectivity have significant implications for interpreting rainforest evolutionary history and current day ecological processes, and for managing rainforest diversity in changing circumstances.
Haynos, Ann F; Pearson, Carolyn M; Utzinger, Linsey M; Wonderlich, Stephen A; Crosby, Ross D; Mitchell, James E; Crow, Scott J; Peterson, Carol B
2017-05-01
Evidence suggests that eating disorder subtypes reflecting under-controlled, over-controlled, and low psychopathology personality traits constitute reliable phenotypes that differentiate treatment response. This study is the first to use statistical analyses to identify these subtypes within treatment-seeking individuals with bulimia nervosa (BN) and to use these statistically derived clusters to predict clinical outcomes. Using variables from the Dimensional Assessment of Personality Pathology-Basic Questionnaire, K-means cluster analyses identified under-controlled, over-controlled, and low psychopathology subtypes within BN patients (n = 80) enrolled in a treatment trial. Generalized linear models examined the impact of personality subtypes on Eating Disorder Examination global score, binge eating frequency, and purging frequency cross-sectionally at baseline and longitudinally at end of treatment (EOT) and follow-up. In the longitudinal models, secondary analyses were conducted to examine personality subtype as a potential moderator of response to Cognitive Behavioral Therapy-Enhanced (CBT-E) or Integrative Cognitive-Affective Therapy for BN (ICAT-BN). There were no baseline clinical differences between groups. In the longitudinal models, personality subtype predicted binge eating (p = 0.03) and purging (p = 0.01) frequency at EOT and binge eating frequency at follow-up (p = 0.045). The over-controlled group demonstrated the best outcomes on these variables. In secondary analyses, there was a treatment by subtype interaction for purging at follow-up (p = 0.04), which indicated a superiority of CBT-E over ICAT-BN for reducing purging among the over-controlled group. Empirically derived personality subtyping appears to be a valid classification system with potential to guide eating disorder treatment decisions. © 2016 Wiley Periodicals, Inc. (Int J Eat Disord 2017; 50:506-514).
Hydrogeochemical processes and isotopes analysis. Study case: "La Línea Tunnel", Colombia
NASA Astrophysics Data System (ADS)
Piña, Adriana; Donado, Leonardo; Cramer, Thomas
2017-04-01
Hydrogeochemical and stable isotope analyses have been widely used to identify recharge and discharge zones, flowpaths, the type, origin and age of water, chemical processes between minerals and groundwater, and effects caused by anthropogenic or natural pollution. In this paper we analyze the interactions between groundwater and surface water, using as a natural laboratory the tunnels located at the La Línea Massif in the Cordillera Central of the Colombian Andes. The massif is formed by two igneous-metamorphic fractured complexes (the Cajamarca and Quebradagrande groups) plus andesitic porphyry rocks from the Tertiary period. There, eight main fault zones related to surface creeks were identified and the main inflows inside the tunnels were reported. Sixty water samples were collected at the surface and inside the tunnel in fault zones in two different years, 2010 and 2015. To classify the water samples, a multivariate statistical analysis combining Factor Analysis (FA) with Hierarchical Cluster Analysis (HCA) was performed. Then, analyses of the major chemical elements and water isotopes (18O, 2H and 3H) were used to define the origin of dissolved components and to analyse their evolution over time. Most samples were classified as calcium-bicarbonate or magnesium-bicarbonate water types. Isotopic analyses show a characteristic behavior for the east and west watersheds and for each geologic group. According to the FA and HCA, the factors and clusters obtained are related first to the location of the samples (surface or tunnel samples) and then to the geology. Surface samples follow the Colombian meteoric line, as do inflows related to permeable faults, while less permeable faults show hydrothermal processes. Finally, water evolution over time shows a decrease in pH, conductivity and Mg2+ related to silicate weathering or precipitation/dissolution processes that affect the spacing in fractures and, consequently, the hydraulic properties.
Jensen, Anders; Scholz, Christian F P; Kilian, Mogens
2016-11-01
The Mitis group of the genus Streptococcus currently comprises 20 species with validly published names, including the pathogen S. pneumoniae. They have been the subject of much taxonomic confusion, due to phenotypic overlap and genetic heterogeneity, which has hampered a full appreciation of their clinical significance. The purpose of this study was to critically re-examine the taxonomy of the Mitis group using 195 publicly available genomes, including designated type strains, for phylogenetic analyses based on core genomes, multilocus sequences and 16S rRNA gene sequences, combined with estimates of average nucleotide identity (ANI) and in silico and in vitro analyses of specific phenotypic characteristics. Our core genomic phylogenetic analyses revealed distinct clades that, to some extent, and judging from the clustering of type strains, represent known species. However, many of the genomes have been incorrectly identified, adding to the current confusion. Furthermore, our data show that 16S rRNA gene sequences and ANI are unsuitable for identifying and circumscribing new species of the Mitis group of the genus Streptococcus. Based on the clustering patterns resulting from core genome phylogenetic analysis, we conclude that S. oligofermentans is a later synonym of S. cristatus. The recently described strains of the species Streptococcus dentisani include one previously referred to as 'S. mitis biovar 2'. Together with S. oralis, S. dentisani and S. tigurinus form subclusters within a coherent phylogenetic clade. We propose that the species S. oralis consists of three subspecies: S. oralis subsp. oralis subsp. nov., S. oralis subsp. tigurinus comb. nov., and S. oralis subsp. dentisani comb. nov.
Tlou, Boikhutso; Sartorius, Benn; Tanser, Frank
2017-01-01
The aim of the study was to identify the key determinants of child mortality 'hot-spots' in space and time. Comprehensive population-based mortality data collected between 2000 and 2014 by the Africa Centre Demographic Information System located in the UMkhanyakude District of KwaZulu-Natal Province, South Africa, was analysed. We assigned all mortality events and person-time of observation for children <5 years of age to an exact homestead of residence (mapped to <2m accuracy as part of the DSA platform). Using these exact locations, both the Kulldorff and Tango spatial scan statistics for regular and irregular shaped cluster detection were used to identify clusters of childhood mortality events in both space and time. Of the 49 986 children aged < 5 years who resided in the study area between 2000 and 2014, 2010 (4.0%) died. Childhood mortality decreased by 80% over the period from >20 per 1000 person-years in 2001-2003 to 4 per 1000 person-years in 2014. The two scanning spatial techniques identified two high-risk clusters for child mortality along the eastern border of the study site near the national highway, with a relative risk of 2.10 and 1.91 respectively. The high-risk communities detected in this work, and the differential risk factor profile of these communities, can assist public health professionals to identify similar populations in other parts of rural South Africa. Identifying child mortality hot-spots will potentially guide policy interventions in rural, resource-limited settings.
Watts, P; Buck, D; Netuveli, G; Renton, A
2016-06-01
Clustering of lifestyle risk behaviours is very important in predicting premature mortality. Understanding the extent to which risk behaviours are clustered in deprived communities is vital to most effectively target public health interventions. We examined co-occurrence and associations between risk behaviours (smoking, alcohol consumption, poor diet, low physical activity and high sedentary time) reported by adults living in deprived London neighbourhoods. Associations between sociodemographic characteristics and clustered risk behaviours were examined. Latent class analysis was used to identify underlying clustering of behaviours. Over 90% of respondents reported at least one risk behaviour. Reporting specific risk behaviours predicted reporting of further risk behaviours. Latent class analyses revealed four underlying classes. Membership of a maximal risk behaviour class was more likely for young, white males who were unable to work. Compared with recent national level analysis, there was a weaker relationship between education and clustering of behaviours and a very high prevalence of clustering of risk behaviours in those unable to work. Young, white men who report difficulty managing on income were at high risk of reporting multiple risk behaviours. These groups may be an important target for interventions to reduce premature mortality caused by multiple risk behaviours. © The Author 2015. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Cluster stability in the analysis of mass cytometry data.
Melchiotti, Rossella; Gracio, Filipe; Kordasti, Shahram; Todd, Alan K; de Rinaldis, Emanuele
2017-01-01
Manual gating has been traditionally applied to cytometry data sets to identify cells based on protein expression. The advent of mass cytometry allows for a higher number of proteins to be simultaneously measured on cells, therefore providing a means to define cell clusters in a high dimensional expression space. This enhancement, whilst opening unprecedented opportunities for single cell-level analyses, makes the incremental replacement of manual gating with automated clustering a compelling need. To this aim many methods have been implemented and their successful applications demonstrated in different settings. However, the reproducibility of automatically generated clusters is proving challenging and an analytical framework to distinguish spurious clusters from more stable entities, and presumably more biologically relevant ones, is still missing. One way to estimate cell clusters' stability is the evaluation of their consistent re-occurrence within- and between-algorithms, a metric that is commonly used to evaluate results from gene expression. Herein we report the usage and importance of cluster stability evaluations, when applied to results generated from three popular clustering algorithms - SPADE, FLOCK and PhenoGraph - run on four different data sets. These algorithms were shown to generate clusters with various degrees of statistical stability, many of them being unstable. By comparing the results of automated clustering with manually gated populations, we illustrate how information on cluster stability can assist towards a more rigorous and informed interpretation of clustering results. We also explore the relationships between statistical stability and other properties such as clusters' compactness and isolation, demonstrating that whilst cluster stability is linked to other properties it cannot be reliably predicted by any of them. 
Our study proposes the introduction of cluster stability as a necessary checkpoint for cluster interpretation and contributes to the construction of a more systematic and standardized analytical framework for the assessment of cytometry clustering results. © 2016 International Society for Advancement of Cytometry.
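One way to make the idea of cluster re-occurrence concrete is a Jaccard-based stability score. The sketch below is an assumption-laden illustration, not the metric used in the study: it assumes every run labels the same cell identifiers, and the function names and toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of cell identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster_stability(reference, reruns):
    """For each reference cluster, average its best Jaccard match over reruns.

    reference: dict mapping a cluster label to a set of cell ids.
    reruns: list of clusterings, each a list of sets of cell ids
            (e.g. from resampled data or from a different algorithm).
    """
    scores = {}
    for label, members in reference.items():
        best = [max(jaccard(members, c) for c in run) for run in reruns]
        scores[label] = sum(best) / len(best)
    return scores

# Toy example: cluster A is recovered in every rerun, cluster B is split.
reference = {"A": {1, 2, 3, 4}, "B": {5, 6, 7, 8}}
reruns = [
    [{1, 2, 3, 4}, {5, 6}, {7, 8}],
    [{1, 2, 3, 4}, {5, 6, 7}, {8}],
]
scores = cluster_stability(reference, reruns)
```

A score near 1 suggests the cluster re-occurs consistently, while a low score flags a cluster that may be spurious. Any cutoff for calling a cluster stable would have to be chosen by the analyst.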
Bansal, Ravi; Peterson, Bradley S
2018-06-01
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings in these multiple hypotheses testing. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. 
Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construct rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.
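The contrast between a fixed cluster-size threshold and a threshold drawn from the null distribution of maximum cluster size can be sketched as follows. This is a simplified illustration with made-up numbers, not the authors' pipeline. The function names and the fixed threshold of 50 voxels are assumptions, and the fixed threshold merely stands in for a value that a parametric model would supply.

```python
import math

def empirical_fwer(null_max_sizes, size_threshold):
    """Fraction of null datasets whose largest cluster exceeds the threshold,
    i.e. datasets with at least one false positive cluster."""
    hits = sum(1 for s in null_max_sizes if s > size_threshold)
    return hits / len(null_max_sizes)

def max_statistic_threshold(null_max_sizes, alpha=0.05):
    """Cluster-size threshold taken as the (1 - alpha) empirical quantile of
    the null distribution of the maximum cluster size."""
    ordered = sorted(null_max_sizes)
    k = math.ceil((1 - alpha) * len(ordered))
    return ordered[k - 1]

# Made-up maximum cluster sizes (in voxels) from ten null datasets;
# two of them contain an unusually large cluster.
null_max_sizes = [12, 40, 8, 95, 15, 22, 310, 9, 18, 27]

fixed_fwer = empirical_fwer(null_max_sizes, size_threshold=50)
data_driven = max_statistic_threshold(null_max_sizes, alpha=0.05)
data_driven_fwer = empirical_fwer(null_max_sizes, data_driven)
```

Because the data-driven threshold is set by the largest null clusters, those clusters no longer count as significant. This corresponds to the behaviour the abstract attributes to nonparametric methods, at the cost of also rejecting genuinely large clusters.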
X-ray and optical substructures of the DAFT/FADA survey clusters
NASA Astrophysics Data System (ADS)
Guennou, L.; Durret, F.; Adami, C.; Lima Neto, G. B.
2013-04-01
We have undertaken the DAFT/FADA survey with the double aim of setting constraints on dark energy based on weak lensing tomography and of obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range 0.4-0.9 for which there were HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. Out of these, a spatial analysis was possible for 30 clusters, but only 23 had deep enough X-ray data for a really robust analysis. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. Altogether, the X-ray sample of 23 clusters and the optical sample of 26 clusters have 14 clusters in common. We present preliminary results on the coupled X-ray and dynamical analyses of these 14 clusters.
Moment tensor clustering: a tool to monitor mining induced seismicity
NASA Astrophysics Data System (ADS)
Cesca, Simone; Dahm, Torsten; Tolga Sen, Ali
2013-04-01
Automated moment tensor inversion routines have been set up in recent decades for the analysis of global and regional seismicity. Recent developments could be used to analyse smaller events and larger datasets. In particular, applications to microseismicity, e.g. in mining environments, have led to the generation of large moment tensor catalogues. Moment tensor catalogues provide valuable information about the earthquake source and details of the rupturing processes taking place in the seismogenic region. Earthquake focal mechanisms can be used to discuss the local stress field and possible orientations of the fault system, or to evaluate the presence of shear and/or tensile cracks. Focal mechanism and moment tensor solutions are typically analysed for selected events, and quick and robust tools for the automated analysis of larger catalogues are needed. We propose here a method to perform cluster analysis for large moment tensor catalogues and identify families of events which characterize the studied microseismicity. Clusters include events with similar focal mechanisms, which first requires the definition of a distance between focal mechanisms. Different metrics are proposed here for the cases of pure double couple, constrained moment tensor and full moment tensor catalogues. Different clustering approaches are implemented and discussed. The method is applied here to synthetic and real datasets from mining environments to demonstrate its potential: the proposed clustering techniques prove able to automatically recognise major clusters. An important application for mining monitoring concerns the early identification of anomalous rupture processes, which is relevant for hazard assessment. This study is funded by the project MINE, which is part of the R&D-Programme GEOTECHNOLOGIEN. The project MINE is funded by the German Ministry of Education and Research (BMBF), Grant of project BMBF03G0737.
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A; Marks, Jonathan A; Haiser, Henry J; Turnbaugh, Peter J; Balskus, Emily P
2015-04-14
Elucidation of the molecular mechanisms underlying the human gut microbiota's effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. Anaerobic choline utilization is a bacterial metabolic activity that occurs in the human gut and is linked to multiple diseases. 
While bacterial genes responsible for choline fermentation (the cut gene cluster) have been recently identified, there has been no characterization of these genes in human gut isolates and microbial communities. In this work, we use multiple approaches to demonstrate that the pathway encoded by the cut genes is present and functional in a diverse range of human gut bacteria and is also widespread in stool metagenomes. We also developed a PCR-based strategy to detect a key functional gene (cutC) involved in this pathway and applied it to characterize newly isolated choline-utilizing strains. Both our analyses of the cut gene cluster and this molecular tool will aid efforts to further understand the role of choline metabolism in the human gut microbiota and its link to disease. Copyright © 2015 Martínez-del Campo et al.
Patanasatienkul, Thitiwan; Sanchez, Javier; Rees, Erin E; Pfeiffer, Dirk; Revie, Crawford W
2015-06-15
Sea lice infestation levels on wild chum and pink salmon in the Broughton Archipelago region are known to vary spatially and temporally; however, the locations of areas associated with a high infestation level had not been investigated yet. In the present study, the multivariate spatial scan statistic based on a Poisson model was used to assess spatial clustering of elevated sea lice (Caligus clemensi and Lepeophtheirus salmonis) infestation levels on wild chum and pink salmon sampled between March and July of 2004 to 2012 in the Broughton Archipelago and Knight Inlet regions of British Columbia, Canada. Three covariates, seine type (beach and purse seining), fish size, and year effect, were used to provide adjustment within the analyses. The analyses were carried out across the five months/datasets and between two fish species to assess the consistency of the identified clusters. Sea lice stages were explored separately for the early life stages (non-motile) and the late life stages of sea lice (motile). Spatial patterns in fish migration were also explored using monthly plots showing the average number of each fish species captured per sampling site. The results revealed three clusters for non-motile C. clemensi, two clusters for non-motile L. salmonis, and one cluster for the motile stage in each of the sea lice species. In general, the location and timing of clusters detected for both fish species were similar. Early in the season, the clusters of elevated sea lice infestation levels on wild fish are detected in areas closer to the rivers, with decreasing relative risks as the season progresses. Clusters were detected further from the estuaries later in the season, accompanied by increasing relative risks. 
In addition, the plots for fish migration exhibit similar patterns for both fish species in that, as expected, the juveniles move from the rivers toward the open ocean as the season progresses. The identification of space-time clustering of infestation on wild fish from this study can help in targeting investigations of factors associated with these infestations and thereby support the development of more effective sea lice control measures. Copyright © 2015 Elsevier B.V. All rights reserved.
WIYN OPEN CLUSTER STUDY. XXXVI. SPECTROSCOPIC BINARY ORBITS IN NGC 188
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geller, Aaron M.; Mathieu, Robert D.; Harris, Hugh C.
2009-04-15
We present 98 spectroscopic binary orbits resulting from our ongoing radial velocity survey of the old (7 Gyr) open cluster NGC 188. All but 13 are high-probability cluster members based on both radial velocity and proper motion membership analyses. Fifteen of these member binaries are double lined. Our stellar sample spans a magnitude range of 10.8 ≤ V ≤ 16.5 (1.14-0.92 M☉) and extends spatially to 17 pc (≈13 core radii). All of our binary orbits have periods ranging from a few days to on the order of 10³ days, and thus are hard binaries that dynamically power the cluster. For each binary, we present the orbital solutions and place constraints on the component masses. Additionally, we discuss a few binaries of note from our sample, identifying a likely blue straggler-blue straggler binary system (7782), a double-lined binary with a secondary star which is underluminous for its mass (5080), two potential eclipsing binaries (4705 and 5762), and two binaries which are likely members of a quadruple system (5015a and 5015b).
Measuring Spatial Dependence for Infectious Disease Epidemiology
Grabowski, M. Kate; Cummings, Derek A. T.
2016-01-01
Global spatial clustering is the tendency of points, here cases of infectious disease, to occur closer together than expected by chance. The extent of global clustering can provide a window into the spatial scale of disease transmission, thereby providing insights into the mechanism of spread, and informing optimal surveillance and control. Here the authors present an interpretable measure of spatial clustering, τ, which can be understood as a measure of relative risk. When biological or temporal information can be used to identify sets of potentially linked and likely unlinked cases, this measure can be estimated without knowledge of the underlying population distribution. The greater our ability to distinguish closely related (i.e., separated by few generations of transmission) from more distantly related cases, the more closely τ will track the true scale of transmission. The authors illustrate this approach using examples from the analyses of HIV, dengue and measles, and provide an R package implementing the methods described. The statistic presented, and measures of global clustering in general, can be powerful tools for analysis of spatially resolved data on infectious diseases. PMID:27196422
Mears, Jessica; Abubakar, Ibrahim; Cohen, Theodore; McHugh, Timothy D; Sonnenberg, Pam
2015-01-21
To systematically review the evidence for the impact of study design and setting on the interpretation of tuberculosis (TB) transmission using clustering derived from Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) strain typing. MEDLINE, EMBASE, CINAHL, Web of Science and Scopus were searched for articles published before 21st October 2014. Studies in humans that reported the proportion of clustering of TB isolates by MIRU-VNTR were included in the analysis. Univariable meta-regression analyses were conducted to assess the influence of study design and setting on the proportion of clustering. The search identified 27 eligible articles reporting clustering between 0% and 63%. The number of MIRU-VNTR loci typed, requiring consent to type patient isolates (as a proxy for sampling fraction), the TB incidence and the maximum cluster size explained 14%, 14%, 27% and 48% of between-study variation, respectively, and had a significant association with the proportion of clustering. Although MIRU-VNTR typing is being adopted worldwide, there is a paucity of data on how study design and setting may influence estimates of clustering. We have highlighted study design variables for consideration in the design and interpretation of future studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
NASA Astrophysics Data System (ADS)
Fawole, O. G.; Cai, X.; MacKenzie, A. R.
2015-12-01
Aerosol remote sensing techniques and back-trajectory modeling can be combined to identify aerosol types. We have clustered 7 years of AERONET aerosol signals using trajectory analysis to identify dominant aerosol sources at two AERONET sites in West Africa: Ilorin (4.34°E, 8.32°N) and Djougou (1.60°E, 9.76°N). Of particular interest are air masses that have passed through the gas flaring region in the Niger Delta area of Nigeria en route to the AERONET sites. 7-day back trajectories were calculated using the UK UGAMP trajectory model driven by ECMWF wind analysis data. Dominant sources identified, using literature classifications, are desert dust (DD), biomass burning (BB) and urban-industrial (UI). Below, we use a combination of synoptic trajectories and aerosol optical properties to distinguish a fourth source: that due to gas flaring. Gas flaring (GF), the disposal of gas through a stack in an open-air flame, is believed to be a prominent source of black carbon (BC) and greenhouse gases. For these different aerosol source signatures, single scattering albedo (SSA), refractive index, extinction Angstrom exponent (EAE) and absorption Angstrom exponent (AAE) were used to classify the light absorption characteristics of the aerosols for λ = 440, 675, 870 and 1020 nm. A total of 1625 daily averages of aerosol data were collected for the two sites, of which 245 make up the GF cluster for both sites. For the GF cluster, the range of fine-mode fraction is 0.4 - 0.7. Average values of SSA(λ) for the total and GF clusters are 0.90(440), 0.93(675), 0.95(870) and 0.96(1020), and 0.93(440), 0.92(675), 0.9(870) and 0.9(1020), respectively. Values for the GF clusters at both sites are 0.62 - 1.11, compared to 1.28 - 1.66 for the remainder of the clusters, which strongly indicates the dominance of carbonaceous particles (BC), typical of a highly industrial area.
An average value of 1.58 for the real part of the refractive index at low SSA for aerosol in the GF cluster is also an indicator of high BC content. The extinction Angstrom exponent is an indicator of particle size. EAE values of 0.95-1.32 for aerosol in the GF cluster show that the aerosols are mainly fine or accumulation mode, while EAE values of 0.36-0.6 for the other cluster indicate coarse-mode domination of the aerosol. See table 1 for a summary of results.
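As an illustration of the Angstrom exponents discussed in this abstract, the sketch below uses the standard two-wavelength definition, alpha = -ln(tau_1 / tau_2) / ln(lambda_1 / lambda_2). The abstract does not say which wavelength pair or retrieval the authors used, so the 440/870 nm pair and the optical depth values here are assumptions chosen only for demonstration. Applied to extinction optical depth the formula gives EAE; applied to absorption optical depth it gives AAE.

```python
import math


def angstrom_exponent(tau_1, tau_2, lambda_1_nm, lambda_2_nm):
    """Two-wavelength Angstrom exponent.

    tau_1, tau_2: optical depths (extinction or absorption) at the two wavelengths.
    lambda_1_nm, lambda_2_nm: the corresponding wavelengths in nanometres.
    """
    return -math.log(tau_1 / tau_2) / math.log(lambda_1_nm / lambda_2_nm)


# Hypothetical optical depths for illustration only (not values from the abstract).
aod_440 = 0.60
aod_870 = 0.27

eae = angstrom_exponent(aod_440, aod_870, 440.0, 870.0)
print(f"EAE(440-870 nm) = {eae:.2f}")
```

Larger exponents correspond to a steeper wavelength dependence, which is why the abstract reads high EAE as fine-mode dominance and low EAE as coarse-mode dominance.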
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.
2015-01-01
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R
2015-01-02
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.
Allanson, Emma R; Tunçalp, Özge; Vogel, Joshua P; Khan, Dina N; Oladapo, Olufemi T; Long, Qian; Gülmezoglu, Ahmet Metin
2017-01-01
The capacity for health systems to support the translation of research in to clinical practice may be limited. The cluster randomised controlled trial (cluster RCT) design is often employed in evaluating the effectiveness of implementation of evidence-based practices. We aimed to systematically review available evidence to identify and evaluate the components in the implementation process at the facility level using cluster RCT designs. All cluster RCTs where the healthcare facility was the unit of randomisation, published or written from 1990 to 2014, were assessed. Included studies were analysed for the components of implementation interventions employed in each. Through iterative mapping and analysis, we synthesised a master list of components used and summarised the effects of different combinations of interventions on practices. Forty-six studies met the inclusion criteria and covered the specialty groups of obstetrics and gynaecology (n=9), paediatrics and neonatology (n=4), intensive care (n=4), internal medicine (n=20), and anaesthetics and surgery (n=3). Six studies included interventions that were delivered across specialties. Nine components of multifaceted implementation interventions were identified: leadership, barrier identification, tailoring to the context, patient involvement, communication, education, supportive supervision, provision of resources, and audit and feedback. The four main components that were most commonly used were education (n=42, 91%), audit and feedback (n=26, 57%), provision of resources (n=23, 50%) and leadership (n=21, 46%). Future implementation research should focus on better reporting of multifaceted approaches, incorporating sets of components that facilitate the translation of research into practice, and should employ rigorous monitoring and evaluation.
Klein Velderman, Mariska; Dusseldorp, Elise; van Nieuwenhuijzen, Maroesjka; Junger, Marianne; Paulussen, Theo G W M; Reijneveld, Sijmen A
2015-02-01
Adverse health-related behaviours (HRBs) have been shown to co-occur in adolescents. Evidence is lacking on factors associated with these co-occurring HRBs. The Theory of Triadic Influence (TTI) offers a route to categorize these determinants according to type (social, cultural and intrapersonal) and distance in the causal pathway (ultimate or distal). Our aims were to identify cultural, social and intrapersonal factors associated with co-occurring HRBs and to assess the relative importance of ultimate and distal factors for each cluster of co-occurring HRBs. Respondents were a random sample of 898 adolescents aged 12-18 years, stratified by age, sex and educational level of head of household. Data were collected via face-to-face computer-assisted interviewing and internet questionnaires. Analyses were performed for young (12-15 years) and late (16-18 years) adolescents regarding two and three clusters of HRB, respectively. For each cluster of HRBs (e.g. smoking, delinquency), associated factors were found. These accounted for 27 to 57% of the total variance per cluster. Factors came in particular from the intrapersonal stream of the TTI at the ultimate level and the social stream at the distal level. Associations were strongest for parenting practices, risk behaviours of friends and parents and self-control. Results of this study confirm that it is possible to identify a selection of cultural, social and intrapersonal factors associated with co-occurring HRBs among adolescents. © The Author 2014. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.
Phenotypes determined by cluster analysis in severe or difficult-to-treat asthma.
Schatz, Michael; Hsu, Jin-Wen Y; Zeiger, Robert S; Chen, Wansu; Dorenbaum, Alejandro; Chipps, Bradley E; Haselkorn, Tmirah
2014-06-01
Asthma phenotyping can facilitate understanding of disease pathogenesis and potential targeted therapies. To further characterize the distinguishing features of phenotypic groups in difficult-to-treat asthma. Children ages 6-11 years (n = 518) and adolescents and adults ages ≥12 years (n = 3612) with severe or difficult-to-treat asthma from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study were evaluated in this post hoc cluster analysis. Analyzed variables included sex, race, atopy, age of asthma onset, smoking (adolescents and adults), passive smoke exposure (children), obesity, and aspirin sensitivity. Cluster analysis used the hierarchical clustering algorithm with the Ward minimum variance method. The results were compared among clusters by χ² analysis; variables with significant (P < .05) differences among clusters were considered as distinguishing feature candidates. Associations among clusters and asthma-related health outcomes were assessed in multivariable analyses by adjusting for socioeconomic status, environmental exposures, and intensity of therapy. Five clusters were identified in each age stratum. Sex, atopic status, and nonwhite race were distinguishing variables in both strata; passive smoke exposure was distinguishing in children and aspirin sensitivity in adolescents and adults. Clusters were not related to outcomes in children, but 2 adult and adolescent clusters distinguished by nonwhite race and aspirin sensitivity manifested poorer quality of life (P < .0001), and the aspirin-sensitive cluster experienced more frequent asthma exacerbations (P < .0001). Distinct phenotypes appear to exist in patients with severe or difficult-to-treat asthma, which is related to outcomes in adolescents and adults but not in children. The study of the therapeutic implications of these phenotypes is warranted. Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. 
All rights reserved.
Wiangkham, Taweewat; Duda, Joan; Haque, M Sayeed; Price, Jonathan; Rushton, Alison
2016-01-01
Introduction Whiplash-associated disorder (WAD) causes substantial social and economic burden internationally. Up to 60% of patients with WAD progress to chronicity. Research therefore needs to focus on effective management in the acute stage to prevent the development of chronicity. Approximately 93% of patients are classified as WADII (neck complaint and musculoskeletal sign(s)), and in the UK, most are managed in the private sector. In our recent systematic review, a combination of active and behavioural physiotherapy was identified as potentially effective in the acute stage. An Active Behavioural Physiotherapy Intervention (ABPI) was developed through combining empirical (modified Delphi study) and theoretical (social cognitive theory focusing on self-efficacy) evidence. This pilot and feasibility trial has been designed to inform the design of an adequately powered definitive randomised controlled trial. Methods and analysis Two parallel phases. (1) An external pilot and feasibility cluster randomised double-blind (assessor and participants), parallel two-arm (ABPI vs standard physiotherapy) clinical trial to evaluate procedures and feasibility. Six UK private physiotherapy clinics will be recruited and cluster randomised by a computer-generated randomisation sequence. Sixty participants (30 each arm) will be assessed at recruitment (baseline) and at 3 months postbaseline. The planned primary outcome measure is the neck disability index. (2) An embedded exploratory qualitative study using semistructured indepth interviews (n=3–4 physiotherapists) and a focus group (n=6–8 patients) and entailing the recruitment of purposive samples will explore perceptions of the ABPI. Quantitative data will be analysed descriptively. Qualitative data will be coded and analysed deductively (identify themes) and inductively (identify additional themes). Ethics and dissemination This trial is approved by the University of Birmingham Ethics Committee (ERN_15-0542). 
Trial registration number ISRCTN84528320. PMID:27412105
Wiangkham, Taweewat; Duda, Joan; Haque, M Sayeed; Price, Jonathan; Rushton, Alison
2016-07-13
Whiplash-associated disorder (WAD) causes substantial social and economic burden internationally. Up to 60% of patients with WAD progress to chronicity. Research therefore needs to focus on effective management in the acute stage to prevent the development of chronicity. Approximately 93% of patients are classified as WADII (neck complaint and musculoskeletal sign(s)), and in the UK, most are managed in the private sector. In our recent systematic review, a combination of active and behavioural physiotherapy was identified as potentially effective in the acute stage. An Active Behavioural Physiotherapy Intervention (ABPI) was developed through combining empirical (modified Delphi study) and theoretical (social cognitive theory focusing on self-efficacy) evidence. This pilot and feasibility trial has been designed to inform the design of an adequately powered definitive randomised controlled trial. Two parallel phases. (1) An external pilot and feasibility cluster randomised double-blind (assessor and participants), parallel two-arm (ABPI vs standard physiotherapy) clinical trial to evaluate procedures and feasibility. Six UK private physiotherapy clinics will be recruited and cluster randomised by a computer-generated randomisation sequence. Sixty participants (30 each arm) will be assessed at recruitment (baseline) and at 3 months postbaseline. The planned primary outcome measure is the neck disability index. (2) An embedded exploratory qualitative study using semistructured indepth interviews (n=3-4 physiotherapists) and a focus group (n=6-8 patients) and entailing the recruitment of purposive samples will explore perceptions of the ABPI. Quantitative data will be analysed descriptively. Qualitative data will be coded and analysed deductively (identify themes) and inductively (identify additional themes). This trial is approved by the University of Birmingham Ethics Committee (ERN_15-0542). ISRCTN84528320. Published by the BMJ Publishing Group Limited. 
Azeredo, Catarina Machado; Levy, Renata Bertazzi; Peres, Maria Fernanda Tourinho; Menezes, Paulo Rossi; Araya, Ricardo
2016-11-10
The aim of this study was to analyse the clustering of multiple health-related behaviours among adolescents and describe which socio-demographic characteristics are associated with these patterns. Cross-sectional study. Brazilian schools assessed by the National Survey of School Health (PeNSE, 2012). 104 109 Brazilian ninth-grade students from public and private schools (response rate=82.7%). Exploratory and confirmatory factor analyses were performed to identify behaviour clustering and linear regression models were used to identify socio-demographic characteristics associated with each one of these behaviour patterns. We identified a good fit model with three behaviour patterns. The first was labelled 'problem-behaviour' and included aggressive behaviour, alcohol consumption, smoking, drug use and unsafe sex; the second was labelled 'health-compromising diet and sedentary behaviours' and included unhealthy food indicators and sedentary behaviour; and the third was labelled 'health-promoting diet and physical activity' and included healthy food indicators and physical activity. No differences in behaviour patterns were found between genders. The problem-behaviour pattern was associated with male gender, older age, more developed region (socially and economically) and public schools (compared with private). The 'health-compromising diet and sedentary behaviours' pattern was associated with female gender, older age, mothers with higher education level and more developed region. The 'health-promoting diet and physical activity' pattern was associated with male gender and mothers with higher education level. Three health-related behaviour patterns were found among Brazilian adolescents. Interventions to decrease those negative patterns should take into account how these behaviours cluster together and the individuals most at risk. Published by the BMJ Publishing Group Limited. 
The ergot alkaloid gene cluster: functional analyses and evolutionary aspects.
Lorenz, Nicole; Haarmann, Thomas; Pazoutová, Sylvie; Jung, Manfred; Tudzynski, Paul
2009-01-01
Ergot alkaloids and their derivatives have traditionally been used as therapeutic agents in migraine and blood pressure regulation and as aids in childbirth and abortion. Their production in submerged culture is a long-established biotechnological process. Ergot alkaloids are produced mainly by members of the genus Claviceps, with Claviceps purpurea as the best-investigated species concerning the biochemistry of ergot alkaloid synthesis (EAS). Genes encoding enzymes involved in EAS have been shown to be clustered; functional analyses of EAS cluster genes have allowed specific functions to be assigned to several gene products. Various Claviceps species differ with respect to their host specificity and their alkaloid content; comparison of the ergot alkaloid clusters in these species (and of clavine alkaloid clusters in other genera) yields interesting insights into the evolution of cluster structure. This review focuses on recently published and as yet unpublished data on the structure and evolution of the EAS gene cluster and on the function and regulation of cluster genes. These analyses also have significant biotechnological implications: the characterization of non-ribosomal peptide synthetases (NRPS) involved in the synthesis of the peptide moiety of ergopeptines opened interesting perspectives for the synthesis of ergot alkaloids; on the other hand, defined mutants could be generated producing interesting intermediates or only single peptide alkaloids (instead of the alkaloid mixtures usually produced by industrial strains).
Detecting space-time cancer clusters using residential histories
NASA Astrophysics Data System (ADS)
Jacquez, Geoffrey M.; Meliker, Jaymie R.
2007-04-01
Methods for analyzing geographic clusters of disease typically ignore the space-time variability inherent in epidemiologic datasets, do not adequately account for known risk factors (e.g., smoking and education) or covariates (e.g., age, gender, and race), and do not permit investigation of the latency window between exposure and disease. Our research group recently developed Q-statistics for evaluating space-time clustering in cancer case-control studies with residential histories. This technique relies on time-dependent nearest neighbor relationships to examine clustering at any moment in the life-course of the residential histories of cases relative to that of controls. In addition, in place of the widely used null hypothesis of spatial randomness, each individual's probability of being a case is instead based on his/her risk factors and covariates. Case-control clusters will be presented using residential histories of 220 bladder cancer cases and 440 controls in Michigan. In preliminary analyses of this dataset, smoking, age, gender, race and education were sufficient to explain the majority of the clustering of residential histories of the cases. Clusters of unexplained risk, however, were identified surrounding the business address histories of 10 industries that emit known or suspected bladder cancer carcinogens. The clustering of 5 of these industries began in the 1970's and persisted through the 1990's. This systematic approach for evaluating space-time clustering has the potential to generate novel hypotheses about environmental risk factors. These methods may be extended to detect differences in space-time patterns of any two groups of people, making them valuable for security intelligence and surveillance operations.
Grieve, Richard; Nixon, Richard; Thompson, Simon G
2010-01-01
Cost-effectiveness analyses (CEA) may be undertaken alongside cluster randomized trials (CRTs) where randomization is at the level of the cluster (for example, the hospital or primary care provider) rather than the individual. Costs (and outcomes) within clusters may be correlated so that the assumption made by standard bivariate regression models, that observations are independent, is incorrect. This study develops a flexible modeling framework to acknowledge the clustering in CEA that use CRTs. The authors extend previous Bayesian bivariate models for CEA of multicenter trials to recognize the specific form of clustering in CRTs. They develop new Bayesian hierarchical models (BHMs) that allow mean costs and outcomes, and also variances, to differ across clusters. They illustrate how each model can be applied using data from a large (1732 cases, 70 primary care providers) CRT evaluating alternative interventions for reducing postnatal depression. The analyses compare cost-effectiveness estimates from BHMs with standard bivariate regression models that ignore the data hierarchy. The BHMs show high levels of cost heterogeneity across clusters (intracluster correlation coefficient, 0.17). Compared with standard regression models, the BHMs yield substantially increased uncertainty surrounding the cost-effectiveness estimates, and altered point estimates. The authors conclude that ignoring clustering can lead to incorrect inferences. The BHMs that they present offer a flexible modeling framework that can be applied more generally to CEA that use CRTs.
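To give a sense of why ignoring clustering matters in the trial described in this abstract, the following sketch computes the textbook design effect, 1 + (m - 1) * ICC, from the figures quoted above (1732 cases, 70 providers, cost ICC of 0.17). This is not the authors' Bayesian hierarchical model: it assumes equal cluster sizes, uses the mean cluster size as m, and is only a rough back-of-envelope illustration.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


# Figures quoted in the abstract.
n_cases = 1732
n_clusters = 70
icc_cost = 0.17

# Assumption: treat the average number of cases per provider as the cluster size.
mean_size = n_cases / n_clusters
deff = design_effect(mean_size, icc_cost)

# Number of independent observations carrying the same information.
effective_n = n_cases / deff

print(f"mean cluster size = {mean_size:.1f}")
print(f"design effect     = {deff:.2f}")
print(f"effective n       = {effective_n:.0f}")
```

Under these assumptions the cost data behave like a much smaller independent sample, which is consistent with the abstract's report that the hierarchical models yield substantially wider uncertainty than standard regression.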
Kiwuwa-Muyingo, Sylvia; Nazziwa, Jamirah; Ssemwanga, Deogratius; Ilmonen, Pauliina; Ndembi, Nicaise; Parry, Chris; Kitandwe, Paul Kato; Gershim, Asiki; Mpendo, Juliet; Neilsen, Leslie; Seeley, Janet; Seppälä, Heikki; Lyagoba, Fred; Kamali, Anatoli; Kaleebu, Pontiano
2017-01-01
Background Fishing communities around Lake Victoria in sub-Saharan Africa have been characterised as a population at high risk of HIV-infection. Methods Using data from a cohort of HIV-positive individuals aged 13–49 years, enrolled from 5 fishing communities on Lake Victoria between 2009–2011, we sought to identify factors contributing to the epidemic and to understand the underlying structure of HIV transmission networks. Clinical and socio-demographic data were combined with HIV-1 phylogenetic analyses. HIV-1 gag-p24 and env-gp-41 sub-genomic fragments were amplified and sequenced from 283 HIV-1-infected participants. Phylogenetic clusters with ≥2 highly related sequences were defined as transmission clusters. Logistic regression models were used to determine factors associated with clustering. Results Altogether, 24% (n = 67/283) of HIV positive individuals with sequences fell within 34 phylogenetically distinct clusters in at least one gene region (either gag or env). Of these, 83% occurred either within households or within community; 8/34 (24%) occurred within household partnerships, and 20/34 (59%) within community. 7/12 couples (58%) within households clustered together. Individuals in clusters with potential recent transmission (11/34) were more likely to be younger (71% [15/21] versus 46% [21/46] in un-clustered individuals) and to have recently become resident in the community (67% [14/21] vs 48% [22/46]). Four of 11 (36%) potential transmission clusters included incident-incident transmissions. Independently, clustering was less likely in HIV subtype D (adjusted Odds Ratio, aOR = 0.51 [95% CI 0.26–1.00]) than A and more likely in those living with an HIV-infected individual in the household (aOR = 6.30 [95% CI 3.40–11.68]). Conclusions A large proportion of HIV sexual transmissions occur within households and within communities even in this key mobile population. 
The findings suggest localized HIV transmissions and hence a potential benefit for the test and treat approach even at a community level, coupled with intensified HIV counselling to identify early infections. PMID:29023474
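The counts quoted in this abstract (15 of 21 versus 21 of 46 classified as younger) are enough to work out an unadjusted odds ratio by hand. The sketch below does that with a Woolf (log-based) 95% confidence interval. It is an illustration only: the abstract's reported aORs come from logistic regression models for different factors, the age cut-off for "younger" is not given, and the 2x2 layout used here is an assumption about how the two groups were compared.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    a, b: counts with / without the characteristic in group 1.
    c, d: counts with / without the characteristic in group 2.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Group 1: individuals in recent-transmission clusters, 15 younger of 21.
# Group 2: comparison individuals, 21 younger of 46.
crude_or, lower, upper = odds_ratio_with_ci(15, 21 - 15, 21, 46 - 21)

print(f"crude OR = {crude_or:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With groups this small the interval is wide, which is one reason the study reports model-based adjusted estimates rather than relying on crude comparisons.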
Kiwuwa-Muyingo, Sylvia; Nazziwa, Jamirah; Ssemwanga, Deogratius; Ilmonen, Pauliina; Njai, Harr; Ndembi, Nicaise; Parry, Chris; Kitandwe, Paul Kato; Gershim, Asiki; Mpendo, Juliet; Neilsen, Leslie; Seeley, Janet; Seppälä, Heikki; Lyagoba, Fred; Kamali, Anatoli; Kaleebu, Pontiano
2017-01-01
Fishing communities around Lake Victoria in sub-Saharan Africa have been characterised as a population at high risk of HIV-infection. Using data from a cohort of HIV-positive individuals aged 13-49 years, enrolled from 5 fishing communities on Lake Victoria between 2009-2011, we sought to identify factors contributing to the epidemic and to understand the underlying structure of HIV transmission networks. Clinical and socio-demographic data were combined with HIV-1 phylogenetic analyses. HIV-1 gag-p24 and env-gp-41 sub-genomic fragments were amplified and sequenced from 283 HIV-1-infected participants. Phylogenetic clusters with ≥2 highly related sequences were defined as transmission clusters. Logistic regression models were used to determine factors associated with clustering. Altogether, 24% (n = 67/283) of HIV positive individuals with sequences fell within 34 phylogenetically distinct clusters in at least one gene region (either gag or env). Of these, 83% occurred either within households or within community; 8/34 (24%) occurred within household partnerships, and 20/34 (59%) within community. 7/12 couples (58%) within households clustered together. Individuals in clusters with potential recent transmission (11/34) were more likely to be younger (71% [15/21] versus 46% [21/46] in un-clustered individuals) and to have recently become resident in the community (67% [14/21] vs 48% [22/46]). Four of 11 (36%) potential transmission clusters included incident-incident transmissions. Independently, clustering was less likely in HIV subtype D (adjusted Odds Ratio, aOR = 0.51 [95% CI 0.26-1.00]) than A and more likely in those living with an HIV-infected individual in the household (aOR = 6.30 [95% CI 3.40-11.68]). A large proportion of HIV sexual transmissions occur within households and within communities even in this key mobile population. 
The findings suggest localized HIV transmissions and hence a potential benefit for the test and treat approach even at a community level, coupled with intensified HIV counselling to identify early infections.
NASA Astrophysics Data System (ADS)
Benson, Bryant Joseph
Context: Galaxy clusters are the most massive gravitationally bound structures in the universe and are formed through the process of hierarchical clustering, in which smaller systems undergo a series of mergers to form ever larger clusters. Because of the masses involved, mergers between these giants provide a unique laboratory for observing many interesting astrophysical processes. These merging systems also act as large dark matter colliders, because the dark matter halos of the clusters involved pass through each other during the merger. This offers us a means to observe if dark matter-dark matter collisions result in momentum exchange beyond what occurs from gravity alone. Such observations can help us to unravel some of the mysteries behind dark matter, such as does it interact with itself through mechanisms beyond gravity, and how strong are those interactions. Answers to questions like these are what will eventually allow us to discover what dark matter really is. However, the extremely long time scales for these mergers (~several billion years) make each observation a single snapshot in the long merger history, and we must infer many of the details necessary for understanding the full merger process. Furthermore, current weak lensing analyses lack the precision required to detect a signal from self-interacting dark matter. Uncertain weak lensing mass and position estimates also yield large uncertainties in the dynamical reconstruction of the merger scenarios. Need: In order to better model the dynamics of merging galaxy cluster systems, and to potentially measure any signal from self-interacting dark matter, we need to obtain more precise measurements on the masses and positions of the dark matter halos involved. Gravitational lensing offers a robust method for mapping the mass in these clusters because it directly measures the gravitational field, and does not depend on the dynamical state of the system that has been disturbed in the merger process. 
Of the lensing methods, weak gravitational lensing is the only way that we can probe a wide field and measure the total mass of the cluster. However, the precision of conventional weak lensing techniques is currently limited by shape noise (uncertainty in the shear due to the dispersion in the intrinsic shapes and orientations of unlensed galaxies). A possible avenue forward is to eliminate shape noise as a source of uncertainty in shear measurements via a technique to be described below. This would eliminate the largest source of uncertainty in weak lensing analyses, and enable us to obtain mass and position estimates of dark matter halos with a much higher level of precision. Task: In this dissertation we perform statistical clustering, conventional weak lensing analyses, and dynamical reconstruction on the merging galaxy cluster system ZwCl 2341.1+0000 in order to test the capabilities of the dynamical modeling on a complex, multiple merger. We use targeted optical spectroscopy to identify cluster member galaxies, which we then use to model the galaxy substructures. We also obtain a dynamical mass estimate using the galaxy velocity dispersions, and perform weak lensing analyses in the forms of aperture densitometry to place an upper bound on the total cluster mass, and multiple NFW profile halo fitting to approximate the masses and positions of the individual dark matter halos present in the merger. The masses, positions, and line of sight velocities of those clusters are then used to constrain the parameters describing the best fit merger scenario, with radio relic positions and polarization used to further tighten those constraints. We also develop a new method for obtaining weak lensing data from individual source galaxies in the form of shear measurements that are independent of shape noise, and direct measurements of the convergence. 
We accomplish this by simultaneously modeling the pre-lensing velocity and intensity profiles of a lensed, rotating disk galaxy, and the lensing transform required to distort those into the lensed profiles we observe. We test this method with a host of idealized simulations to characterize its capabilities in a best-case scenario and forecast the possible improvements it can bring to the precision of weak lensing analyses on galaxy clusters. (Abstract shortened by ProQuest.).
Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén
2015-12-02
Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide range of commodities and is highly toxic for humans and animals. Little is known about the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches, which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNAs were sequenced and their expression was analysed in three A. steynii strains using real time RT-PCR specific assays in permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent, pointed to the involvement of these three genes in OTA biosynthesis by A. steynii, and showed a co-ordinated expression pattern. This is the first time that a clustered organization of OTA biosynthetic genes has been reported in the Aspergillus genus. The results also suggested that this situation might be common in Aspergillus OTA-producing species and distinct from the one described for Penicillium species. Copyright © 2015 Elsevier B.V. All rights reserved.
Hammas, Karima; Yaouanq, Jacqueline; Lannes, Morgane; Edan, Gilles; Viel, Jean-François
2017-09-21
Despite intensive research over several decades, the etiology of multiple sclerosis (MS) remains poorly understood, although environmental factors are supposedly implicated. Our goal was to identify spatial clusters of MS incident cases at the small-area level to provide clues to local environmental risk factors that might cause or trigger the disease. A population-based and multi-stage study was performed in the French Brittany region to accurately ascertain the clinical onset of disease during the 2000-2004 period. The municipality of residence at the time of clinical onset was geocoded. To test for the presence of MS incidence clusters and to identify their approximate locations, we used a spatial scan statistic. We adjusted for socioeconomic deprivation, known to be strongly associated with increased MS incidence rates, and scanned simultaneously for areas with either high or low rates. Sensitivity analyses (focusing on relapsing-remitting forms and/or places of residence available within the year following clinical onset) were performed. A total of 848 incident cases of MS were registered in Brittany, corresponding to a crude annual incidence rate of 5.8 per 100,000 inhabitants. The spatial scan statistic did not find a significant cluster of MS incidence in either the primary analysis (p value ≥ 0.56) or in the sensitivity analyses (p value ≥ 0.16). The findings of this study indicate that MS incidence is not markedly affected across space, suggesting that in the years preceding the first clinical expression of the disease, no environmental trigger is operative at the small-area population level in the French Brittany region.
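The crude annual incidence rate quoted in this abstract can be related to the case count with simple arithmetic: rate = cases / person-years x 100,000. The abstract does not state the population denominator, so the sketch below works backwards from the reported figures (848 cases, 2000-2004, 5.8 per 100,000) to the person-years and average population they imply. The five-year span and the back-calculated population are inferences for illustration, not numbers reported by the authors.

```python
def crude_incidence_rate(cases, person_years, per=100_000):
    """Crude incidence rate per `per` person-years."""
    return cases / person_years * per


# Figures quoted in the abstract.
cases = 848
reported_rate = 5.8  # per 100,000 inhabitants per year

# Assumption: the 2000-2004 period is counted as five full calendar years.
years = 5

# Back-calculate the denominator implied by the reported rate.
implied_person_years = cases / reported_rate * 100_000
implied_population = implied_person_years / years

print(f"implied person-years = {implied_person_years:,.0f}")
print(f"implied population   = {implied_population:,.0f}")
print(f"check: rate = {crude_incidence_rate(cases, implied_person_years):.1f} per 100,000")
```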
Camu, Nicholas; De Winter, Tom; Verbrugghe, Kristof; Cleenwerck, Ilse; Vandamme, Peter; Takrama, Jemmy S.; Vancanneyt, Marc; De Vuyst, Luc
2007-01-01
The Ghanaian cocoa bean heap fermentation process was studied through a multiphasic approach, encompassing both microbiological and metabolite target analyses. A culture-dependent (plating and incubation, followed by repetitive-sequence-based PCR analyses of picked-up colonies) and culture-independent (denaturing gradient gel electrophoresis [DGGE] of 16S rRNA gene amplicons, PCR-DGGE) approach revealed a limited biodiversity and targeted population dynamics of both lactic acid bacteria (LAB) and acetic acid bacteria (AAB) during fermentation. Four main clusters were identified among the LAB isolated: Lactobacillus plantarum, Lactobacillus fermentum, Leuconostoc pseudomesenteroides, and Enterococcus casseliflavus. Other taxa encompassed, for instance, Weissella. Only four clusters were found among the AAB identified: Acetobacter pasteurianus, Acetobacter syzygii-like bacteria, and two small clusters of Acetobacter tropicalis-like bacteria. Particular strains of L. plantarum, L. fermentum, and A. pasteurianus, originating from the environment, were well adapted to the environmental conditions prevailing during Ghanaian cocoa bean heap fermentation and apparently played a significant role in the cocoa bean fermentation process. Yeasts produced ethanol from sugars, and LAB produced lactic acid, acetic acid, ethanol, and mannitol from sugars and/or citrate. Whereas L. plantarum strains were abundant in the beginning of the fermentation, L. fermentum strains converted fructose into mannitol upon prolonged fermentation. A. pasteurianus grew on ethanol, mannitol, and lactate and converted ethanol into acetic acid. A newly proposed Weissella sp., referred to as “Weissella ghanaensis,” was detected through PCR-DGGE analysis in some of the fermentations and was only occasionally picked up through culture-based isolation. Two new species of Acetobacter were found as well, namely, the species tentatively named “Acetobacter senegalensis” (A. tropicalis-like) and “Acetobacter ghanaensis” (A. syzygii-like). PMID:17277227
Spatiotemporal Dynamics of Scrub Typhus Transmission in Mainland China, 2006-2014
Hu, Wen-Biao; Haque, Ubydul; Weppelmann, Thomas A.; Wang, Yong; Liu, Yun-Xi; Li, Xin-Lou; Sun, Hai-Long; Sun, Yan-Song; Clements, Archie C. A.; Li, Shen-Long; Zhang, Wen-Yi
2016-01-01
Background: Scrub typhus is endemic in the Asia-Pacific region including China, and the number of reported cases has increased dramatically in the past decade. However, the spatial-temporal dynamics and the potential risk factors in transmission of scrub typhus in mainland China have yet to be characterized. Objective: This study aims to explore the spatiotemporal dynamics of reported scrub typhus cases in mainland China between January 2006 and December 2014, to detect the location of high risk spatiotemporal clusters of scrub typhus cases, and identify the potential risk factors affecting the re-emergence of the disease. Method: Monthly cases of scrub typhus reported at the county level between 2006 and 2014 were obtained from the Chinese Center for Diseases Control and Prevention. Time-series analyses, spatiotemporal cluster analyses, and spatial scan statistics were used to explore the characteristics of the scrub typhus incidence. To explore the association between scrub typhus incidence and environmental variables panel Poisson regression analysis was conducted. Results: During the time period between 2006 and 2014 a total of 54,558 scrub typhus cases were reported in mainland China, which grew exponentially. The majority of cases were reported each year between July and November, with peak incidence during October every year. The spatiotemporal dynamics of scrub typhus varied over the study period with high-risk clusters identified in southwest, southern, and middle-eastern part of China. Scrub typhus incidence was positively correlated with the percentage of shrub and meteorological variables including temperature and precipitation. Conclusions: The results of this study demonstrate areas in China that could be targeted with public health interventions to mitigate the growing threat of scrub typhus in the country. PMID:27479297
Spatiotemporal Dynamics of Scrub Typhus Transmission in Mainland China, 2006-2014.
Wu, Yi-Cheng; Qian, Quan; Soares Magalhaes, Ricardo J; Han, Zhi-Hai; Hu, Wen-Biao; Haque, Ubydul; Weppelmann, Thomas A; Wang, Yong; Liu, Yun-Xi; Li, Xin-Lou; Sun, Hai-Long; Sun, Yan-Song; Clements, Archie C A; Li, Shen-Long; Zhang, Wen-Yi
2016-08-01
Scrub typhus is endemic in the Asia-Pacific region including China, and the number of reported cases has increased dramatically in the past decade. However, the spatial-temporal dynamics and the potential risk factors in transmission of scrub typhus in mainland China have yet to be characterized. This study aims to explore the spatiotemporal dynamics of reported scrub typhus cases in mainland China between January 2006 and December 2014, to detect the location of high risk spatiotemporal clusters of scrub typhus cases, and identify the potential risk factors affecting the re-emergence of the disease. Monthly cases of scrub typhus reported at the county level between 2006 and 2014 were obtained from the Chinese Center for Diseases Control and Prevention. Time-series analyses, spatiotemporal cluster analyses, and spatial scan statistics were used to explore the characteristics of the scrub typhus incidence. To explore the association between scrub typhus incidence and environmental variables panel Poisson regression analysis was conducted. During the time period between 2006 and 2014 a total of 54,558 scrub typhus cases were reported in mainland China, which grew exponentially. The majority of cases were reported each year between July and November, with peak incidence during October every year. The spatiotemporal dynamics of scrub typhus varied over the study period with high-risk clusters identified in southwest, southern, and middle-eastern part of China. Scrub typhus incidence was positively correlated with the percentage of shrub and meteorological variables including temperature and precipitation. The results of this study demonstrate areas in China that could be targeted with public health interventions to mitigate the growing threat of scrub typhus in the country.
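The observation that reported cases grew exponentially can be checked informally by fitting a straight line to the logarithm of yearly counts. The Python sketch below does this with ordinary least squares. It is a simplified stand-in and not the panel Poisson regression the authors describe; the function name and the counts are hypothetical.

```python
import math

def exponential_growth_rate(counts):
    """Least-squares slope of log(count) against the time index 0, 1, 2, ..."""
    n = len(counts)
    xs = list(range(n))
    ys = [math.log(c) for c in counts]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Made-up yearly counts that double every year.
rate = exponential_growth_rate([100 * 2 ** t for t in range(5)])
doubling_time_years = math.log(2) / rate
```

A roughly constant slope on the log scale is what exponential growth looks like; a slope that drifts over time would argue against it.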
The Adequate Corpus Luteum: miR-96 Promotes Luteal Cell Survival and Progesterone Production.
Mohammed, Bushra T; Sontakke, Sadanand D; Ioannidis, Jason; Duncan, W Colin; Donadeu, F Xavier
2017-07-01
Inadequate progesterone production from the corpus luteum is associated with pregnancy loss. Data available in model species suggest important roles of microRNAs (miRNAs) in luteal development and maintenance. To comprehensively investigate the involvement of miRNAs during the ovarian follicle-luteal transition. The effects of specific miRNAs on survival and steroid production by human luteinized granulosa cells (hLGCs) were tested using specific miRNA inhibitors. Candidate miRNAs were identified through microarray analyses of follicular and luteal tissues in a bovine model. An academic institution in the United Kingdom associated with a teaching hospital. hLGCs were obtained by standard transvaginal follicular-fluid aspiration from 35 women undergoing assisted conception. Inhibition of candidate miRNAs in vitro. Levels of miRNAs, mRNAs, FOXO1 protein, apoptosis, and steroids were measured in tissues and/or cultured cells. Two specific miRNA clusters, miR-183-96-182 and miR-212-132, were dramatically increased in luteal relative to follicular tissues. miR-96 and miR-132 were the most upregulated miRNAs within each cluster. Database analyses identified FOXO1 as a putative target of both these miRNAs. In cultured hLGCs, inhibition of miR-96 increased apoptosis and FOXO1 protein levels, and decreased progesterone production. These effects were prevented by small interfering RNA-mediated downregulation of FOXO1. In bovine luteal cells, miR-96 inhibition also led to increases in apoptosis and FOXO1 protein levels. miR-96 targets FOXO1 to regulate luteal development through effects on cell survival and steroid production. The miR-183-96-182 cluster could provide a novel target for the manipulation of luteal function. Copyright © 2017 Endocrine Society
Sundqvist, Martin; Granholm, Susanne; Naseer, Umaer; Rydén, Patrik; Brolund, Alma; Sundsfjord, Arnfinn; Kahlmeter, Gunnar; Johansson, Anders
2014-12-01
A 2-year prospective intervention on the prescription of trimethoprim reduced the use by 85% in a health care region with 178,000 inhabitants. Here, we performed before-and-after analyses of the within-population distribution of trimethoprim resistance in Escherichia coli. Phylogenetic and population genetic methods were applied to multilocus sequence typing data of 548 consecutively collected E. coli isolates from clinical urinary specimens. Results were analyzed in relation to antibiotic susceptibility and the presence and genomic location of different trimethoprim resistance gene classes. A total of 163 E. coli sequence types (STs) were identified, of which 68 were previously undescribed. The isolates fell into one of three distinct genetic clusters designated BAPS 1 (E. coli phylogroup B2), BAPS 2 (phylogroup A and B1), and BAPS 3 (phylogroup D), each with a similar frequency before and after the intervention. BAPS 2 and BAPS 3 were positively and BAPS 1 was negatively associated with trimethoprim resistance (odds ratios of 1.97, 3.17, and 0.26, respectively). In before-and-after analyses, trimethoprim resistance frequency increased in BAPS 1 and decreased in BAPS 2. Resistance to antibiotics other than trimethoprim increased in BAPS 2. Analysis of the genomic location of different trimethoprim resistance genes in isolates of ST69, ST58, and ST73 identified multiple independent acquisition events in isolates of the same ST. The results show that despite a stable overall resistance frequency in E. coli before and after the intervention, marked within-population changes occurred. A decrease of resistance in one major genetic cluster was masked by a reciprocal increase in another major cluster. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Sundqvist, Martin; Granholm, Susanne; Naseer, Umaer; Rydén, Patrik; Brolund, Alma; Sundsfjord, Arnfinn; Kahlmeter, Gunnar
2014-01-01
A 2-year prospective intervention on the prescription of trimethoprim reduced the use by 85% in a health care region with 178,000 inhabitants. Here, we performed before-and-after analyses of the within-population distribution of trimethoprim resistance in Escherichia coli. Phylogenetic and population genetic methods were applied to multilocus sequence typing data of 548 consecutively collected E. coli isolates from clinical urinary specimens. Results were analyzed in relation to antibiotic susceptibility and the presence and genomic location of different trimethoprim resistance gene classes. A total of 163 E. coli sequence types (STs) were identified, of which 68 were previously undescribed. The isolates fell into one of three distinct genetic clusters designated BAPS 1 (E. coli phylogroup B2), BAPS 2 (phylogroup A and B1), and BAPS 3 (phylogroup D), each with a similar frequency before and after the intervention. BAPS 2 and BAPS 3 were positively and BAPS 1 was negatively associated with trimethoprim resistance (odds ratios of 1.97, 3.17, and 0.26, respectively). In before-and-after analyses, trimethoprim resistance frequency increased in BAPS 1 and decreased in BAPS 2. Resistance to antibiotics other than trimethoprim increased in BAPS 2. Analysis of the genomic location of different trimethoprim resistance genes in isolates of ST69, ST58, and ST73 identified multiple independent acquisition events in isolates of the same ST. The results show that despite a stable overall resistance frequency in E. coli before and after the intervention, marked within-population changes occurred. A decrease of resistance in one major genetic cluster was masked by a reciprocal increase in another major cluster. PMID:25288078
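The odds ratios reported for the BAPS clusters compare resistance within one cluster against resistance in the remaining isolates. The Python sketch below shows how such an odds ratio and an approximate 95% confidence interval (Woolf's log method) can be computed from a 2×2 table. The counts are invented for illustration and are not the study's data, and the authors' exact method is not stated in the abstract.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a: in cluster, resistant      b: in cluster, susceptible
    c: other isolates, resistant  d: other isolates, susceptible
    All four cells must be non-zero.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical table: 20/100 resistant in the cluster vs 10/100 elsewhere.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
```

An interval that excludes 1 indicates an association at roughly the 5% level under these assumptions.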
Using coordinate-based meta-analyses to explore structural imaging genetics.
Janouschek, Hildegard; Eickhoff, Claudia R; Mühleisen, Thomas W; Eickhoff, Simon B; Nickl-Jockschat, Thomas
2018-05-05
Imaging genetics has become a highly popular approach in the field of schizophrenia research. A frequently reported finding is that effects from common genetic variation are associated with a schizophrenia-related structural endophenotype. Genetic contributions to a structural endophenotype may be easier to delineate, when referring to biological rather than diagnostic criteria. We used coordinate-based meta-analyses, namely the anatomical likelihood estimation (ALE) algorithm on 30 schizophrenia-related imaging genetics studies, representing 44 single-nucleotide polymorphisms at 26 gene loci investigated in 4682 subjects. To test whether analyses based on biological information would improve the convergence of results, gene ontology (GO) terms were used to group the findings from the published studies. We did not find any significant results for the main contrast. However, our analysis enrolling studies on genotype × diagnosis interaction yielded two clusters in the left temporal lobe and the medial orbitofrontal cortex. All other subanalyses did not yield any significant results. To gain insight into possible biological relationships between the genes implicated by these clusters, we mapped five of them to GO terms of the category "biological process" (AKT1, CNNM2, DISC1, DTNBP1, VAV3), then five to "cellular component" terms (AKT1, CNNM2, DISC1, DTNBP1, VAV3), and three to "molecular function" terms (AKT1, VAV3, ZNF804A). A subsequent cluster analysis identified representative, non-redundant subsets of semantically similar terms that aided a further interpretation. We regard this approach as a new option to systematically explore the richness of the literature in imaging genetics.
A Web service substitution method based on service cluster nets
NASA Astrophysics Data System (ADS)
Du, YuYue; Gai, JunJing; Zhou, MengChu
2017-11-01
Service substitution is an important research topic in the fields of Web services and service-oriented computing. This work presents a novel method to analyse and substitute Web services. A new concept, called a Service Cluster Net Unit, is proposed based on Web service clusters. A service cluster is converted into a Service Cluster Net Unit. Then it is used to analyse whether the services in the cluster can satisfy some service requests. Meanwhile, the substitution methods of an atomic service and a composite service are proposed. The correctness of the proposed method is proved, and the effectiveness is shown and compared with the state-of-the-art method via an experiment. It can be readily applied to e-commerce service substitution to meet the business automation needs.
3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure
NASA Astrophysics Data System (ADS)
Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.
2016-06-01
Locating and analysing sites for new stores or outlets is a common problem for retailers and franchisers, who need to ensure that newly opened stores sit at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, because franchises and chain stores in urban areas operate within high-rise, multi-level buildings, a three-dimensional (3D) method is required to locate and identify the surrounding information, such as the level at which a franchise unit will be located or whether a franchise unit is at the best level for visibility. One of the analyses commonly used to retrieve surrounding information is Nearest Neighbour (NN) analysis, which takes a point location and identifies its surrounding neighbours. However, with the immense number of urban datasets, retrieving and analysing nearest neighbour information efficiently becomes more complex and more crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared with existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.
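The general idea of speeding up a 3D nearest neighbour query by grouping points into clusters can be illustrated with a two-level structure. The Python sketch below is not the authors' clustered hierarchical tree: it assumes a simple grid-based grouping, stores a centroid and radius per group, and skips any group whose bounding sphere cannot contain a closer point. All names, the cell size and the sample coordinates are hypothetical.

```python
import math

def build_clusters(points, cell=10.0):
    """Group 3D points by grid cell and record each group's bounding sphere."""
    groups = {}
    for p in points:
        key = tuple(math.floor(v / cell) for v in p)
        groups.setdefault(key, []).append(p)
    clusters = []
    for members in groups.values():
        centroid = tuple(sum(axis) / len(members) for axis in zip(*members))
        radius = max(math.dist(centroid, m) for m in members)
        clusters.append((centroid, radius, members))
    return clusters

def nearest(clusters, query):
    """Return (point, distance) of the nearest stored point to the query."""
    def lower_bound(cluster):
        # No member of the cluster can be closer to the query than this.
        return math.dist(cluster[0], query) - cluster[1]

    best, best_d = None, math.inf
    for cluster in sorted(clusters, key=lower_bound):
        if lower_bound(cluster) >= best_d:
            break  # every remaining cluster is at least this far away
        for m in cluster[2]:
            d = math.dist(m, query)
            if d < best_d:
                best, best_d = m, d
    return best, best_d

# Hypothetical unit locations as (x, y, floor height).
units = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (12.0, 3.0, 4.0),
         (25.0, 25.0, 25.0), (11.0, 2.0, 5.0)]
query = (10.0, 2.0, 4.0)
found, distance = nearest(build_clusters(units), query)
```

Visiting groups in order of their lower-bound distance is what makes the early exit safe; a real index would nest such groups over several levels.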
Identifying seizure clusters in patients with epilepsy
Lipton, R. B.; LeValley, A. J.; Hall, C. B.; Shinnar, S.
2006-01-01
Clinicians often encounter patients whose neurologic attacks appear to cluster. In a daily diary study, the authors explored whether clustering is a true phenomenon in epilepsy and can be identified in the clinical setting. Nearly half the subjects experienced at least one episode of three or more seizures in 24 hours; 20% also met a statistical clustering criterion. Utilizing the clinical definition of clustering should identify all seizure clusterers, and false positives can be determined with diary data. PMID:16247068
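The clinical definition used here, three or more seizures in 24 hours, can be checked mechanically against diary data. The Python sketch below assumes seizure times are recorded in hours from the start of the diary and treats a span of exactly 24 hours as inside the window; both choices, and the function name, are assumptions rather than details from the study. The statistical clustering criterion is not reproduced.

```python
def has_clinical_cluster(times_h, min_events=3, window_h=24.0):
    """True if any window of window_h hours holds at least min_events seizures."""
    times = sorted(times_h)
    for i in range(len(times) - min_events + 1):
        # Compare the i-th seizure with the one min_events - 1 places later.
        if times[i + min_events - 1] - times[i] <= window_h:
            return True
    return False

# Hypothetical diaries (hours since diary start).
clustered = has_clinical_cluster([0.0, 5.0, 20.0, 100.0])
spread_out = has_clinical_cluster([0.0, 30.0, 60.0, 90.0])
```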
Spittal, Matthew J; Carlin, John B; Currier, Dianne; Downes, Marnie; English, Dallas R; Gordon, Ian; Pirkis, Jane; Gurrin, Lyle
2016-10-31
The Australian Longitudinal Study on Male Health (Ten to Men) used a complex sampling scheme to identify potential participants for the baseline survey. This raises important questions about when and how to adjust for the sampling design when analyzing data from the baseline survey. We describe the sampling scheme used in Ten to Men focusing on four important elements: stratification, multi-stage sampling, clustering and sample weights. We discuss how these elements fit together when using baseline data to estimate a population parameter (e.g., population mean or prevalence) or to estimate the association between an exposure and an outcome (e.g., an odds ratio). We illustrate this with examples using a continuous outcome (weight in kilograms) and a binary outcome (smoking status). Estimates of a population mean or disease prevalence using Ten to Men baseline data are influenced by the extent to which the sampling design is addressed in an analysis. Estimates of mean weight and smoking prevalence are larger in unweighted analyses than weighted analyses (e.g., mean = 83.9 kg vs. 81.4 kg; prevalence = 18.0 % vs. 16.7 %, for unweighted and weighted analyses respectively) and the standard error of the mean is 1.03 times larger in an analysis that acknowledges the hierarchical (clustered) structure of the data compared with one that does not. For smoking prevalence, the corresponding standard error is 1.07 times larger. Measures of association (mean group differences, odds ratios) are generally similar in unweighted or weighted analyses and whether or not adjustment is made for clustering. The extent to which the Ten to Men sampling design is accounted for in any analysis of the baseline data will depend on the research question. When the goals of the analysis are to estimate the prevalence of a disease or risk factor in the population or the magnitude of a population-level exposure-outcome association, our advice is to adopt an analysis that respects the sampling design.
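The gap between unweighted and weighted estimates described above comes from giving each respondent a weight that reflects how many people in the population he represents. The Python sketch below shows a weighted mean and converts a ratio of standard errors into a design effect (the ratio of variances). The sample values and weights are hypothetical; only the 1.03 ratio is taken from the abstract.

```python
def weighted_mean(values, weights):
    """Mean of values where each observation counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def design_effect(se_ratio):
    """Variance inflation implied by a ratio of standard errors."""
    return se_ratio ** 2

# Hypothetical weights: the first respondent represents three times as many men.
weights_kg = [80.0, 90.0]
sample_weights = [3.0, 1.0]
unweighted = sum(weights_kg) / len(weights_kg)
weighted = weighted_mean(weights_kg, sample_weights)
deff_mean_weight = design_effect(1.03)  # SE ratio reported for mean weight
```

A design effect close to 1, as here, means clustering costs little precision for that outcome.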
Keitel, Anne; Gross, Joachim
2016-01-01
The human brain can be parcellated into diverse anatomical areas. We investigated whether rhythmic brain activity in these areas is characteristic and can be used for automatic classification. To this end, resting-state MEG data of 22 healthy adults was analysed. Power spectra of 1-s long data segments for atlas-defined brain areas were clustered into spectral profiles (“fingerprints”), using k-means and Gaussian mixture (GM) modelling. We demonstrate that individual areas can be identified from these spectral profiles with high accuracy. Our results suggest that each brain area engages in different spectral modes that are characteristic for individual areas. Clustering of brain areas according to similarity of spectral profiles reveals well-known brain networks. Furthermore, we demonstrate task-specific modulations of auditory spectral profiles during auditory processing. These findings have important implications for the classification of regional spectral activity and allow for novel approaches in neuroimaging and neurostimulation in health and disease. PMID:27355236
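Grouping power spectra into spectral profiles with k-means can be illustrated on toy data. The pure-Python sketch below treats each spectrum as a short tuple of band powers and runs a basic k-means loop. The deterministic initialisation from the first k points, the two-band "spectra" and all names are assumptions made for the example; the Gaussian mixture step from the study is not shown.

```python
import math

def kmeans(points, k, iters=50):
    """Basic k-means; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # simple deterministic start
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(a) / len(members) for a in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:
            break  # assignments have stabilised
        centroids = updated
    return labels, centroids

# Toy spectra: (relative alpha power, relative beta power) per segment.
segments = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
labels, centroids = kmeans(segments, k=2)
```

With real data the number of clusters and the initialisation would need more care than this example gives them.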
Murine mesenchymal and embryonic stem cells express a similar Hox gene profile.
Phinney, Donald G; Gray, Andrew J; Hill, Katy; Pandey, Amitabh
2005-12-30
Using degenerate oligonucleotide primers targeting the homeobox domain, we amplified by PCR and sequenced 723 clones from five murine cell populations and lines derived from embryonic mesoderm and adult bone marrow. Transcripts from all four vertebrate Hox clusters were expressed by the different populations. Hierarchical clustering of the data revealed that mesenchymal stem cells (MSCs) and the embryonic stem (ES) cell line D3 shared a similar Hox expression profile. These populations exclusively expressed Hoxb2, Hoxb5, Hoxb7, and Hoxc4, transcripts regulating self-renewal and differentiation of other stem cells. Additionally, Hoxa7 transcript quantified by real-time PCR strongly correlated (r2=0.89) with the number of Hoxa7 clones identified by sequencing, validating that data from the PCR screen reflects differences in Hox mRNA abundance between populations. This is the first study to catalogue Hox transcripts in murine MSCs and by comparative analyses identify specific Hox genes that may contribute to their stem cell character.
Chalker, Victoria J; Smith, Alyson; Al-Shahib, Ali; Botchway, Stella; Macdonald, Emily; Daniel, Roger; Phillips, Sarah; Platt, Steven; Doumith, Michel; Tewolde, Rediat; Coelho, Juliana; Jolley, Keith A; Underwood, Anthony; McCarthy, Noel D
2016-06-01
Single-strain outbreaks of Streptococcus pyogenes infections are common and often go undetected. In 2013, two clusters of invasive group A Streptococcus (iGAS) infection were identified in independent but closely located care homes in Oxfordshire, United Kingdom. Investigation included visits to each home, chart review, staff survey, microbiologic sampling, and genome sequencing. S. pyogenes emm type 1.0, the most common circulating type nationally, was identified from all cases yielding GAS isolates. A tailored whole-genome reference population comprising epidemiologically relevant contemporaneous isolates and published isolates was assembled. Data were analyzed independently using whole-genome multilocus sequencing and single-nucleotide polymorphism analyses. Six isolates from staff and residents of the homes formed a single cluster that was separated from the reference population by both analytical approaches. No further cases occurred after mass chemoprophylaxis and enhanced infection control. Our findings demonstrate the ability of 2 independent analytical approaches to enable robust conclusions from nonstandardized whole-genome analysis to support public health practice.
Geographic differentiation of domesticated einkorn wheat and possible Neolithic migration routes.
Brandolini, A; Volante, A; Heun, M
2016-09-01
To analyse the spread of domesticated einkorn into Europe, 136 landraces, 9 wild einkorns and 3 Triticum urartu were fingerprinted by the diversity array technology sequence (DArT-seq) marker technology. The obtained 3455 single-nucleotide polymorphism (SNP) markers confirmed earlier results about the separation of wild and domesticated einkorn from T. urartu and about the pinpointing of the domesticated forms to the Karacadağ Mountains (Turkey). Further analyses identified two major domesticated landrace einkorn groups, one relating to the Prealpine region and the other to the Maghreb/Iberian region. The previously published four geographical provenance groups were mostly identified in our results. The earlier reported unique position of the Maghreb/Iberia einkorns cannot be confirmed, as the three landrace clusters we identified with STRUCTURE also occur in the remaining einkorn, although at different frequencies. The results are discussed with respect to the spreading of domesticated einkorn into Western Europe and two possible Neolithic migration routes are indicated.
NASA Astrophysics Data System (ADS)
Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila
2011-02-01
Introduction: Methylprednisolone (MP) is frequently administered preoperatively to children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgery with and without MP administration. Here, in a retrospective study, we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two age-matched groups undergoing surgical correction of septum defects. The group without MP treatment consisted of 10 patients; MP was administered to 21 patients (median dose: 11 mg/kg) before cardiopulmonary bypass (CPB). EDTA-anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at the beginning and end of CPB (CPB2), 4 h, 24 h and 48 h after surgery, at discharge and at out-patient follow-up (median 8.2 months after surgery; IQR 3.3-12.2). Flow cytometry showed the biggest MP-relevant changes at CPB2 and 4 h postoperatively; these time points were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from the analysis. Cross-validation revealed several parameters able to discriminate between MP groups and to identify immune modulation. MP administration resulted in delayed activation of monocytes, an increased ratio of neutrophils and reduced T-lymphocyte counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.
Truscott, James E; Werkman, Marleen; Wright, James E; Farrell, Sam H; Sarkar, Rajiv; Ásbjörnsdóttir, Kristjana; Anderson, Roy M
2017-06-30
There is an increased focus on whether mass drug administration (MDA) programmes alone can interrupt the transmission of soil-transmitted helminths (STH). Mathematical models can be used to model these interventions and are increasingly being implemented to inform investigators about expected trial outcome and the choice of optimum study design. One key factor is the choice of threshold for detecting elimination. However, there are currently no thresholds defined for STH regarding breaking transmission. We develop a simulation of an elimination study, based on the DeWorm3 project, using an individual-based stochastic disease transmission model in conjunction with models of MDA, sampling, diagnostics and the construction of study clusters. The simulation is then used to analyse the relationship between the study end-point elimination threshold and whether elimination is achieved in the long term within the model. We analyse the quality of a range of statistics in terms of the positive predictive values (PPV) and how they depend on a range of covariates, including threshold values, baseline prevalence, measurement time point and how clusters are constructed. End-point infection prevalence performs well in discriminating between villages that achieve interruption of transmission and those that do not, although the quality of the threshold is sensitive to baseline prevalence and threshold value. Optimal post-treatment prevalence threshold value for determining elimination is in the range 2% or less when the baseline prevalence range is broad. For multiple clusters of communities, both the probability of elimination and the ability of thresholds to detect it are strongly dependent on the size of the cluster and the size distribution of the constituent communities. Number of communities in a cluster is a key indicator of probability of elimination and PPV. 
Extending the time, post-study endpoint, at which the threshold statistic is measured improves PPV value in discriminating between eliminating clusters and those that bounce back. The probability of elimination and PPV are very sensitive to baseline prevalence for individual communities. However, most studies and programmes are constructed on the basis of clusters. Since elimination occurs within smaller population sub-units, the construction of clusters introduces new sensitivities for elimination threshold values to cluster size and the underlying population structure. Study simulation offers an opportunity to investigate key sources of sensitivity for elimination studies and programme designs in advance and to tailor interventions to prevailing local or national conditions.
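The positive predictive value (PPV) of an elimination threshold is the share of simulated communities falling at or below the threshold at the study end point that go on to eliminate transmission. The Python sketch below computes it from a list of simulation outcomes. The data layout, the function name and the outcomes are hypothetical and do not come from the DeWorm3-based model.

```python
def threshold_ppv(outcomes, threshold):
    """PPV of declaring elimination when end-point prevalence <= threshold.

    outcomes: list of (end_point_prevalence, eliminated_long_term) pairs.
    Returns None when no community falls at or below the threshold.
    """
    flagged = [eliminated for prevalence, eliminated in outcomes
               if prevalence <= threshold]
    if not flagged:
        return None
    return sum(flagged) / len(flagged)

# Hypothetical simulation runs: (prevalence at end point, eliminated later?).
runs = [(0.010, True), (0.015, True), (0.020, False),
        (0.050, False), (0.005, True)]
ppv_at_2_percent = threshold_ppv(runs, 0.02)
ppv_at_1_percent = threshold_ppv(runs, 0.01)
```

Lowering the threshold typically raises PPV at the cost of flagging fewer communities, which is the trade-off the abstract explores.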
Chemodynamical Clustering Applied to APOGEE Data: Rediscovering Globular Clusters
NASA Astrophysics Data System (ADS)
Chen, Boquan; D’Onghia, Elena; Pardy, Stephen A.; Pasquali, Anna; Bertelli Motta, Clio; Hanlon, Bret; Grebel, Eva K.
2018-06-01
We have developed a novel technique based on a clustering algorithm that searches for kinematically and chemically clustered stars in the APOGEE DR12 Cannon data. As compared to classical chemical tagging, the kinematic information included in our methodology allows us to identify stars that are members of known globular clusters with greater confidence. We apply our algorithm to the entire APOGEE catalog of 150,615 stars whose chemical abundances are derived by the Cannon. Our methodology found anticorrelations between the elements Al and Mg, Na and O, and C and N previously identified in the optical spectra in globular clusters, even though we omit these elements in our algorithm. Our algorithm identifies globular clusters without a priori knowledge of their locations in the sky. Thus, not only does this technique promise to discover new globular clusters, but it also allows us to identify candidate streams of kinematically and chemically clustered stars in the Milky Way.
Zeh, Clement; Inzaule, Seth C.; Ondoa, Pascale; Nafisa, Lillian G.; Kasembeli, Alex; Otieno, Fredrick; Vandenhoudt, Hilde; Amornkul, Pauli N.; Mills, Lisa A.; Nkengasong, John N.
2016-01-01
Objective: To identify unique characteristics of recent versus established HIV infections and describe sexual transmission networks, we characterized circulating HIV-1 strains from two randomly selected populations of ART-naïve participants in rural western Kenya. Methods: Recent HIV infections were identified by the HIV-1 subtype B, E and D, immunoglobulin G capture immunoassay (IgG BED-CEIA) and BioRad avidity assays. Genotypic and phylogenetic analyses were performed on the pol gene to identify transmitted drug resistance (TDR) mutations, characterize HIV subtypes and potential transmission clusters. Factors associated with recent infection and clustering were assessed by logistic regression. Results: Of the 320 specimens, 40 (12.5%) were concordantly identified by the two assays as recent infections. Factors independently associated with being recently infected were age ≤19 years (P = 0.001) and history of sexually transmitted infections (STIs) in the past six months (P = 0.004). HIV subtype distribution differed in recently versus chronically infected participants, with subtype A observed among 53% recent vs. 68% chronic infections (p = 0.04) and subtype D among 26% recent vs. 12% chronic infections (p = 0.012). Overall, the prevalence of primary drug resistance was 1.16%. Of the 258 sequences, 11.2% were in monophyletic clusters of between 2–4 individuals. In multivariate analysis factors associated with clustering included having recent HIV infection (P = 0.043) and being from Gem region (P = 0.002). Conclusions: Recent HIV-1 infection was more frequent among 13–19 year olds compared with older age groups, underscoring the ongoing risk and susceptibility of younger persons for acquiring HIV infection. Our findings also provide evidence of sexual networks. 
The association of recent infections with clustering suggests that early infections may be contributing significant proportions of onward transmission highlighting the need for early diagnosis and treatment as prevention for ongoing prevention. Larger studies are needed to better understand the structure of these networks and subsequently implement and evaluate targeted interventions. PMID:26871567
Spearson Goulet, Jo-Annie; Tardif, Monique
2018-06-05
Very few studies have taken a specific interest in the various sexual dimensions, beyond delinquent sexual behavior, of adolescents who have engaged in sexual abuse (AESA). Those that went beyond delinquent sexual behavior have reported mixed results, suggesting they are a heterogeneous group. The current study used cluster analysis to examine the sexuality profiles of AESA, which included information on several sexual dimensions (atypical and normative fantasies and experiences, drive, body image, pornography, first masturbation, onset of sexual interest and first exposure to sex). Participants (N = 136) are adolescents who have engaged in sexual abuse involving physical contact, for which at least one parent also participated in the study. They were recruited from six specialized treatment centers and three youth centers in Quebec (Canada). Cluster analyses were performed to identify specific sexual profiles. Results suggest three clusters of AESA: 1- Discordant sexuality pertaining to adolescents who show mostly normative sexual interests, 2- Constrictive sexuality, characterizing adolescents who seem to be less invested/interested in their sexuality and 3- Overinvested sexuality for adolescents showing an exacerbated sexuality, including atypical sexual interest. Additional analyses (ANOVAs and Chi-square tests) reveal that five delinquency and offense characteristics were significantly more likely to be present in the Overinvested than the Constrictive cluster: non-sexual offenses, three or more victims, peer victims and alcohol and drug consumption. Advancing our knowledge on this topic can provide relevant data for clinicians to better target interventions. Copyright © 2018. Published by Elsevier Ltd.
Vanbinst, Kiran; Ceulemans, Eva; Peters, Lien; Ghesquière, Pol; De Smedt, Bert
2018-02-01
Although symbolic numerical magnitude processing skills are key for learning arithmetic, their developmental trajectories remain unknown. Therefore, we delineated during the first 3 years of primary education (5-8 years of age) groups with distinguishable developmental trajectories of symbolic numerical magnitude processing skills using a model-based clustering approach. Three clusters were identified and were labeled as inaccurate, accurate but slow, and accurate and fast. The clusters did not differ in age, sex, socioeconomic status, or IQ. We also tested whether these clusters differed in domain-specific (nonsymbolic magnitude processing and digit identification) and domain-general (visuospatial short-term memory, verbal working memory, and processing speed) cognitive competencies that might contribute to children's ability to (efficiently) process the numerical meaning of Arabic numerical symbols. We observed minor differences between clusters in these cognitive competencies except for verbal working memory for which no differences were observed. Follow-up analyses further revealed that the above-mentioned cognitive competencies did not merely account for the cluster differences in children's development of symbolic numerical magnitude processing skills, suggesting that other factors account for these individual differences. On the other hand, the three trajectories of symbolic numerical magnitude processing revealed remarkable and stable differences in children's arithmetic fact retrieval, which stresses the importance of symbolic numerical magnitude processing for learning arithmetic. Copyright © 2017 Elsevier Inc. All rights reserved.
Glenn, Anthony E.; Davis, C. Britton; Gao, Minglu; Gold, Scott E.; Mitchell, Trevor R.; Proctor, Robert H.; Stewart, Jane E.; Snook, Maurice E.
2016-01-01
Microbes encounter a broad spectrum of antimicrobial compounds in their environments and often possess metabolic strategies to detoxify such xenobiotics. We have previously shown that Fusarium verticillioides, a fungal pathogen of maize known for its production of fumonisin mycotoxins, possesses two unlinked loci, FDB1 and FDB2, necessary for detoxification of antimicrobial compounds produced by maize, including the γ-lactam 2-benzoxazolinone (BOA). In support of these earlier studies, microarray analysis of F. verticillioides exposed to BOA identified the induction of multiple genes at FDB1 and FDB2, indicating the loci consist of gene clusters. One of the FDB1 cluster genes encoded a protein having domain homology to the metallo-β-lactamase (MBL) superfamily. Deletion of this gene (MBL1) rendered F. verticillioides incapable of metabolizing BOA and thus unable to grow on BOA-amended media. Deletion of other FDB1 cluster genes, in particular AMD1 and DLH1, did not affect BOA degradation. Phylogenetic analyses and topology testing of the FDB1 and FDB2 cluster genes suggested two horizontal transfer events among fungi, one being transfer of FDB1 from Fusarium to Colletotrichum, and the second being transfer of the FDB2 cluster from Fusarium to Aspergillus. Together, the results suggest that plant-derived xenobiotics have exerted evolutionary pressure on these fungi, leading to horizontal transfer of genes that enhance fitness or virulence. PMID:26808652
Health-related fitness profiles in adolescents with complex congenital heart disease.
Klausen, Susanne Hwiid; Wetterslev, Jørn; Søndergaard, Lars; Andersen, Lars L; Mikkelsen, Ulla Ramer; Dideriksen, Kasper; Zoffmann, Vibeke; Moons, Philip
2015-04-01
This study investigates whether subgroups of different health-related fitness (HrF) profiles exist among girls and boys with complex congenital heart disease (ConHD) and how these are associated with lifestyle behaviors. We measured the cardiorespiratory fitness, muscle strength, and body composition of 158 adolescents aged 13-16 years with previous surgery for a complex ConHD. Data on lifestyle behaviors were collected concomitantly between October 2010 and April 2013. A cluster analysis was conducted to identify profiles with similar HrF. For comparisons between clusters, multivariate analyses of covariance were used to test the differences in lifestyle behaviors. Three distinct profiles were formed: (1) Robust (43, 27%; 20 girls and 23 boys); (2) Moderately Robust (85, 54%; 37 girls and 48 boys); and (3) Less robust (30, 19%; 9 girls and 21 boys). The participants in the Robust cluster reported leading a physically active lifestyle and participants in the Less robust cluster reported leading a sedentary lifestyle. Diagnoses were evenly distributed between clusters. The cluster analysis attributed some of the variability in cardiorespiratory fitness among adolescents with complex ConHD to lifestyle behaviors and physical activity. Profiling of HrF offers a valuable new option in the management of person-centered health promotion. Copyright © 2015 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Spatial dynamics of invasion: the geometry of introduced species.
Korniss, Gyorgy; Caraco, Thomas
2005-03-07
Many exotic species combine low probability of establishment at each introduction with rapid population growth once introduction does succeed. To analyse this phenomenon, we note that invaders often cluster spatially when rare, and consequently an introduced exotic's population dynamics should depend on locally structured interactions. Ecological theory for spatially structured invasion relies on deterministic approximations, and determinism does not address the observed uncertainty of the exotic-introduction process. We take a new approach to the population dynamics of invasion and, by extension, to the general question of invasibility in any spatial ecology. We apply the physical theory for nucleation of spatial systems to a lattice-based model of competition between plant species, a resident and an invader, and the analysis reaches conclusions that differ qualitatively from the standard ecological theories. Nucleation theory distinguishes between dynamics of single- and multi-cluster invasion. Low introduction rates and small system size produce single-cluster dynamics, where success or failure of introduction is inherently stochastic. Single-cluster invasion occurs only if the cluster reaches a critical size, typically preceded by a number of failed attempts. For this case, we identify the functional form of the probability distribution of time elapsing until invasion succeeds. Although multi-cluster invasion for sufficiently large systems exhibits spatial averaging and almost-deterministic dynamics of the global densities, an analytical approximation from nucleation theory, known as Avrami's law, describes our simulation results far better than standard ecological approximations.
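The abstract names Avrami's law but does not state its form. As a minimal sketch, the standard Kolmogorov-Johnson-Mehl-Avrami expression for the converted fraction is 1 - exp(-(t/tau)^n); the function name, the time scale `tau`, and the sample values below are illustrative assumptions and are not taken from the paper. For nucleation at a constant rate with clusters spreading at constant speed in d dimensions, the exponent in this standard form is n = d + 1, so n = 3 on a two-dimensional lattice.

```python
import math


def avrami_fraction(t, tau, exponent):
    """Converted fraction at time t under the standard Avrami (KJMA) form.

    tau is a characteristic time scale and exponent is the Avrami exponent;
    both are hypothetical parameters chosen for illustration.
    """
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / tau) ** exponent))


# Illustrative curve for a 2D lattice (exponent 3) with an assumed tau of 10.
curve = [avrami_fraction(t, tau=10.0, exponent=3) for t in range(0, 31, 5)]
for t, fraction in zip(range(0, 31, 5), curve):
    print(f"t={t:2d}  converted fraction={fraction:.4f}")
```

The curve is sigmoidal: conversion is slow while clusters are small, accelerates as they grow, and saturates as they coalesce.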
Han, Xiaolong; Chakrabortti, Alolika; Zhu, Jindong; Liang, Zhao-Xun; Li, Jinming
2016-08-15
Aspergillus westerdijkiae produces ochratoxin A (OTA) in Aspergillus section Circumdati. It is responsible for the contamination of agricultural crops, fruits, and food commodities, as its secondary metabolite OTA poses a potential threat to animals and humans. As a member of the filamentous fungi family, its capacity for enzymatic catalysis and secondary metabolite production is valuable in industrial production and medicine. To understand the genetic factors underlying its pathogenicity, enzymatic degradation, and secondary metabolism, we analysed the whole genome of A. westerdijkiae and compared it with eight other sequenced Aspergillus species. We sequenced the complete genome of A. westerdijkiae and assembled approximately 36 Mb of its genomic DNA, in which we identified 10,861 putative protein-coding genes. We constructed a phylogenetic tree of A. westerdijkiae and eight other sequenced Aspergillus species and found that the sister group of A. westerdijkiae was the A. oryzae - A. flavus clade. By searching the associated databases, we identified 716 cytochrome P450 enzymes, 633 carbohydrate-active enzymes, and 377 proteases. By combining comparative analysis with Kyoto Encyclopaedia of Genes and Genomes (KEGG), Conserved Domains Database (CDD), and Pfam annotations, we predicted 228 potential carbohydrate-active enzymes related to plant polysaccharide degradation (PPD). We found a large number of secondary biosynthetic gene clusters, which suggested that A. westerdijkiae had a remarkable capacity to produce secondary metabolites. Furthermore, we obtained two more reliable and integrated gene sequences containing the reported portions of OTA biosynthesis and identified their respective secondary metabolite clusters. We also systematically annotated these two hybrid t1pks-nrps gene clusters involved in OTA biosynthesis. 
These two clusters were separate in the genome, and one of them encoded a couple of GH3 and AA3 enzyme genes involved in sucrose and glucose metabolism. The genomic information obtained in this study is valuable for understanding the life cycle and pathogenicity of A. westerdijkiae. We identified numerous enzyme genes that are potentially involved in host invasion and pathogenicity, and we provided a preliminary prediction for each putative secondary metabolite (SM) gene cluster. In particular, for the OTA-related SM gene clusters, we delivered their components with domain and pathway annotations. This study sets the stage for experimental verification of the biosynthetic and regulatory mechanisms of OTA and for the discovery of new secondary metabolites.
Shang, Sang; Wu, Chunlai; Huang, Chao; Tie, Weiwei; Yan, Yan; Ding, Zehong; Xia, Zhiqiang; Wang, Wenquan; Peng, Ming; Tian, Libo; Hu, Wei
2018-02-20
GENERAL REGULATORY FACTOR (GRF) proteins play vital roles in the regulation of plant growth, development, and response to abiotic stress. However, little is known about this gene family in cassava (Manihot esculenta). In this study, 15 MeGRFs were identified from the cassava genome and were clustered into the ε and the non-ε groups according to phylogenetic, conserved motif, and gene structure analyses. Transcriptomic analyses showed eleven MeGRFs with constitutively high expression in stems, leaves, and storage roots of two cassava genotypes. Expression analyses revealed that the majority of GRFs showed transcriptional changes under cold, osmotic, salt, abscisic acid (ABA), and H₂O₂ treatments. Six MeGRFs were found to be commonly upregulated by abiotic stress, ABA, and H₂O₂ treatments, which may be the converging points of multiple signaling pathways. Interaction network analysis identified 18 possible interactors of MeGRFs. Taken together, this study elucidates the transcriptional control of MeGRFs in tissue development and the responses of abiotic stress and related signaling in cassava. Some constitutively expressed, tissue-specific, and abiotic stress-responsive candidate MeGRF genes were identified for the further genetic improvement of crops.
Nissen, Kathrine G; Trevino, Kelly; Lange, Theis; Prigerson, Holly G
2016-12-01
Caring for a family member with advanced cancer strains family caregivers. Classification of family types has been shown to identify patients at risk of poor psychosocial function. However, little is known about how family relationships affect caregiver psychosocial function. This study investigated family types identified by a cluster analysis and examined the reproducibility of cluster analyses. We also sought to examine the relationship between family types and caregivers' psychosocial function. Data from 622 caregivers of advanced cancer patients (part of the Coping with Cancer Study) were analyzed using Gaussian Mixture Modeling as the primary method to identify family types based on the Family Relationship Index questionnaire. We then examined the relationship between family type and caregiver quality of life (Medical Outcome Survey Short Form), social support (Interpersonal Support Evaluation List), and perceived caregiver burden (Caregiving Burden Scale). Three family types emerged: low-expressive, detached, and supportive. Analyses of variance with post hoc comparisons showed that caregivers of detached and low-expressive family types experienced lower levels of quality of life and perceived social support in comparison to supportive family types. The study identified supportive, low-expressive, and detached family types among caregivers of advanced cancer patients. The supportive family type was associated with the best outcomes and detached with the worst. These findings indicate that family function is related to psychosocial function of caregivers of advanced cancer patients. Therefore, paying attention to family support and family members' ability to share feelings and manage conflicts may serve as an important tool to improve psychosocial function in families affected by cancer. Copyright © 2016 American Academy of Hospice and Palliative Medicine. All rights reserved.
Kirsten, Holger; Al-Hasani, Hoor; Holdt, Lesca; Gross, Arnd; Beutner, Frank; Krohn, Knut; Horn, Katrin; Ahnert, Peter; Burkhardt, Ralph; Reiche, Kristin; Hackermüller, Jörg; Löffler, Markus; Teupser, Daniel; Thiery, Joachim; Scholz, Markus
2015-01-01
Genetics of gene expression (eQTLs or expression QTLs) has proved an indispensable tool for understanding biological pathways and pathomechanisms of trait-associated SNPs. However, power of most genome-wide eQTL studies is still limited. We performed a large eQTL study in peripheral blood mononuclear cells of 2112 individuals increasing the power to detect trans-effects genome-wide. Going beyond univariate SNP-transcript associations, we analyse relations of eQTLs to biological pathways, polygenetic effects of expression regulation, trans-clusters and enrichment of co-localized functional elements. We found eQTLs for about 85% of analysed genes, and 18% of genes were trans-regulated. Local eSNPs were enriched up to a distance of 5 Mb to the transcript challenging typically implemented ranges of cis-regulations. Pathway enrichment within regulated genes of GWAS-related eSNPs supported functional relevance of identified eQTLs. We demonstrate that nearest genes of GWAS-SNPs might frequently be misleading functional candidates. We identified novel trans-clusters of potential functional relevance for GWAS-SNPs of several phenotypes including obesity-related traits, HDL-cholesterol levels and haematological phenotypes. We used chromatin immunoprecipitation data for demonstrating biological effects. Yet, we show for strongly heritable transcripts that still little trans-chromosomal heritability is explained by all identified trans-eSNPs; however, our data suggest that most cis-heritability of these transcripts seems explained. Dissection of co-localized functional elements indicated a prominent role of SNPs in loci of pseudogenes and non-coding RNAs for the regulation of coding genes. In summary, our study substantially increases the catalogue of human eQTLs and improves our understanding of the complex genetic regulation of gene expression, pathways and disease-related processes. PMID:26019233
Integrated genetic and epigenetic analysis identifies three different subclasses of colon cancer
Shen, Lanlan; Toyota, Minoru; Kondo, Yutaka; Lin, E; Zhang, Li; Guo, Yi; Hernandez, Natalie Supunpong; Chen, Xinli; Ahmed, Saira; Konishi, Kazuo; Hamilton, Stanley R.; Issa, Jean-Pierre J.
2007-01-01
Colon cancer has been viewed as the result of progressive accumulation of genetic and epigenetic abnormalities. However, this view does not fully reflect the molecular heterogeneity of the disease. We have analyzed both genetic (mutations of BRAF, KRAS, and p53 and microsatellite instability) and epigenetic alterations (DNA methylation of 27 CpG island promoter regions) in 97 primary colorectal cancer patients. Two clustering analyses on the basis of either epigenetic profiling or a combination of genetic and epigenetic profiling were performed to identify subclasses with distinct molecular signatures. Unsupervised hierarchical clustering of the DNA methylation data identified three distinct groups of colon cancers named CpG island methylator phenotype (CIMP) 1, CIMP2, and CIMP negative. Genetically, these three groups correspond to very distinct profiles. CIMP1 are characterized by MSI (80%) and BRAF mutations (53%) and rare KRAS and p53 mutations (16% and 11%, respectively). CIMP2 is associated with 92% KRAS mutations and rare MSI, BRAF, or p53 mutations (0, 4, and 31% respectively). CIMP-negative cases have a high rate of p53 mutations (71%) and lower rates of MSI (12%) or mutations of BRAF (2%) or KRAS (33%). Clustering based on both genetic and epigenetic parameters also identifies three distinct (and homogeneous) groups that largely overlap with the previous classification. The three groups are independent of age, gender, or stage, but CIMP1 and 2 are more common in proximal tumors. Together, our integrated genetic and epigenetic analysis reveals that colon cancers correspond to three molecularly distinct subclasses of disease. PMID:18003927
Unusual bacterioplankton community structure in ultra-oligotrophic Crater Lake
Urbach, Ena; Vergin, Kevin L.; Morse, Ariel
2001-01-01
The bacterioplankton assemblage in Crater Lake, Oregon (U.S.A.), is different from communities found in other oxygenated lakes, as demonstrated by four small subunit ribosomal ribonucleic acid (SSU rRNA) gene clone libraries and oligonucleotide probe hybridization to RNA from lake water. Populations in the euphotic zone of this deep (589 m), oligotrophic caldera lake are dominated by two phylogenetic clusters of currently uncultivated bacteria: CL120-10, a newly identified cluster in the verrucomicrobiales, and ACK4 actinomycetes, known as a minor constituent of bacterioplankton in other lakes. Deep-water populations at 300 and 500 m are dominated by a different pair of uncultivated taxa: CL500-11, a novel cluster in the green nonsulfur bacteria, and group I marine crenarchaeota. β-Proteobacteria, dominant in most other freshwater environments, are relatively rare in Crater Lake (≤16% of nonchloroplast bacterial rRNA at all depths). Other taxa identified in Crater Lake libraries include a newly identified candidate bacterial division, ABY1, and a newly identified subcluster, CL0-1, within candidate division OP10. Probe analyses confirmed vertical stratification of several microbial groups, similar to patterns observed in open-ocean systems. Additional similarities between Crater Lake and ocean microbial populations include aphotic zone dominance of group I marine crenarchaeota and green nonsulfur bacteria. Comparison of Crater Lake to other lakes studied by rRNA methods suggests that selective factors structuring Crater Lake bacterioplankton populations may include low concentrations of available trace metals and dissolved organic matter, chemistry of infiltrating hydrothermal waters, and irradiation by high levels of ultraviolet light.
Trajectories of acute low back pain: a latent class growth analysis.
Downie, Aron S; Hancock, Mark J; Rzewuska, Magdalena; Williams, Christopher M; Lin, Chung-Wei Christine; Maher, Christopher G
2016-01-01
Characterising the clinical course of back pain by mean pain scores over time may not adequately reflect the complexity of the clinical course of acute low back pain. We analysed pain scores over 12 weeks for 1585 patients with acute low back pain presenting to primary care to identify distinct pain trajectory groups and baseline patient characteristics associated with membership of each cluster. This was a secondary analysis of the PACE trial that evaluated paracetamol for acute low back pain. Latent class growth analysis determined a 5 cluster model, which comprised 567 (35.8%) patients who recovered by week 2 (cluster 1, rapid pain recovery); 543 (34.3%) patients who recovered by week 12 (cluster 2, pain recovery by week 12); 222 (14.0%) patients whose pain reduced but did not recover (cluster 3, incomplete pain recovery); 167 (10.5%) patients whose pain initially decreased but then increased by week 12 (cluster 4, fluctuating pain); and 86 (5.4%) patients who experienced high-level pain for the whole 12 weeks (cluster 5, persistent high pain). Patients with longer pain duration were more likely to experience delayed recovery or nonrecovery. Belief in greater risk of persistence was associated with nonrecovery, but not delayed recovery. Higher pain intensity, longer duration, and workers' compensation were associated with persistent high pain, whereas older age and increased number of episodes were associated with fluctuating pain. Identification of discrete pain trajectory groups offers the potential to better manage acute low back pain.
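The cluster sizes quoted above can be checked against the sample size with a few lines of arithmetic. The sketch below uses only the counts and total reported in the abstract; the variable names are illustrative.

```python
# Cluster sizes as reported in the abstract (5-cluster latent class growth model).
cluster_sizes = {
    "rapid pain recovery": 567,
    "pain recovery by week 12": 543,
    "incomplete pain recovery": 222,
    "fluctuating pain": 167,
    "persistent high pain": 86,
}
total_patients = 1585  # sample size reported in the abstract

# Share of the sample in each cluster, rounded to one decimal place.
shares = {
    name: round(100.0 * count / total_patients, 1)
    for name, count in cluster_sizes.items()
}

for name, share in shares.items():
    print(f"{name}: {cluster_sizes[name]} patients ({share}%)")
print("sum of clusters:", sum(cluster_sizes.values()))
```

The five counts sum to 1585, and the rounded shares match the percentages given in the abstract.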
NASA Astrophysics Data System (ADS)
Chen, Siyue; Leung, Henry; Dondo, Maxwell
2014-05-01
As computer network security threats increase, many organizations implement multiple Network Intrusion Detection Systems (NIDS) to maximize the likelihood of intrusion detection and provide a comprehensive understanding of intrusion activities. However, NIDS trigger a massive number of alerts on a daily basis. This can be overwhelming for computer network security analysts since it is a slow and tedious process to manually analyse each alert produced. Thus, automated and intelligent clustering of alerts is important to reveal the structural correlation of events by grouping alerts with common features. As the nature of computer network attacks, and therefore alerts, is not known in advance, unsupervised alert clustering is a promising approach to achieve this goal. We propose a joint optimization technique for feature selection and clustering to aggregate similar alerts and to reduce the number of alerts that analysts have to handle individually. More precisely, each identified feature is assigned a binary value, which reflects the feature's saliency. This value is treated as a hidden variable and incorporated into a likelihood function for clustering. Since computing the optimal solution of the likelihood function directly is analytically intractable, we use the Expectation-Maximisation (EM) algorithm to iteratively update the hidden variable and use it to maximize the expected likelihood. Our empirical results, using a labelled Defense Advanced Research Projects Agency (DARPA) 2000 reference dataset, show that the proposed method gives better results than the EM clustering without feature selection in terms of the clustering accuracy.
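To make the E-step and M-step concrete, here is a minimal sketch of plain EM clustering for a two-component, one-dimensional Gaussian mixture. It corresponds to the baseline the abstract compares against (EM without feature selection), not to the authors' joint method: the binary feature-saliency variable is omitted, and the data, function names, and initialisation are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (toy sketch)."""
    # Crude initialisation (an assumption): start the means at the data extremes.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # keep the variance strictly positive
            weights[k] = nk / len(data)
    # Hard assignment: each point goes to its most probable component.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, variances, weights, labels


# Synthetic stand-in for one numeric alert feature: two well-separated groups.
random.seed(0)
toy = [random.gauss(0.0, 1.0) for _ in range(200)]
toy += [random.gauss(8.0, 1.0) for _ in range(200)]
means, variances, weights, labels = em_two_gaussians(toy)
print("estimated means:", [round(m, 2) for m in means])
```

In the method described above, the E-step would additionally update the hidden saliency value of each feature; that extension is not shown here.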
Pellegrini, Michael; Zoghi, Maryam; Jaberzadeh, Shapour
2018-01-12
Cluster analysis and other subgrouping techniques have risen in popularity in recent years in non-invasive brain stimulation research in the attempt to investigate the issue of inter-individual variability - the issue of why some individuals respond, as traditionally expected, to non-invasive brain stimulation protocols and others do not. Cluster analysis and subgrouping techniques have been used to categorise individuals, based on their response patterns, as responders or non-responders. There is, however, a lack of consensus and consistency on the most appropriate technique to use. This systematic review aimed to provide a systematic summary of the cluster analysis and subgrouping techniques used to date and suggest recommendations moving forward. Twenty studies were included that utilised subgrouping techniques, while seven of these additionally utilised cluster analysis techniques. The results of this systematic review appear to indicate that statistical cluster analysis techniques are effective in identifying subgroups of individuals based on response patterns to non-invasive brain stimulation. This systematic review also reports a lack of consensus amongst researchers on the most effective subgrouping technique and the criteria used to determine whether an individual is categorised as a responder or a non-responder. This systematic review provides a step-by-step guide to carrying out statistical cluster analyses and subgrouping techniques to provide a framework for analysis when developing further insights into the contributing factors of inter-individual variability in response to non-invasive brain stimulation.
Spohn, Marius; Kirchner, Norbert; Kulik, Andreas; Jochim, Angelika; Wolf, Felix; Muenzer, Patrick; Borst, Oliver; Gross, Harald; Wohlleben, Wolfgang
2014-01-01
The emergence of antibiotic-resistant pathogenic bacteria within the last decades is one reason for the urgent need for new antibacterial agents. A strategy to discover new anti-infective compounds is the evaluation of the genetic capacity of secondary metabolite producers and the activation of cryptic gene clusters (genome mining). One genus known for its potential to synthesize medically important products is Amycolatopsis. However, Amycolatopsis japonicum does not produce an antibiotic under standard laboratory conditions. In contrast to most Amycolatopsis strains, A. japonicum is genetically tractable with different methods. In order to activate a possible silent glycopeptide cluster, we introduced a gene encoding the transcriptional activator of balhimycin biosynthesis, the bbr gene from Amycolatopsis balhimycina (bbrAba), into A. japonicum. This resulted in the production of an antibiotically active compound. Following whole-genome sequencing of A. japonicum, 29 cryptic gene clusters were identified by genome mining. One of these gene clusters is a putative glycopeptide biosynthesis gene cluster. Using bioinformatic tools, ristomycin (syn. ristocetin), a type III glycopeptide, which has antibacterial activity and which is used for the diagnosis of von Willebrand disease and Bernard-Soulier syndrome, was deduced as a possible product of the gene cluster. Chemical analyses by high-performance liquid chromatography and mass spectrometry (HPLC-MS), tandem mass spectrometry (MS/MS), and nuclear magnetic resonance (NMR) spectroscopy confirmed the in silico prediction that the recombinant A. japonicum/pRM4-bbrAba synthesizes ristomycin A. PMID:25114137
Pages-Monteiro, Laurence; Marti, Romain; Commun, Carine; Alliot, Nolwenn; Bardel, Claire; Meugnier, Helene; Perouse-de-Montclos, Michele; Reix, Philippe; Durieu, Isabelle; Durupt, Stephane; Vandenesch, Francois; Freney, Jean; Cournoyer, Benoit; Doleans-Jordheim, Anne
2017-01-01
Cystic fibrosis (CF) lungs harbor a complex community of interacting microbes, including pathogens like Pseudomonas aeruginosa. Meta-taxogenomic analysis based on V5-V6 rrs PCR products of 52 P. aeruginosa-positive (Pp) and 52 P. aeruginosa-negative (Pn) pooled DNA extracts from CF sputa suggested positive associations between P. aeruginosa and Stenotrophomonas and Prevotella, but negative ones with Haemophilus, Neisseria and Burkholderia. Internal Transcribed Spacer analyses (RISA) from individual DNA extracts identified three significant genetic structures within the CF cohorts, and indicated an impact of P. aeruginosa. RISA clusters Ip and IIIp contained CF sputa with a P. aeruginosa prevalence above 93%, and of 24.2% in cluster IIp. Clusters Ip and IIIp showed lower RISA genetic diversity and richness than IIp. Highly similar cluster IIp RISA profiles were obtained from two patients harboring isolates of the same P. aeruginosa clone, suggesting convergent evolution in the structure of their microbiota. CF patients of cluster IIp had received significantly fewer antibiotics than patients of clusters Ip and IIIp but harbored the most resistant P. aeruginosa strains. Patients of cluster IIIp were older than those of Ip. The effects of P. aeruginosa on the RISA structures could not be fully dissociated from the above two confounding factors but several trends in these datasets support the conclusion of a strong influence of P. aeruginosa on the genetic structure of CF lung microbiota. PMID:28282386
University students' achievement goals and approaches to learning in mathematics.
Cano, Francisco; Berbén, A B G
2009-03-01
Achievement goals (AG) and students' approaches to learning (SAL) are two research perspectives on student motivation and learning in higher education that have until now been pursued quite independently. This study sets out: (a) to explore the relationship between the most representative variables of SAL and AG; (b) to identify subgroups (clusters) of students with multiple AG; and (c) to examine the differences between these clusters with respect to various SAL and AG characteristics. The participants were 680 male and female 1st year university students studying different subjects (e.g. mathematics, physics, economics) but all enrolled on mathematics courses (e.g. algebra, calculus). Participants completed a series of questionnaires that measured their conceptions of mathematics, approaches to learning, course experience, personal 2 x 2 AG, and perceived AG. SAL and AG variables were moderately associated and related to both the way students perceived their academic environment and the way they conceived of the nature of mathematics (i.e. the perceptual-cognitive framework). Four clusters of students with distinctive multiple AG were identified and when the differences between clusters were analysed, we were able to attribute them to various constructs including perceptual-cognitive framework, learning approaches, and academic performance. This study reveals a consistent pattern of relationships between SAL and AG perspectives across different methods of analysis, supports the relevance of the 2 x 2 AG framework in a mathematics learning context and suggests that AG and SAL may be intertwined aspects of students' experience of learning mathematics at university.
Dynamic spatiotemporal trends of dengue transmission in the Asia-Pacific region, 1955-2004.
Banu, Shahera; Hu, Wenbiao; Guo, Yuming; Naish, Suchithra; Tong, Shilu
2014-01-01
Dengue fever (DF) is one of the most important emerging arboviral human diseases. Globally, DF incidence has increased 30-fold over the last fifty years, and the geographic range of the virus and its vectors has expanded. The disease is now endemic in more than 120 countries in tropical and subtropical parts of the world. This study examines the spatiotemporal trends of DF transmission in the Asia-Pacific region over a 50-year period, and identifies the disease's cluster areas. The World Health Organization's DengueNet provided the annual number of DF cases in 16 countries in the Asia-Pacific region for the period 1955 to 2004. This fifty-year dataset was divided into five ten-year periods as the basis for the investigation of DF transmission trends. Space-time cluster analyses were conducted using scan statistics to detect the disease clusters. This study shows an increasing trend in the spatiotemporal distribution of DF in the Asia-Pacific region over the study period. Thailand, Vietnam, Laos, Singapore and Malaysia are identified as the most likely clusters (relative risk = 13.02) of DF transmission in this region in the period studied (1995 to 2004). The study also indicates that, for the most part, DF transmission has expanded southwards in the region. This information will lead to the improvement of DF prevention and control strategies in the Asia-Pacific region by prioritizing control efforts and directing them where they are most needed.
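The relative risk reported for a scan-statistic cluster can be illustrated with a short sketch. It assumes the definition commonly used with Kulldorff-style scan statistics (risk inside the candidate cluster divided by risk outside it); the abstract does not state which software or exact definition was used, and the function name and all numbers below are hypothetical, not DengueNet data.

```python
def scan_relative_risk(cases_in, expected_in, total_cases):
    """Relative risk of a candidate cluster versus the area outside it.

    cases_in     observed cases inside the cluster (c)
    expected_in  cases expected inside under the null hypothesis (E)
    total_cases  observed cases in the whole study region (C)

    Assumed definition: RR = (c / E) / ((C - c) / (C - E)).
    """
    if not 0 < expected_in < total_cases:
        raise ValueError("expected_in must lie strictly between 0 and total_cases")
    if cases_in >= total_cases:
        # No cases outside the cluster: outside risk is zero.
        return float("inf")
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside


# Hypothetical example: 50 observed cases where 10 were expected, 200 cases overall.
rr_example = scan_relative_risk(50, 10.0, 200)
print(round(rr_example, 2))
```

A value of 1 means the cluster has the same risk as the rest of the region; the scan statistic itself additionally ranks candidate windows by likelihood ratio, which this sketch does not attempt.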
Bagheri, Nasser; Wangdi, Kinley; Cherbuin, Nicolas; Anstey, Kaarin J
2018-01-01
We have a poor understanding of whether dementia clusters geographically, how this occurs, and how dementia may relate to socio-demographic factors. To shed light on these important questions, this study aimed to compute a dementia risk score for individuals to assess spatial variation of dementia risk, identify significant clusters (hotspots), and explore their association with socioeconomic status. We used clinical records from 16 general practices (468 Statistical Area Level 1s, N = 14,746) from the city of west Adelaide, Australia for the duration of 1 January 2012 to 31 December 2014. Dementia risk was estimated using The Australian National University-Alzheimer's Disease Risk Index. Hotspot analyses were applied to examine potential clusters in dementia risk at small area level. Significant hotspots were observed in eastern and southern areas while coldspots were observed in the western area within the study perimeter. Additionally, significant hotspots were observed in low socio-economic communities. We found dementia risk scores increased with age, sex (female), high cholesterol, no physical activity, living alone (widow, divorced, separated, or never married), and co-morbidities such as diabetes and depression. Similarly, smoking was associated with a lower dementia risk score. The identification of dementia risk clusters may provide insight into possible geographical variations in risk factors for dementia and quantify these risks at the community level. As such, this research may enable policy makers to tailor early prevention strategies to the correct individuals within their precise locations.
Chaillon, Antoine; Essat, Asma; Frange, Pierre; Smith, Davey M; Delaugerre, Constance; Barin, Francis; Ghosn, Jade; Pialoux, Gilles; Robineau, Olivier; Rouzioux, Christine; Goujard, Cécile; Meyer, Laurence; Chaix, Marie-Laure
2017-02-21
Characterizing HIV-1 transmission networks can be important in understanding the evolutionary patterns and geospatial spread of the epidemic. We reconstructed the broad molecular epidemiology of HIV from individuals with primary HIV-1 infection (PHI) enrolled in France in the ANRS PRIMO C06 cohort over 15 years. Sociodemographic, geographic, clinical, biological and pol sequence data from 1356 patients were collected between 1999 and 2014. Network analysis was performed to infer genetic relationships, i.e. clusters of transmission, between HIV-1 sequences. Bayesian coalescent-based methods were used to examine the temporal and spatial dynamics of identified clusters from different regions in France. We also evaluated the use of network information to target prevention efforts. Participants were mostly Caucasian (85.9%) and men (86.7%) who reported sex with men (MSM, 71.4%). Overall, 387 individuals (28.5%) were involved in clusters: 156 patients (11.5%) in 78 dyads and 231 participants (17%) in 42 larger clusters (median size: 4, range 3-41). Compared to individuals with single PHI (n = 969), those in clusters were more frequently men (95.9 vs 83%, p < 0.01), MSM (85.8 vs 65.6%, p < 0.01) and infected with CRF02_AG (20.4 vs 13.4%, p < 0.01). Reconstruction of viral migrations across time suggests that the Paris area was the major hub of dissemination of both subtype B and CRF02_AG epidemics. By targeting clustering individuals belonging to the identified active transmission network before 2010, 60 of the 143 onward transmissions could have been prevented. These analyses support the hypothesis of a recent and rapid rise of CRF02_AG within the French HIV-1 epidemic among MSM. Combined with a short turnaround time for sample processing, targeting prevention efforts based on phylogenetic monitoring may be an efficient way to deliver prevention interventions but would require near real time targeted interventions on the identified index cases and their partners.
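One common way to turn pairwise genetic distances into putative transmission clusters is to link sequences whose distance falls below a cutoff and take the connected components of the resulting network. The sketch below illustrates that general idea only; the abstract does not give the distance measure or threshold used in the study, so the function name, the threshold of 0.015 and the distances are all hypothetical.

```python
def transmission_clusters(ids, distances, threshold):
    """Group sequence ids into connected components of a distance network.

    ids        iterable of sequence identifiers
    distances  dict mapping (id_a, id_b) pairs to a pairwise genetic distance
    threshold  pairs at or below this distance are linked (hypothetical cutoff)
    Returns clusters of size >= 2, largest first; singletons are dropped.
    """
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    clusters = [sorted(g) for g in groups.values() if len(g) > 1]
    return sorted(clusters, key=lambda g: (-len(g), g))


# Hypothetical toy data: six sequences, five measured pairs.
ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
distances = {
    ("s1", "s2"): 0.010,
    ("s2", "s3"): 0.012,
    ("s1", "s3"): 0.020,  # above the cutoff, but s1 and s3 still connect via s2
    ("s4", "s5"): 0.005,
    ("s5", "s6"): 0.040,
}
clusters = transmission_clusters(ids, distances, threshold=0.015)
print(clusters)
```

In this toy network, s1, s2 and s3 form one larger cluster, s4 and s5 form a dyad, and s6 remains unclustered, mirroring the dyad / larger-cluster / single-infection breakdown reported in the abstract.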
Business and Marketing Cluster. Task Analyses.
ERIC Educational Resources Information Center
Henrico County Public Schools, Glen Allen, VA. Virginia Vocational Curriculum and Resource Center.
Developed in Virginia, this publication contains task analysis guides to support selected tech prep programs that prepare students for careers in the business and marketing cluster. Guides are included for accounting systems, legal systems administration, office systems technology, and retail marketing. Each task analyses guide has the following…
Clinical interpretation of the Spinal Cord Injury Functional Index (SCI-FI)
Fyffe, Denise; Kalpakjian, Claire Z.; Slavin, Mary; Kisala, Pamela; Ni, Pengsheng; Kirshblum, Steven C.; Tulsky, David S.; Jette, Alan M.
2016-01-01
Objective: To provide validation of functional ability levels for the Spinal Cord Injury – Functional Index (SCI-FI). Design: Cross-sectional. Setting: Inpatient rehabilitation hospital and community settings. Participants: A sample of 855 individuals with traumatic spinal cord injury enrolled in 6 rehabilitation centers participating in the National Spinal Cord Injury Model Systems Network. Interventions: Not Applicable. Main Outcome Measures: Spinal Cord Injury-Functional Index (SCI-FI). Results: Cluster analyses identified three distinct groups that represent low, mid-range and high SCI-FI functional ability levels. Comparison of clusters on personal and other injury characteristics suggested some significant differences between groups. Conclusions: These results strongly support the use of SCI-FI functional ability levels to document the perceived functional abilities of persons with SCI. Results of the cluster analysis suggest that the SCI-FI functional ability levels capture function by injury characteristics. Clinical implications regarding tracking functional activity trajectories during follow-up visits are discussed. PMID:26781769
Structural changes in white matter are uniquely related to children’s reading development
Myers, Chelsea A.; Vandermosten, Maaike; Farris, Emily A.; Hancock, Roeland; Gimenez, Paul; Black, Jessica M.; Casto, Brandi; Drahos, Miroslav; Tumber, Mandeep; Hendren, Robert L.; Hulme, Charles; Hoeft, Fumiko
2014-01-01
This study examined whether variations in brain development between kindergarten and Grade 3 predicted individual differences in reading ability at the latter time point. Structural MRI measurements indicated that increases in volume of two left temporo-parietal white matter clusters are unique predictors of reading outcome at Grade 3. Using diffusion MRI, the larger of these two clusters was identified as a location where fibers of the long segment of arcuate fasciculus and superior corona radiata intersect, and the smaller cluster as the posterior arcuate fasciculus. Bias-free regression analyses using regions-of-interest from prior literature revealed white matter volume changes in temporo-parietal white matter, together with preliteracy measures, predicted 56% of the variance in reading outcomes. Our findings demonstrate the important contribution of developmental differences in areas of left dorsal white matter, often implicated in phonological processing, as a sensitive early biomarker for later reading abilities, and by extension, reading difficulties. PMID:25212581
Single-particle cryo-EM-Improved ab initio 3D reconstruction with SIMPLE/PRIME.
Reboul, Cyril F; Eager, Michael; Elmlund, Dominika; Elmlund, Hans
2018-01-01
Cryogenic electron microscopy (cryo-EM) and single-particle analysis now enable the determination of high-resolution structures of macromolecular assemblies that have resisted X-ray crystallography and other approaches. We developed the SIMPLE open-source image-processing suite for analysing cryo-EM images of single-particles. A core component of SIMPLE is the probabilistic PRIME algorithm for identifying clusters of images in 2D and determining relative orientations of single-particle projections in 3D. Here, we extend our previous work on PRIME and introduce new stochastic optimization algorithms that improve the robustness of the approach. Our refined method for identification of homogeneous subsets of images in accurate register substantially improves the resolution of the cluster centers and of the ab initio 3D reconstructions derived from them. We now obtain maps with a resolution better than 10 Å by exclusively processing cluster centers. Excellent parallel code performance on over-the-counter laptops and CPU workstations is demonstrated. © 2017 The Protein Society.
WAIS-III index score profiles in the Canadian standardization sample.
Lange, Rael T
2007-01-01
Representative index score profiles were examined in the Canadian standardization sample of the Wechsler Adult Intelligence Scale-Third Edition (WAIS-III). The identification of profile patterns was based on the methodology proposed by Lange, Iverson, Senior, and Chelune (2002) that aims to maximize the influence of profile shape and minimize the influence of profile magnitude on the cluster solution. A two-step cluster analysis procedure was used (i.e., hierarchical and k-means analyses). Cluster analysis of the four index scores (i.e., Verbal Comprehension [VCI], Perceptual Organization [POI], Working Memory [WMI], Processing Speed [PSI]) identified six profiles in this sample. Profiles were differentiated by pattern of performance and were primarily characterized as (a) high VCI/POI, low WMI/PSI, (b) low VCI/POI, high WMI/PSI, (c) high PSI, (d) low PSI, (e) high VCI/WMI, low POI/PSI, and (f) low VCI, high POI. These profiles are potentially useful for determining whether a patient's WAIS-III performance is unusual in a normal population.
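The idea of clustering on profile shape rather than profile magnitude can be sketched by centering each person's index scores on their own mean before clustering. This is a minimal illustration, not the Lange, Iverson, Senior, and Chelune (2002) procedure: the starting centers are simply supplied by the caller where the study used a hierarchical step, and the scores and function names are hypothetical.

```python
from statistics import mean


def center_profile(scores):
    """Subtract the profile's own mean so that only its shape remains."""
    m = mean(scores)
    return [s - m for s in scores]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd's k-means starting from caller-supplied centers."""
    centers = [list(c) for c in initial_centers]  # copy; do not mutate the input
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = [mean(col) for col in zip(*members)]
    return labels, centers


# Hypothetical index scores in the order VCI, POI, WMI, PSI.
profiles = [
    [110, 108, 95, 94],    # high VCI/POI, low WMI/PSI
    [125, 124, 110, 109],  # same shape at a higher overall level
    [92, 90, 105, 106],    # low VCI/POI, high WMI/PSI
    [80, 79, 94, 95],      # same shape at a lower overall level
]
shapes = [center_profile(p) for p in profiles]
labels, centers = kmeans(shapes, initial_centers=[shapes[0], shapes[2]])
print(labels)
```

After centering, the first two profiles fall in one cluster and the last two in another even though their overall score levels differ widely, which is the behaviour a shape-focused solution is meant to show.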
Lochner, Christine; Hemmings, Sian M J; Kinnear, Craig J; Niehaus, Dana J H; Nel, Daniel G; Corfield, Valerie A; Moolman-Smook, Johanna C; Seedat, Soraya; Stein, Dan J
2005-01-01
Comorbidity of certain obsessive-compulsive spectrum disorders (OCSDs; such as Tourette's disorder) in obsessive-compulsive disorder (OCD) may serve to define important OCD subtypes characterized by differing phenomenology and neurobiological mechanisms. Comorbidity of the putative OCSDs in OCD has, however, not often been systematically investigated. The Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Axis I Disorders-Patient Version as well as a Structured Clinical Interview for Putative OCSDs (SCID-OCSD) were administered to 210 adult patients with OCD (N = 210, 102 men and 108 women; mean age, 35.7 ± 13.3). A subset of Caucasian subjects (with OCD, n = 171; control subjects, n = 168), including subjects from the genetically homogeneous Afrikaner population (with OCD, n = 77; control subjects, n = 144), was genotyped for polymorphisms in genes involved in monoamine function. Because the items of the SCID-OCSD are binary (present/absent), a cluster analysis (Ward's method) using the items of SCID-OCSD was conducted. The association of identified clusters with demographic variables (age, gender), clinical variables (age of onset, obsessive-compulsive symptom severity and dimensions, level of insight, temperament/character, treatment response), and monoaminergic genotypes was examined. Cluster analysis of the OCSDs in our sample of patients with OCD identified 3 separate clusters at a 1.1 linkage distance level. The 3 clusters were named as follows: (1) "reward deficiency" (including trichotillomania, Tourette's disorder, pathological gambling, and hypersexual disorder), (2) "impulsivity" (including compulsive shopping, kleptomania, eating disorders, self-injury, and intermittent explosive disorder), and (3) "somatic" (including body dysmorphic disorder and hypochondriasis). Several significant associations were found between cluster scores and other variables; for example, cluster I scores were associated with earlier age of onset of OCD and the presence of tics, cluster II scores were associated with female gender and childhood emotional abuse, and cluster III scores were associated with less insight and with somatic obsessions and compulsions. However, none of these clusters were associated with any particular genetic variant. Analysis of comorbid OCSDs in OCD suggested that these lie on a number of different dimensions. These dimensions are partially consistent with previous theoretical approaches taken toward classifying OCD spectrum disorders. The lack of genetic validation of these clusters in the present study may indicate the involvement of other, as yet untested, genes. Further genetic and cluster analyses of comorbid OCSDs in OCD may ultimately contribute to a better delineation of OCD endophenotypes.
Van Cann, Joannes; Virgilio, Massimiliano; Jordaens, Kurt; De Meyer, Marc
2015-01-01
Previous attempts to resolve the Ceratitis FAR complex (Ceratitis fasciventris, Ceratitis anonae, Ceratitis rosa, Diptera, Tephritidae) showed contrasting results and revealed the occurrence of five microsatellite genotypic clusters (A, F1, F2, R1, R2). In this paper we explore the potential of wing morphometrics for the diagnosis of FAR morphospecies and genotypic clusters. We considered a set of 227 specimens previously morphologically identified and genotyped at 16 microsatellite loci. Seventeen wing landmarks and 6 wing band areas were used for morphometric analyses. Permutational multivariate analysis of variance detected significant differences both across morphospecies and genotypic clusters (for both males and females). Unconstrained and constrained ordinations did not properly resolve groups corresponding to morphospecies or genotypic clusters. However, posterior group membership probabilities (PGMPs) of the Discriminant Analysis of Principal Components (DAPC) allowed the consistent identification of a relevant proportion of specimens (but with performances differing across morphospecies and genotypic clusters). This study suggests that wing morphometrics and PGMPs might represent a possible tool for the diagnosis of species within the FAR complex. Here, we propose a tentative diagnostic method and provide a first reference library of morphometric measures that might be used for the identification of additional and unidentified FAR specimens.
Jimenez-Infante, Francy; Ngugi, David Kamanda; Vinu, Manikandan; Alam, Intikhab; Kamau, Allan Anthony; Blom, Jochen; Bajic, Vladimir B.
2015-01-01
The OM43 clade within the family Methylophilaceae of Betaproteobacteria represents a group of methylotrophs that play important roles in the metabolism of C1 compounds in marine environments and other aquatic environments around the globe. Using dilution-to-extinction cultivation techniques, we successfully isolated a novel species of this clade (here designated MBRS-H7) from the ultraoligotrophic open ocean waters of the central Red Sea. Phylogenomic analyses indicate that MBRS-H7 is a novel species that forms a distinct cluster together with isolate KB13 from Hawaii (Hawaii-Red Sea [H-RS] cluster) that is separate from the cluster represented by strain HTCC2181 (from the Oregon coast). Phylogenetic analyses using the robust 16S-23S internal transcribed spacer revealed a potential ecotype separation of the marine OM43 clade members, which was further confirmed by metagenomic fragment recruitment analyses that showed trends of higher abundance in low-chlorophyll and/or high-temperature provinces for the H-RS cluster but a preference for colder, highly productive waters for the HTCC2181 cluster. This potential environmentally driven niche differentiation is also reflected in the metabolic gene inventories, which in the case of the H-RS cluster include those conferring resistance to high levels of UV irradiation, temperature, and salinity. Interestingly, we also found different energy conservation modules between these OM43 subclades, namely, the existence of the NADH:quinone oxidoreductase complex I (NUO) system in the H-RS cluster and the nonhomologous NADH:quinone oxidoreductase (NQR) system in the HTCC2181 cluster, which might have implications for their overall energetic yields. PMID:26655752
Poole, William; Leinonen, Kalle; Shmulevich, Ilya
2017-01-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady
2017-02-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.
NASA Astrophysics Data System (ADS)
Jose, Jessy; Pandey, A. K.; Ogura, K.; Samal, M. R.; Ojha, D. K.; Bhatt, B. C.; Chauhan, N.; Eswaraiah, C.; Mito, H.; Kobayashi, N.; Yadav, R. K.
2012-08-01
We present the analyses of the stellar contents associated with the extended H II region Sh2-252 using deep optical UBVRI photometry, slit and slitless spectroscopy along with the near-infrared (NIR) data from Two-Micron All-Sky Survey (2MASS) for an area ~1 × 1 deg². We have studied the sub-regions of Sh2-252, which includes four compact-H II (CH II) regions, namely A, B, C and E, and two clusters, NGC 2175s and Teutsch 136 (Teu 136). Of the 15 spectroscopically observed bright stars, eight have been identified as massive members of spectral class earlier than B3. From the spectrophotometric analyses, we derived the average distance of the region as 2.4 ± 0.2 kpc, and the reddening E(B - V) of the massive members is found to vary between 0.35 and 2.1 mag. We found that NGC 2175s and Teu 136, located towards the eastern edge of the complex, are the sub-clusters of Sh2-252. The stellar surface density distribution in K band shows clustering associated with the regions A, C, E, NGC 2175s and Teu 136. We have also identified the candidate ionizing sources of the CH II regions. 61 Hα emission sources are identified using slitless spectroscopy. The distribution of the Hα emission sources and candidate young stellar objects (YSOs) with IR excess on the V/(V - I) colour-magnitude diagram (CMD) shows that a majority of them have approximate ages between 0.1 and 5 Myr and masses in the range of 0.3-2.5 M⊙. The optical CMDs of the candidate pre-main-sequence (PMS) sources in the individual regions also show an age spread of 0.1-5 Myr for each of them. We calculated the K-band luminosity functions (KLFs) for the sub-regions A, C, E, NGC 2175s and Teu 136. Within errors, the KLFs for all the sub-regions are found to be similar and comparable to that of young clusters of age <5 Myr. We also estimated the mass function of the PMS sample of the individual regions in the mass range of 0.3-2.5 M⊙. In general, the slopes of the MFs of all the sub-regions are found comparable to the Salpeter value.
Comparative Microbial Modules Resource: Generation and Visualization of Multi-species Biclusters
Bate, Ashley; Eichenberger, Patrick; Bonneau, Richard
2011-01-01
The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize comparative genomics data analyses results. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput datatypes from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures – results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation. PMID:22144874
Comparative microbial modules resource: generation and visualization of multi-species biclusters.
Kacmarczyk, Thadeous; Waltman, Peter; Bate, Ashley; Eichenberger, Patrick; Bonneau, Richard
2011-12-01
The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize comparative genomics data analyses results. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput datatypes from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures - results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation. © 2011 Kacmarczyk et al.
Tanaka-Tsuno, Fumiko; Mizukami-Murata, Satomi; Murata, Yoshinori; Nakamura, Toshihide; Ando, Akira; Takagi, Hiroshi; Shima, Jun
2007-10-01
In the modern baking industry, high-sucrose-tolerant (HS) and maltose-utilizing (LS) yeast were developed using breeding techniques and are now used commercially. Sugar utilization and high-sucrose tolerance differ significantly between HS and LS yeasts. We analysed the gene expression profiles of HS and LS yeasts under different sucrose conditions in order to determine their basic physiology. Two-way hierarchical clustering was performed to obtain the overall patterns of gene expression. The clustering clearly showed that the gene expression patterns of LS yeast differed from those of HS yeast. Quality threshold clustering was used to identify the gene clusters containing upregulated genes (cluster 1) and downregulated genes (cluster 2) under high-sucrose conditions. Clusters 1 and 2 contained numerous genes involved in carbon and nitrogen metabolism, respectively. The expression level of the genes involved in the metabolism of glycerol and trehalose, which are known to be osmoprotectants, in LS yeast was higher than that in HS yeast under sucrose concentrations of 5-40%. No clear correlation was found between the expression level of the genes involved in the biosynthesis of the osmoprotectants and the intracellular contents of the osmoprotectants. The present gene expression data were compared with data previously reported in a comprehensive analysis of a gene deletion strain collection. Welch's t-test for this comparison showed that the relative growth rates of the deletion strains whose deletion occurred in genes belonging to cluster 1 were significantly higher than the average growth rates of all deletion strains. Copyright 2007 John Wiley & Sons, Ltd.
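Welch's t-test compares two group means without assuming equal variances. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom using the Python standard library; converting them to a p-value needs a t-distribution CDF, which the standard library does not provide. The growth-rate values and variable names are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Each sample needs at least two values and a non-zero variance in at
    least one group; equal variances are NOT assumed.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical relative growth rates: deletion strains for cluster 1 genes
# versus a reference set of deletion strains.
cluster1_rates = [1.20, 1.10, 1.30, 1.25]
reference_rates = [1.00, 0.90, 1.05, 0.95, 1.00]

t_stat, df = welch_t(cluster1_rates, reference_rates)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

A positive t here means the first group has the higher mean; the degrees of freedom always lie between the smaller group's n - 1 and the pooled n1 + n2 - 2.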
Replicability and 40-Year Predictive Power of Childhood ARC Types
Chapman, Benjamin P.; Goldberg, Lewis R.
2011-01-01
We examined three questions surrounding the Undercontrolled, Overcontrolled, and Resilient--or Asendorpf-Robins-Caspi (ARC)--personality types originally identified by Block (1971). In analyses of the teacher personality assessments of over 2,000 children in 1st through 6th grade in 1959-1967, and follow-up data on general and cardiovascular health outcomes in over 1,100 adults recontacted 40 years later, we found: (1) Bootstrapped internal replication clustering suggested that Big Five scores were best characterized by a tripartite cluster structure corresponding to the ARC types; (2) this cluster structure was fuzzy, rather than discrete, indicating that ARC constructs are best represented as gradients of similarity to three prototype Big Five profiles; and (3) ARC types and degrees of ARC prototypicality showed associations with multiple health outcomes 40 years later. ARC constructs were more parsimonious, but neither better nor more consistent predictors than the dimensional Big Five traits. Forty-year incident cases of heart disease could be correctly identified with 68% accuracy by personality information alone, a figure approaching the 12-year accuracy of a leading medical cardiovascular risk model. Findings support the theoretical validity of ARC constructs, their treatment as continua of prototypicality rather than discrete categories, and the need for further understanding the robust predictive power of childhood personality traits for mid-life health. PMID:21744975
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence
Nepal, Madhav P; Benson, Benjamin V
2015-01-01
Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which could have implications for future soybean crop improvement. PMID:25922568
Dawson, Anna P; Cargo, Margaret; Stewart, Harold; Chong, Alwin; Daniel, Mark
2013-02-01
Aboriginal Australians, including Aboriginal Health Workers (AHWs), smoke at rates double the non-Aboriginal population. This study utilized concept mapping methodology to identify and prioritize culturally relevant strategies to promote smoking cessation in AHWs. Stakeholder participants included AHWs, other health service employees and tobacco control personnel. Smoking cessation strategies (n = 74) were brainstormed using 34 interviews, 3 focus groups and a stakeholder workshop. Stakeholders sorted strategies into meaningful groups and rated them on perceived importance and feasibility. A concept map was developed using multi-dimensional scaling and hierarchical cluster analyses. Ten unique clusters of smoking cessation strategies were depicted that targeted individuals, family and peers, community, workplace and public policy. Smoking cessation resources and services were represented in addition to broader strategies addressing social and environmental stressors that perpetuate smoking and make quitting difficult. The perceived importance and feasibility of clusters were rated differently by participants working in health services that were government-coordinated compared with community-controlled. For health service workers within vulnerable populations, these findings clearly implicate a need for contextualized strategies that mitigate social and environmental stressors in addition to conventional strategies for tobacco control. The concept map is being applied in knowledge translation to guide development of smoking cessation programs for AHWs.
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence.
Nepal, Madhav P; Benson, Benjamin V
2015-01-01
Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease-resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for soybean crop improvement in the future.
Sun, Keping; Kimball, Rebecca T.; Liu, Tong; Wei, Xuewen; Jin, Longru; Jiang, Tinglei; Lin, Aiqing; Feng, Jiang
2016-01-01
Palaeoclimatic oscillations and different landscapes frequently result in complex population-level structure or the evolution of cryptic species. Elucidating the potential mechanisms is vital to understanding speciation events. However, such complex evolutionary patterns have rarely been reported in bats. In China, the Rhinolophus macrotis complex contains a large form and a small form, suggesting the existence of a cryptic bat species. Our field surveys found these two sibling species have a continuous and widespread distribution with partial sympatry. However, their evolutionary history has received little attention. Here, we used extensive sampling, morphological and acoustic data, as well as different genetic markers to investigate their evolutionary history. Genetic analyses revealed discordance between the mitochondrial and nuclear data. Mitochondrial data identified three reciprocally monophyletic lineages: one representing all small forms from Southwest China, and the other two containing all large forms from Central and Southeast China, respectively. The large form showed paraphyly with respect to the small form. However, clustering analyses of microsatellite and Chd1 gene sequences support two divergent clusters separating the large form and the small form. Moreover, morphological and acoustic analyses were consistent with nuclear data. This unusual pattern in the R. macrotis complex might be accounted for by palaeoclimatic oscillations, shared ancestral polymorphism and/or interspecific hybridization. PMID:27748429
Michetti, Davide; Brandsdal, Bjørn Olav; Bon, Davide; Isaksen, Geir Villy; Tiberti, Matteo; Papaleo, Elena
2017-01-01
The psychrophilic and mesophilic endonucleases A (EndA) from Aliivibrio salmonicida (VsEndA) and Vibrio cholerae (VcEndA) have been studied experimentally in terms of the biophysical properties related to thermal adaptation. The analysis of their static X-ray structures was not sufficient to rationalize the determinants of their adaptive traits at the molecular level. Thus, we used Molecular Dynamics (MD) simulations to compare the two proteins and unveil their structural and dynamical differences. Our simulations did not show a substantial increase in flexibility in the cold-adapted variant on the nanosecond time scale. The only exception is a more rigid C-terminal region in VcEndA, which is ascribable to a cluster of electrostatic interactions and hydrogen bonds, as also supported by MD simulations of the VsEndA mutant variant where the cluster of interactions was introduced. Moreover, we identified three additional amino acid substitutions through multiple sequence alignment and the analyses of MD-based protein structure networks. In particular, T120V occurs in the proximity of the catalytic residue H80 and alters the interaction with the residue Y43, which belongs to the second coordination sphere of the Mg2+ ion. This makes T120V an amenable candidate for future experimental mutagenesis.
Tuttolomondo, Teresa; La Bella, Salvatore; Licata, Mario; Virga, Giuseppe; Leto, Claudio; Saija, Antonella; Trombetta, Domenico; Tomaino, Antonio; Speciale, Antonio; Napoli, Edoardo M; Siracusa, Laura; Pasquale, Andrea; Curcuruto, Giusy; Ruberto, Giuseppe
2013-03-01
An extensive survey of wild Sicilian oregano was made. A total of 57 samples were collected from various sites, followed by taxonomic characterization from an agronomic perspective. Based on morphological and production characteristics obtained from the 57 samples, cluster analysis was used to divide the samples into homogeneous groups, to identify the best biotypes. All samples were analyzed for their phytochemical content, applying a cascade-extraction protocol and hydrodistillation, to obtain the nonvolatile components and the essential oils, respectively. The extracts contained thirteen polyphenol derivatives, i.e., four flavanones, seven flavones, and two organic acids. Their qualitative and quantitative characterization was carried out by LC/MS analyses. The essential oils were characterized using a combination of GC-FID and GC/MS analyses; a total of 81 components were identified. The major components of the oils were thymol, p-cymene, and γ-terpinene. Cluster analysis was carried out on both phytochemical profiles and resulted in the division of the oregano samples into different chemical groups. The antioxidant activity of the essential oils and extracts was investigated by the Folin-Ciocalteu (FC) colorimetric assay, by UV radiation-induced peroxidation in liposomal membranes (UV-IP test), and by determining the superoxide radical anion (O2•−) scavenging activity. Copyright © 2013 Verlag Helvetica Chimica Acta AG, Zürich.
Motor and Executive Function Profiles in Adult Residents ...
Objective: Exposure to elevated levels of manganese (Mn) may be associated with tremor, motor and executive dysfunction (EF), clinically resembling Parkinson’s disease (PD). PD research has identified tremor-dominant (TD) and non-tremor dominant (NTD) profiles. NTD PD presents with bradykinesia, rigidity, and postural sway, and is associated with EF impairment with lower quality of life (QoL). Presence and impact of tremor, motor, and executive dysfunction profiles on health-related QoL and life satisfaction were examined in air-Mn exposed residents of two Ohio, USA towns. Participants and Methods: From two Ohio towns exposed to air-Mn, 186 residents (76 males) aged 30-75 years were administered measures of EF (Animal Naming, ACT, Rey-O Copy, Stroop Color-Word, and Trails B), motor and tremor symptoms (UPDRS), QoL (BRFSS), life satisfaction (SWLS), and positive symptom distress (SCL-90-R). Air-Mn exposure in the two towns was modeled with 10 years of air-monitoring data. Cluster analyses detected the presence of symptom profiles by grouping together residents with similar scores on these measures. Results: Overall, mean air-Mn concentration for the two towns was 0.53 µg/m3 (SD=.92). Two-step cluster analyses identified TD and NTD symptom profiles. Residents in the NTD group lacked EF impairment; EF impairment represented a separate profile. An unimpaired group also emerged. The NTD and EF impairment groups were qualitatively similar, with relatively lo
Hajdari, Avni; Mustafa, Behxhet; Nebija, Dashnor; Miftari, Elheme; Quave, Cassandra L; Novak, Johannes
2015-11-01
Ripe cones of Juniperus communis L. (Cupressaceae) were collected from five wild populations in Kosovo, with the aim of investigating the chemical composition and natural variation of essential oils between and within wild populations. Ripe cones were collected, air dried, crushed, and the essential oils obtained by hydrodistillation. The essential-oil constituents were identified by GC-FID and GC/MS analyses. The yield of essential oil differed depending on the population origins and ranged from 0.4 to 3.8% (v/w, based on the dry weight). In total, 42 compounds were identified in the essential oils of all populations. The principal components of the cone-essential oils were α-pinene, followed by β-myrcene, sabinene, and D-limonene. Taking into consideration the yield and chemical composition, the essential oil originating from various collection sites in Kosovo fulfilled the minimum requirements for J. communis essential oils of the European Pharmacopoeia. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were used to determine the influence of the geographical variations on the essential-oil composition. These statistical analyses suggested that the clustering of populations was not related to their geographic location, but rather appeared to be linked to local selective forces acting on the chemotype diversity. Copyright © 2015 Verlag Helvetica Chimica Acta AG, Zürich.
Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John
2015-01-01
Objective We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700
Caesar, Lindsay K; Kvalheim, Olav M; Cech, Nadja B
2018-08-27
Mass spectral data sets often contain experimental artefacts, and data filtering prior to statistical analysis is crucial to extract reliable information. This is particularly true in untargeted metabolomics analyses, where the analyte(s) of interest are not known a priori. It is often assumed that chemical interferents (i.e. solvent contaminants such as plasticizers) are consistent across samples, and can be removed by background subtraction from blank injections. On the contrary, it is shown here that chemical contaminants may vary in abundance across each injection, potentially leading to their misidentification as relevant sample components. With this metabolomics study, we demonstrate the effectiveness of hierarchical cluster analysis (HCA) of replicate injections (technical replicates) as a methodology to identify chemical interferents and reduce their contaminating contribution to metabolomics models. Pools of metabolites with varying complexity were prepared from the botanical Angelica keiskei Koidzumi and spiked with known metabolites. Each set of pools was analyzed in triplicate and at multiple concentrations using ultraperformance liquid chromatography coupled to mass spectrometry (UPLC-MS). Before filtering, HCA failed to cluster replicates in the data sets. To identify contaminant peaks, we developed a filtering process that evaluated the relative peak area variance of each variable within triplicate injections. These interferent peaks were found across all samples, but did not show consistent peak area from injection to injection, even when evaluating the same chemical sample. This filtering process identified 128 ions that appear to originate from the UPLC-MS system. Data sets collected for a high number of pools with comparatively simple chemical composition were highly influenced by these chemical interferents, as were samples that were analyzed at a low concentration. 
When chemical interferent masses were removed, technical replicates clustered in all data sets. This work highlights the importance of technical replication in mass spectrometry-based studies, and presents a new application of HCA as a tool for evaluating the effectiveness of data filtering prior to statistical analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
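The replicate-variance filter described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the function names, and the 0.5 cutoff on relative standard deviation are all assumptions chosen for illustration. The idea is only that an ion whose peak area is inconsistent across technical replicates of the same sample is flagged as a likely interferent.

```python
from statistics import mean, stdev


def relative_std(areas):
    """Relative standard deviation (stdev / mean) of replicate peak areas."""
    m = mean(areas)
    if m == 0:
        return 0.0
    return stdev(areas) / m


def flag_interferents(peak_table, threshold=0.5):
    """Return ions whose replicate areas vary by more than `threshold`.

    peak_table maps ion -> {sample -> [peak areas of replicate injections]}.
    The threshold is a hypothetical cutoff, not a value from the study.
    """
    flagged = []
    for ion, samples in peak_table.items():
        # An ion is suspicious if any sample shows inconsistent replicates.
        if any(relative_std(areas) > threshold for areas in samples.values()):
            flagged.append(ion)
    return flagged


# Toy data: the first ion is reproducible, the second is erratic.
peak_table = {
    "m/z 301.1": {"pool_A": [1000.0, 1020.0, 990.0],
                  "pool_B": [500.0, 510.0, 495.0]},
    "m/z 149.0": {"pool_A": [200.0, 900.0, 50.0],
                  "pool_B": [30.0, 700.0, 400.0]},
}
print(flag_interferents(peak_table))
```

Removing the flagged ions before clustering is what would allow technical replicates to group together; the cutoff would need tuning against real triplicate data.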
Mapping patient safety: a large-scale literature review using bibliometric visualisation techniques
Rodrigues, S P; van Eck, N J; Waltman, L; Jansen, F W
2014-01-01
Background The amount of scientific literature available is often overwhelming, making it difficult for researchers to have a good overview of the literature and to see relations between different developments. Visualisation techniques based on bibliometric data are helpful in obtaining an overview of the literature on complex research topics, and have been applied here to the topic of patient safety (PS). Methods On the basis of title words and citation relations, publications in the period 2000–2010 related to PS were identified in the Scopus bibliographic database. A visualisation of the most frequently cited PS publications was produced based on direct and indirect citation relations between publications. Terms were extracted from titles and abstracts of the publications, and a visualisation of the most important terms was created. The main PS-related topics studied in the literature were identified using a technique for clustering publications and terms. Results A total of 8480 publications were identified, of which the 1462 most frequently cited ones were included in the visualisation. The publications were clustered into 19 clusters, which were grouped into three categories: (1) magnitude of PS problems (42% of all included publications); (2) PS risk factors (31%) and (3) implementation of solutions (19%). In the visualisation of PS-related terms, five clusters were identified: (1) medication; (2) measuring harm; (3) PS culture; (4) physician; (5) training, education and communication. Both analysis at publication and term level indicate an increasing focus on risk factors. Conclusions A bibliometric visualisation approach makes it possible to analyse large amounts of literature. This approach is very useful for improving one's understanding of a complex research topic such as PS and for suggesting new research directions or alternative research priorities. 
For PS research, the approach suggests that more research on implementing PS improvement initiatives might be needed. PMID:24625640
Nguyen, V H; Pham, H T; Diep, T T; Phan, C D H; Nguyen, T Q; Nguyen, N T N; Ngo, T C; Nguyen, T V; Do, Q K; Phan, H C; Nguyen, B M; Ehara, M; Ohnishi, M; Yamashiro, T; Nguyen, L T P; Izumiya, H
2016-04-01
The Vibrio cholerae O1 (VCO1) El Tor biotype appeared during the seventh cholera pandemic starting in 1961, and new variants of this biotype have been identified since the early 1990s. This pandemic has affected Vietnam, and a large outbreak was reported in southern Vietnam in 2010. Pulsed-field gel electrophoresis (PFGE) and multilocus variable-number tandem-repeat analyses (MLVA) were used to screen 34 VCO1 isolates from the southern Vietnam 2010 outbreak (23 patients, five contact persons, and six environmental isolates) to determine whether they were genetically distinct from 18 isolates from outbreaks in southern Vietnam from 1999 to 2004, and two isolates from northern Vietnam (2008). Twenty-seven MLVA types and seven PFGE patterns were identified. Both analyses showed that the 2008 and 2010 isolates were distinctly clustered and separated from the 1999-2004 isolates.
Evaluation of (GTG)5-PCR for identification of Enterococcus spp.
Svec, Pavel; Vancanneyt, Marc; Seman, Milan; Snauwaert, Cindy; Lefebvre, Karen; Sedlácek, Ivo; Swings, Jean
2005-06-01
A set of reference strains and a group of previously unidentified enterococci were analysed by rep-PCR with the (GTG)5 primer to evaluate the discriminatory power and suitability of this method for typing and identification of enterococcal species. A total of 49 strains representing all validly described species were obtained from bacterial collections. For more extensive evaluation of this identification approach 112 well-defined and identified enterococci isolated from bryndza cheese were tested. The (GTG)5-PCR fingerprinting assigned all strains into well-differentiated clusters representing individual species. Subsequently, a group including 44 unidentified enterococci isolated from surface waters was analysed to evaluate this method for identification of unknown isolates. Obtained band patterns allowed us to identify all the strains clearly to the species level. This study proved that rep-PCR with (GTG)5 primer is a reliable and fast method for species identification of enterococci.
Pan-genome and phylogeny of Bacillus cereus sensu lato.
Bazinet, Adam L
2017-08-02
Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., "pan-GWAS" analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are "core" (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. 
Extensive phylogenetic analyses using an unprecedented amount of data produced phylogenies that were largely concordant with each other and with previous studies. Phylogenetic support as measured by bootstrap probabilities increased markedly when all suitable pan-genome data was included in phylogenetic analyses, as opposed to when only core genes were used. Bayesian population genetic analysis recommended subdividing the three major clades of B. cereus s. l. into nine clusters. Taxa sharing common traits and species designations exhibited varying degrees of phylogenetic clustering. All phylogenetic analyses recapitulated two previously used classification systems, and taxa were consistently assigned to the same major clade and group. By including accessory genes from the pan-genome in the phylogenetic analyses, I produced an exceptionally well-supported phylogeny of 114 complete B. cereus s. l. genomes. The best-performing methods were used to produce a phylogeny of all 498 publicly available B. cereus s. l. genomes, which was in turn used to compare three different classification systems and to test the monophyly status of various B. cereus s. l. species. The majority of the methodology used in this study is generic and could be leveraged to produce pan-genome estimates and similarly robust phylogenetic hypotheses for other bacterial groups.
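Two quantities named in this abstract, the k-mer based Mash distance and the "core" gene criterion (presence in at least 99% of taxa), can be sketched in a few lines. This is an illustration under stated assumptions, not the study's workflow: Mash itself estimates the Jaccard index from MinHash sketches, whereas the sketch below computes it exactly on full k-mer sets; the function names, the toy sequences, and k = 4 are hypothetical.

```python
import math


def kmers(seq, k):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(seq_a, seq_b, k=4):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)), using an exact Jaccard j.

    Real Mash approximates j with MinHash sketches; the exact set
    computation here is a simplification for short toy sequences.
    """
    j = jaccard(kmers(seq_a, k), kmers(seq_b, k))
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


def core_genes(presence, taxa, min_fraction=0.99):
    """Genes present in at least `min_fraction` of the sampled taxa.

    presence maps gene -> set of taxa in which the gene was found.
    """
    return sorted(
        gene for gene, found_in in presence.items()
        if len(found_in & taxa) / len(taxa) >= min_fraction
    )


taxa = {"t1", "t2", "t3"}
presence = {"geneA": {"t1", "t2", "t3"}, "geneB": {"t1"}}
print(mash_distance("ACGTACGTAA", "ACGTACGTCC"))
print(core_genes(presence, taxa))
```

A matrix of such pairwise distances is the kind of input a distance-based tree builder such as FastME consumes.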
A null model for microbial diversification
Straub, Timothy J.
2017-01-01
Whether prokaryotes (Bacteria and Archaea) are naturally organized into phenotypically and genetically cohesive units comparable to animal or plant species remains contested, frustrating attempts to estimate how many such units there might be, or to identify the ecological roles they play. Analyses of gene sequences in various closely related prokaryotic groups reveal that sequence diversity is typically organized into distinct clusters, and processes such as periodic selection and extensive recombination are understood to be drivers of cluster formation (“speciation”). However, observed patterns are rarely compared with those obtainable with simple null models of diversification under stochastic lineage birth and death and random genetic drift. Via a combination of simulations and analyses of core and phylogenetic marker genes, we show that patterns of diversity for the genera Escherichia, Neisseria, and Borrelia are generally indistinguishable from patterns arising under a null model. We suggest that caution should thus be taken in interpreting observed clustering as a result of selective evolutionary forces. Unknown forces do, however, appear to play a role in Helicobacter pylori, and some individual genes in all groups fail to conform to the null model. Taken together, we recommend the presented birth−death model as a null hypothesis in prokaryotic speciation studies. It is only when the real data are statistically different from the expectations under the null model that some speciation process should be invoked. PMID:28630293
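The idea that clusters can arise without selection can be shown with a toy stochastic birth-death simulation. This is a minimal sketch, not the model used in the study: the per-step birth and death probabilities, the discrete time steps, and all function names are assumptions, and genetic drift at the sequence level is not modeled. Each lineage independently dies, persists, or persists and splits at every step, and surviving lineages are then grouped by founding ancestor.

```python
import random
from collections import Counter


def simulate_birth_death(n_start, birth_p, death_p, steps, seed=0):
    """Simulate lineages that die or split at random, with no selection.

    Returns the surviving lineage IDs and a child -> parent map.
    """
    rng = random.Random(seed)
    lineages = list(range(n_start))
    parent = {}
    next_id = n_start
    for _ in range(steps):
        survivors = []
        for lin in lineages:
            r = rng.random()
            if r < death_p:
                continue  # lineage goes extinct
            survivors.append(lin)
            if r < death_p + birth_p:
                # lineage splits: record the new daughter lineage
                parent[next_id] = lin
                survivors.append(next_id)
                next_id += 1
        lineages = survivors
    return lineages, parent


def founder(lin, parent):
    """Trace a lineage back to its founding ancestor."""
    while lin in parent:
        lin = parent[lin]
    return lin


def cluster_sizes(lineages, parent):
    """Number of surviving lineages descending from each founder."""
    return Counter(founder(lin, parent) for lin in lineages)


lineages, parent = simulate_birth_death(20, 0.2, 0.2, 30, seed=1)
print(cluster_sizes(lineages, parent))
```

Even with identical rates for every lineage, a few founders typically leave many descendants while most leave none, which is the kind of pattern a null model must be allowed to explain before selective forces are invoked.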
Admixture and gene flow from Russia in the recovering Northern European brown bear (Ursus arctos).
Kopatz, Alexander; Eiken, Hans Geir; Aspi, Jouni; Kojola, Ilpo; Tobiassen, Camilla; Tirronen, Konstantin F; Danilov, Pjotr I; Hagen, Snorre B
2014-01-01
Large carnivores were persecuted to near extinction during the last centuries, but have now recovered in some countries. It has been proposed earlier that the recovery of the Northern European brown bear is supported by migration from Russia. We tested this hypothesis by obtaining for the first time continuous sampling of the whole Finnish bear population, which is located centrally between the Russian and Scandinavian bear populations. The Finnish population is assumed to experience high gene flow from Russian Karelia. If so, no or a low degree of genetic differentiation between Finnish and Russian bears could be expected. We have genotyped bears extensively from all over Finland using 12 validated microsatellite markers and compared their genetic composition to bears from Russian Karelia, Sweden, and Norway. Our fine masked investigation identified two overlapping genetic clusters structured by isolation-by-distance in Finland (pairwise FST = 0.025). One cluster included Russian bears, and migration analyses showed a high number of migrants from Russia into Finland, providing evidence of eastern gene flow as an important driver during recovery. In comparison, both clusters excluded bears from Sweden and Norway, and we found no migrants from Finland in either country, indicating that eastern gene flow was probably not important for the population recovery in Scandinavia. Our analyses on different spatial scales suggest a continuous bear population in Finland and Russian Karelia, separated from Scandinavia.
Admixture and Gene Flow from Russia in the Recovering Northern European Brown Bear (Ursus arctos)
Kopatz, Alexander; Eiken, Hans Geir; Aspi, Jouni; Kojola, Ilpo; Tobiassen, Camilla; Tirronen, Konstantin F.; Danilov, Pjotr I.; Hagen, Snorre B.
2014-01-01
Large carnivores were persecuted to near extinction during the last centuries, but have now recovered in some countries. It has been proposed earlier that the recovery of the Northern European brown bear is supported by migration from Russia. We tested this hypothesis by obtaining for the first time continuous sampling of the whole Finnish bear population, which is located centrally between the Russian and Scandinavian bear populations. The Finnish population is assumed to experience high gene flow from Russian Karelia. If so, no or a low degree of genetic differentiation between Finnish and Russian bears could be expected. We have genotyped bears extensively from all over Finland using 12 validated microsatellite markers and compared their genetic composition to bears from Russian Karelia, Sweden, and Norway. Our fine masked investigation identified two overlapping genetic clusters structured by isolation-by-distance in Finland (pairwise FST = 0.025). One cluster included Russian bears, and migration analyses showed a high number of migrants from Russia into Finland, providing evidence of eastern gene flow as an important driver during recovery. In comparison, both clusters excluded bears from Sweden and Norway, and we found no migrants from Finland in either country, indicating that eastern gene flow was probably not important for the population recovery in Scandinavia. Our analyses on different spatial scales suggest a continuous bear population in Finland and Russian Karelia, separated from Scandinavia. PMID:24839968
Dubbs, Nicole L; Bazzoli, Gloria J; Shortell, Stephen M; Kralovec, Peter D
2004-02-01
To (a) assess how the original cluster categories of hospital-led health networks and systems have changed over time; (b) identify any new patterns of cluster configurations; and (c) demonstrate how additional data can be used to refine and enhance the taxonomy measures. Data sources: 1994 and 1998 American Hospital Association (AHA) Annual Survey of Hospitals. As in the original taxonomy, separate cluster solutions are identified for health networks and health systems by applying three strategic/structural dimensions (differentiation, integration, and centralization) to three components of the health service/product continuum (hospital services, physician arrangements, and provider-based insurance activities). Factor, cluster, and discriminant analyses are used to analyze the 1998 data. Descriptive and comparative methods are used to analyze the updated 1998 taxonomy relative to the original 1994 version. The 1998 cluster categories are similar to the original taxonomy; however, they reveal some new organizational configurations. For the health networks, centralization of product/service lines is occurring more selectively than in the past. For the health systems, participation has grown in and dispersed across a more diverse set of decentralized organizational forms. For both networks and systems, the definition of centralization has changed over time. In its updated form, the taxonomy continues to provide policymakers and practitioners with a descriptive and contextual framework against which to assess organizational programs and policies. There is a need to continue to revisit the taxonomy from time to time because of the persistent evolution of the U.S. health care industry and the consequent shifting of organizational configurations in this arena. There is also value in continuing to move the taxonomy in the direction of refinement/expansion as new opportunities become available.
Wheat EST resources for functional genomics of abiotic stress
Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey
2006-01-01
Background Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. 
PMID:16772040
Bandyopadhyay, Somnath; Connolly, Sean E; Jabado, Omar; Ye, June; Kelly, Sheila; Maldonado, Michael A; Westhovens, Rene; Nash, Peter; Merrill, Joan T; Townsend, Robert M
2017-01-01
To characterise patients with active SLE based on pretreatment gene expression-defined peripheral immune cell patterns and identify clusters enriched for potential responders to abatacept treatment. This post hoc analysis used baseline peripheral whole blood transcriptomic data from patients in a phase IIb trial of intravenous abatacept (~10 mg/kg/month). Cell-specific genes were used with a published deconvolution algorithm to identify immune cell proportions in patient samples, and unsupervised consensus clustering was generated. Efficacy data were re-analysed. Patient data (n=144: abatacept: n=98; placebo: n=46) were grouped into four main clusters (C) by predominant characteristic cells: C1-neutrophils; C2-cytotoxic T cells, B-cell receptor-ligated B cells, monocytes, IgG memory B cells, activated T helper cells; C3-plasma cells, activated dendritic cells, activated natural killer cells, neutrophils; C4-activated dendritic cells, cytotoxic T cells. C3 had the highest baseline total British Isles Lupus Assessment Group (BILAG) scores, highest anti-double-stranded DNA autoantibody levels and shortest time to flare (TTF), plus trends in favour of response to abatacept over placebo: adjusted mean difference in BILAG score over 1 year, -4.78 (95% CI -12.49 to 2.92); median TTF, 56 vs 6 days; greater normalisation of complement component 3 and 4 levels. Differential improvements with abatacept were not seen in other clusters, except for median TTF in C1 (201 vs 109 days). Immune cell clustering segmented disease severity and responsiveness to abatacept. Definition of immune response cell types may inform design and interpretation of SLE trials and treatment decisions. NCT00119678; results.
AMMI adjustment for statistical analysis of an international wheat yield trial.
Crossa, J; Fox, P N; Pfeiffer, W H; Rajaram, S; Gauch, H G
1991-01-01
Multilocation trials are important for the CIMMYT Bread Wheat Program in producing high-yielding, adapted lines for a wide range of environments. This study investigated procedures for improving predictive success of a yield trial, grouping environments and genotypes into homogeneous subsets, and determining the yield stability of 18 CIMMYT bread wheats evaluated at 25 locations. Additive Main effects and Multiplicative Interaction (AMMI) analysis gave more precise estimates of genotypic yields within locations than means across replicates. This precision facilitated formation by cluster analysis of more cohesive groups of genotypes and locations for biological interpretation of interactions than occurred with unadjusted means. Locations were clustered into two subsets for which genotypes with positive interactions manifested in high, stable yields were identified. The analyses highlighted superior selections with both broad and specific adaptation.
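As an illustration of the AMMI adjustment described in this abstract, the following is a minimal sketch of the standard AMMI decomposition, assuming a complete genotype-by-location table of cell means. It is not the authors' code: the function name `ammi_fit`, the parameter `n_axes`, and the example values are hypothetical. The additive main effects are fitted first, and the leading singular axes of the residual interaction matrix are then added back to give the adjusted estimates.

```python
import numpy as np

def ammi_fit(means, n_axes=1):
    """Return AMMI-adjusted cell estimates.

    means  : 2-D array, genotypes (rows) x locations (columns), cell means.
    n_axes : number of multiplicative interaction axes to retain
             (hypothetical parameter name; 0 gives the additive model).
    """
    means = np.asarray(means, dtype=float)
    grand = means.mean()
    geno_eff = means.mean(axis=1) - grand   # genotype main effects
    loc_eff = means.mean(axis=0) - grand    # location main effects
    additive = grand + geno_eff[:, None] + loc_eff[None, :]

    # Double-centred residuals hold the genotype x location interaction.
    interaction = means - additive

    # Singular value decomposition splits the interaction into ordered axes.
    u, s, vt = np.linalg.svd(interaction, full_matrices=False)
    multiplicative = (u[:, :n_axes] * s[:n_axes]) @ vt[:n_axes, :]
    return additive + multiplicative

# Made-up yields for 4 genotypes at 3 locations (illustrative values only).
example = np.array([[4.0, 5.0, 6.5],
                    [3.5, 5.5, 5.0],
                    [5.0, 4.5, 7.0],
                    [4.2, 5.1, 6.0]])
adjusted = ammi_fit(example, n_axes=1)
```

Retaining only the first few axes discards the later ones, which are usually treated as mostly noise. The adjusted estimates can then be passed to a cluster analysis in place of the unadjusted means.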
Dong, Ying; Matigian, Nick; Harvey, Tracey J; Samaratunga, Hemamali; Hooper, John D; Clements, Judith A
2008-02-01
Tissue kallikrein (kallikrein 1) was first identified in pancreas and is the namesake of the kallikrein-related peptidase (KLK) family. KLK1 and the other 14 members of the human KLK family are encoded by 15 serine protease genes clustered at chromosome 19q13.4. Our Northern blot analysis of 19 normal human tissues for expression of KLK4 to KLK15 identified pancreas as a common expression site for the gene cluster spanning KLK5 to KLK13, as well as for KLK15 which is located adjacent to KLK1. Consistent with previous reports detailing the ability of KLK genes to generate organ- and disease-specific transcripts, detailed molecular and in silico analyses indicated that KLK5 and KLK7 generate transcripts in pancreas that differ from those in skin or ovary. Consistently, we identified in the promoters of these KLK genes motifs which conform with consensus binding sites for transcription factors conferring pancreatic expression. In addition, immunohistochemical analysis revealed predominant localisation of KLK5 and KLK7 in acinar cells of the exocrine pancreas, suggesting roles for these enzymes in digestion. Our data also support expression patterns derived from gene duplication events in the human KLK cluster. These findings suggest that, in addition to KLK1, other related KLK enzymes will function in the exocrine pancreas.