time cluster analysis: Topics by Science.gov

Sample records for time cluster analysis

Near real-time space-time cluster analysis for detection of enteric disease outbreaks in a community setting.

PubMed

Glatman-Freedman, Aharona; Kaufman, Zalman; Kopel, Eran; Bassal, Ravit; Taran, Diana; Valinsky, Lea; Agmon, Vered; Shpriz, Manor; Cohen, Daniel; Anis, Emilia; Shohat, Tamy

2016-08-01

To enhance timely surveillance of bacterial enteric pathogens, space-time cluster analysis was introduced in Israel in May 2013. Stool isolation data of Salmonella, Shigella, and Campylobacter from patients of a large Health Maintenance Organization were analyzed weekly by ArcGIS and SaTScan, and cluster results were sent promptly to local departments of health (LDOHs). During eighteen months, we identified 52 Shigella sonnei clusters, two Salmonella clusters, and no Campylobacter clusters. S. sonnei clusters lasted from one to 33 days and included three to 30 individuals. Thirty-one (60%) of the S. sonnei clusters were known to LDOHs prior to cluster analysis. Clusters not previously known by the LDOHs prompted epidemiologic investigations. In 31 of the 37 (84%) confirmed clusters, educational institutes (nursery schools, kindergartens, and a primary school) were involved. Cluster analysis demonstrated capability to complement enteric disease surveillance. Scaling up the system can further enhance timely detection and control of outbreaks. Copyright © 2016 The British Infection Association. Published by Elsevier Ltd. All rights reserved.
Regression analysis of clustered failure time data with informative cluster size under the additive transformation models.

PubMed

Chen, Ling; Feng, Yanqin; Sun, Jianguo

2017-10-01

This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
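The within-cluster resampling idea mentioned above can be sketched roughly as follows: draw one member from each cluster, apply a standard method to the resulting independent sample, repeat many times, and average the estimates. This is only an illustration, not the authors' procedure. It swaps a simple mean in for the additive transformation model, and the function name `wcr_estimate` and the toy data are hypothetical.

```python
import random
from statistics import mean

def wcr_estimate(clusters, estimator, n_resamples=2000, seed=0):
    """Average an estimator over resamples that each contain exactly
    one randomly chosen member per cluster (within-cluster resampling)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # One observation per cluster, so every cluster counts equally
        sample = [rng.choice(members) for members in clusters]
        estimates.append(estimator(sample))
    return mean(estimates)

# Hypothetical toy data in which larger clusters have larger values,
# i.e. cluster size is informative about the outcome.
clusters = [[1.0], [2.0, 2.0], [5.0, 5.0, 5.0, 5.0]]

pooled = mean(x for c in clusters for x in c)  # dominated by the largest cluster
wcr = wcr_estimate(clusters, mean)             # each cluster weighted equally
```

In this toy case the pooled mean is pulled toward the largest cluster, whereas the resampled estimate treats each cluster as one unit, which is also what weighting by the inverse of cluster size aims for.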
TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

PubMed

Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

2017-12-01

Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting three-dimensional data, i.e. gene-time-condition. The computational complexity of analyzing such data is very high compared to the already difficult NP-hard two-dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression patterns in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools, and only TimesVector successfully detected clusters with differential expression patterns across conditions. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. Contact: sunkim.bioinfo@snu.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Spatiotemporal Analysis of the Ebola Hemorrhagic Fever in West Africa in 2014

NASA Astrophysics Data System (ADS)

Xu, M.; Cao, C. X.; Guo, H. F.

2017-09-01

Ebola hemorrhagic fever (EHF) is an acute hemorrhagic disease caused by the Ebola virus, which is highly contagious. This paper aimed to explore the possible clustering areas of EHF cases in West Africa in 2014 and to identify endemic areas and their trends by means of space-time analysis. We mapped the distribution of EHF incidence and explored statistically significant space, time, and space-time disease clusters. We used hotspot analysis to find the spatial clustering pattern on the basis of the actual outbreak cases. Spatial-temporal cluster analysis was used to analyze the spatial or temporal distribution of disease aggregation and to examine whether that distribution is statistically significant. Local clusters were investigated using Kulldorff's scan statistic approach. The results reveal that the epidemic was mainly concentrated in the western part of Africa near the North Atlantic, with an obvious regional distribution. For the current epidemic, we found areas of high EVD incidence by means of spatial cluster analysis.
Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

PubMed

Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra

2016-11-20

The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
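To make the inflation mentioned in the first sentence concrete, here is a minimal sketch of the textbook design effect for a simple parallel cluster randomised trial with equal cluster sizes, 1 + (m - 1) * ICC. It is not the longitudinal formula derived in the paper, which also involves the cluster and individual autocorrelations. The function names and example numbers are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Textbook design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def inflated_sample_size(n_individual, cluster_size, icc):
    """Total participants needed once clustering is accounted for.

    n_individual is the sample size an individually randomised trial
    would need for the same power (assumed to be computed elsewhere).
    """
    n = n_individual * design_effect(cluster_size, icc)
    # round() guards against floating-point noise just above an integer
    return math.ceil(round(n, 9))

# Hypothetical example: 200 participants under individual randomisation,
# clusters of 20, intracluster correlation 0.05.
n_total = inflated_sample_size(200, 20, 0.05)
n_clusters = math.ceil(n_total / 20)
```

With an intracluster correlation of zero the design effect is 1 and no inflation occurs; the inflation grows with both cluster size and correlation.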
Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

PubMed

Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

2014-11-01

Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Bimodal Hybrid Model for Time-Dependent Probabilistic Seismic Hazard Analysis

NASA Astrophysics Data System (ADS)

Yaghmaei-Sabegh, Saman; Shoaeifar, Nasser; Shoaeifar, Parva

2018-03-01

The evaluation of evidence provided by geological studies and historical catalogs indicates that in some seismic regions and faults, multiple large earthquakes occur in clusters. The occurrence of large earthquakes is then followed by quiescence, during which only small-to-moderate earthquakes take place. Clustering of large earthquakes is the most distinguishable departure from the assumption of constant hazard of random occurrence of earthquakes in conventional seismic hazard analysis. In the present study, a time-dependent recurrence model is proposed to consider a series of large earthquakes that occur in clusters. The model is flexible enough to better reflect the quasi-periodic behavior of large earthquakes with long-term clustering, and it can be used in time-dependent probabilistic seismic hazard analysis for engineering purposes. In this model, the time-dependent hazard results are estimated by a hazard function that comprises three parts: a decreasing hazard of the last large earthquake cluster, an increasing hazard of the next large earthquake cluster, and a constant hazard of random occurrence of small-to-moderate earthquakes. In the final part of the paper, the time-dependent seismic hazard of the New Madrid Seismic Zone at different time intervals is calculated for illustrative purposes.
Transcriptional and Chromatin Dynamics of Muscle Regeneration After Severe Trauma

DTIC Science & Technology

2016-10-12

performed pathway analysis of the time-clustered RNA-Seq data and showed an initial burst of pro-inflammatory and immune-response transcripts in the ... 143 showed dynamic behavior (See Methods) and analysis of the dynamic miRNAs reinforced many of the results observed from the RNA-Seq datasets ... excellent agreement was viewed. Hierarchical clustering of the datasets through time revealed 5 clusters, and gene ontology (GO) analysis of the
Coronal Mass Ejection Data Clustering and Visualization of Decision Trees

NASA Astrophysics Data System (ADS)

Ma, Ruizhe; Angryk, Rafal A.; Riley, Pete; Filali Boubrahimi, Soukaina

2018-05-01

Coronal mass ejections (CMEs) can be categorized as either “magnetic clouds” (MCs) or non-MCs. Features such as a large magnetic field, low plasma-beta, and low proton temperature suggest that a CME event is also an MC event; however, so far there is neither a definitive method nor an automatic process to distinguish the two. Human labeling is time-consuming, and results can fluctuate owing to the imprecise definition of such events. In this study, we approach the problem of MC and non-MC distinction from a time series data analysis perspective and show how clustering can shed some light on this problem. Although many algorithms exist for traditional data clustering in the Euclidean space, they are not well suited for time series data. Problems such as inadequate distance measure, inaccurate cluster center description, and lack of intuitive cluster representations need to be addressed for effective time series clustering. Our data analysis in this work is twofold: clustering and visualization. For clustering we compared the results from the popular hierarchical agglomerative clustering technique to a distance density clustering heuristic we developed previously for time series data clustering. In both cases, dynamic time warping will be used for similarity measure. For classification as well as visualization, we use decision trees to aggregate single-dimensional clustering results to form a multidimensional time series decision tree, with averaged time series to present each decision. In this study, we achieved modest accuracy and, more importantly, an intuitive interpretation of how different parameters contribute to an MC event.
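As a rough illustration of why dynamic time warping (DTW) suits time series better than a pointwise Euclidean-style comparison, the sketch below implements the classic dynamic-programming DTW distance with an absolute-difference local cost. It is not the authors' code, and the series are made-up examples; a matrix of such pairwise distances could then be passed to a hierarchical agglomerative clustering routine.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance
    using |x - y| as the local cost and no warping-window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def pointwise_distance(a, b):
    """Sum of absolute differences for equal-length series (no warping)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical series: the same bump, delayed by one time step.
bump = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1]

d_dtw = dtw_distance(bump, delayed)
d_pointwise = pointwise_distance(bump, delayed)
```

Here the warped alignment absorbs most of the delay, so the DTW distance is much smaller than the pointwise one even though the two series have the same shape.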
Functional clustering of time series gene expression data by Granger causality

PubMed Central

2012-01-01

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
Narcolepsy with and without cataplexy, idiopathic hypersomnia with and without long sleep time: a cluster analysis.

PubMed

Šonka, Karel; Šusta, Marek; Billiard, Michel

2015-02-01

The successive editions of the International Classification of Sleep Disorders (ICSD) reflect the evolution of the concepts of various sleep disorders. This is particularly the case for central disorders of hypersomnolence, with continuous changes in terminology and divisions of narcolepsy, idiopathic hypersomnia, and recurrent hypersomnia. According to the ICSD 2nd Edition (ICSD-2), narcolepsy with cataplexy (NwithC), narcolepsy without cataplexy (Nw/oC), idiopathic hypersomnia with long sleep time (IHwithLST), and idiopathic hypersomnia without long sleep time (IHw/oLST) are four, well-defined hypersomnias of central origin. However, in the absence of biological markers, doubts have been raised as to the relevance of a division of idiopathic hypersomnia into two forms, and it is not yet clear whether Nw/oC and IHw/oLST are two distinct entities. With this in mind, it was decided to empirically review the ICSD-2 classification by using a hierarchical cluster analysis to see whether this division has some relevance, even though the terms "with long sleep time" and "without long sleep time" are inappropriate. The cluster analysis differentiated three main clusters: Cluster 1, "combined monosymptomatic hypersomnia/narcolepsy type 2" (people initially diagnosed with IHw/oLST and Nw/oC); Cluster 2 "polysymptomatic hypersomnia" (people initially diagnosed with IHwithLST); and Cluster 3, narcolepsy type 1 (people initially diagnosed with NwithC). Cluster analysis confirmed that narcolepsy type 1 and polysymptomatic hypersomnia are independent sleep disorders. People who were initially diagnosed with Nw/oC and IHw/oLST formed a single cluster, referred to as "combined monosymptomatic hypersomnia/narcolepsy type 2." Copyright © 2014 Elsevier B.V. All rights reserved.
Comparison of Salmonella enteritidis phage types isolated from layers and humans in Belgium in 2005.

PubMed

Welby, Sarah; Imberechts, Hein; Riocreux, Flavien; Bertrand, Sophie; Dierick, Katelijne; Wildemauwe, Christa; Hooyberghs, Jozef; Van der Stede, Yves

2011-08-01

The aim of this study was to investigate the available results for Belgium of the European Union coordinated monitoring program (2004/665 EC) on Salmonella in layers in 2005, as well as the results of the monthly outbreak reports of Salmonella Enteritidis in humans in 2005, to identify a possible statistically significant trend in both populations. Separate descriptive statistics and univariate analyses were carried out, and parametric and/or non-parametric hypothesis tests were conducted. A time cluster analysis was performed for all Salmonella Enteritidis phage types (PTs) isolated. The proportions of each Salmonella Enteritidis PT in layers and in humans were compared and the monthly distribution of the most common PT, isolated in both populations, was evaluated. The time cluster analysis revealed significant clusters during the months May and June for layers and May, July, August, and September for humans. PT21, the most frequently isolated PT in both populations in 2005, seemed to be responsible for these significant clusters. PT4 was the second most frequently isolated PT. No significant difference was found for the monthly trend evolution of both PTs in both populations based on parametric and non-parametric methods. A similar monthly trend of PT distribution in humans and layers during the year 2005 was observed. The time cluster analysis and the statistical significance testing confirmed these results. Moreover, the time cluster analysis showed significant clusters during the summer that were slightly delayed in time (humans after layers). These results suggest a common link between the prevalence of Salmonella Enteritidis in layers and the occurrence of the pathogen in humans. Phage typing was confirmed to be a useful tool for identifying temporal trends.
Cluster Analysis of Time-Dependent Crystallographic Data: Direct Identification of Time-Independent Structural Intermediates

PubMed Central

Kostov, Konstantin S.; Moffat, Keith

2011-01-01

The initial output of a time-resolved macromolecular crystallography experiment is a time-dependent series of difference electron density maps that displays the time-dependent changes in underlying structure as a reaction progresses. The goal is to interpret such data in terms of a small number of crystallographically refinable, time-independent structures, each associated with a reaction intermediate; to establish the pathways and rate coefficients by which these intermediates interconvert; and thereby to elucidate a chemical kinetic mechanism. One strategy toward achieving this goal is to use cluster analysis, a statistical method that groups objects based on their similarity. If the difference electron density at a particular voxel in the time-dependent difference electron density (TDED) maps is sensitive to the presence of one and only one intermediate, then its temporal evolution will exactly parallel the concentration profile of that intermediate with time. The rationale is therefore to cluster voxels with respect to the shapes of their TDEDs, so that each group or cluster of voxels corresponds to one structural intermediate. Clusters of voxels whose TDEDs reflect the presence of two or more specific intermediates can also be identified. From such groupings one can then infer the number of intermediates, obtain their time-independent difference density characteristics, and refine the structure of each intermediate. We review the principles of cluster analysis and clustering algorithms in a crystallographic context, and describe the application of the method to simulated and experimental time-resolved crystallographic data for the photocycle of photoactive yellow protein. PMID:21244840
Space-time analysis of pneumonia hospitalisations in the Netherlands.

PubMed

Benincà, Elisa; van Boven, Michiel; Hagenaars, Thomas; van der Hoek, Wim

2017-01-01

Community acquired pneumonia is a major global public health problem. In the Netherlands there are 40,000-50,000 hospital admissions for pneumonia per year. In the large majority of these hospital admissions the etiologic agent is not determined and a real-time surveillance system is lacking. Localised and temporal increases in hospital admissions for pneumonia are therefore only detected retrospectively and the etiologic agents remain unknown. Here, we perform spatio-temporal analyses of pneumonia hospital admission data in the Netherlands. To this end, we scanned for spatial clusters on yearly and seasonal basis, and applied wavelet cluster analysis on the time series of five main regions. The pneumonia hospital admissions show strong clustering in space and time superimposed on a regular yearly cycle with high incidence in winter and low incidence in summer. Cluster analysis reveals a heterogeneous pattern, with most significant clusters occurring in the western, highly urbanised, and in the eastern, intensively farmed, part of the Netherlands. Quantitatively, the relative risk (RR) of the significant clusters for the age-standardised incidence varies from a minimum of 1.2 to a maximum of 2.2. We discuss possible underlying causes for the patterns observed, such as variations in air pollution.
Clustering of Dietary Patterns, Lifestyles, and Overweight among Spanish Children and Adolescents in the ANIBES Study

PubMed Central

Pérez-Rodrigo, Carmen; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M.; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier

2015-01-01

Weight gain has been associated with behaviors related to diet, sedentary lifestyle, and physical activity. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, and sleep time in Spanish children and adolescents and whether the identified clusters could be associated with overweight. Analysis was based on a subsample (n = 415) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, and sleep time. Logistic regression analysis was used to explore the association between the cluster solutions and overweight. Factor analysis identified four dietary patterns, one reflecting a profile closer to the traditional Mediterranean diet. Dietary patterns, physical activity behaviors, sedentary behaviors and sleep time on weekdays in Spanish children and adolescents clustered into two different groups: a low physical activity-poorer diet lifestyle pattern, which included a higher proportion of girls, and a high physical activity, low sedentary behavior, longer sleep duration, healthier diet lifestyle pattern. Although increased risk of being overweight was not significant, the Prevalence Ratios (PRs) for the low physical activity-poorer diet lifestyle pattern were >1 in children and in adolescents. The healthier lifestyle pattern included lower proportions of children and adolescents from low socioeconomic status backgrounds. PMID:26729155
Time fluctuation analysis of forest fire sequences

NASA Astrophysics Data System (ADS)

Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

2013-04-01

Forest fires are complex events involving both space and time fluctuations. Understanding their dynamics and pattern distribution is of great importance for improving resource allocation and supporting fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires that occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (e.g. crime, epidemiology, biodiversity, geomarketing). An important contribution of this research deals with the analysis and estimation of local measures of clustering, which help in understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (e.g. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. The clustering measure value then depends on the threshold, which helps to understand the time pattern of the studied events. Our findings detected the presence of overdensity of events in particular time periods and showed that the forest fire sequences in Portugal can be considered as a multifractal process with a degree of time-clustering of the events. Key words: time sequences, Morisita index, fractals, multifractals, box-counting, Ripley's K-function, Allan Factor, variography, forest fires, point process. Acknowledgements: This work was partly supported by the SNFS Project No. 200021-140658, "Analysis and Modelling of Space-Time Patterns in Complex Regions". References: Kanevski M. (Editor). 2008. Advanced Mapping of Environmental Data: Geostatistics, Machine Learning and Bayesian Maximum Entropy. London / Hoboken: iSTE / Wiley. Telesca L. and Pereira M.G. 2010. Time-clustering investigation of fire temporal fluctuations in Portugal, Nat. Hazards Earth Syst. Sci., vol. 10(4): 661-666. Vega Orozco C., Tonini M., Conedera M., Kanevski M. 2012. Cluster recognition in spatial-temporal sequences: the case of forest fires, Geoinformatica, vol. 16(4): 653-673.
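One of the measures listed above, the Allan Factor, can be sketched as follows under its usual definition: the mean squared difference between event counts in adjacent windows of length T, divided by twice the mean count. Values near 1 are expected for a homogeneous Poisson process, values above 1 suggest clustering, and values below 1 suggest regularity. This is a minimal illustration, not the authors' implementation, and the event times are hypothetical.

```python
def allan_factor(event_times, window, t_start, t_end):
    """Allan Factor for counting windows of a given length:
    AF = mean((N[k+1] - N[k])**2) / (2 * mean(N[k]))."""
    n_windows = int((t_end - t_start) // window)
    if n_windows < 2:
        raise ValueError("need at least two complete windows")
    # Count events in each complete, non-overlapping window
    counts = [0] * n_windows
    for t in event_times:
        k = int((t - t_start) // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the observation period")
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)

# Hypothetical sequences on the interval [0, 4) with unit windows
regular = [0.5, 1.5, 2.5, 3.5]                         # one event per window
bursty = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]      # events arrive in bursts

af_regular = allan_factor(regular, 1.0, 0.0, 4.0)
af_bursty = allan_factor(bursty, 1.0, 0.0, 4.0)
```

Repeating the calculation over a range of window lengths, and comparing against simulated Poisson sequences over the same period, gives the kind of scale-dependent comparison the abstract describes.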
Subgroups of advanced cancer patients clustered by their symptom profiles: quality-of-life outcomes.

PubMed

Husain, Amna; Myers, Jeff; Selby, Debbie; Thomson, Barbara; Chow, Edward

2011-11-01

Symptom cluster analysis is a new frontier of research in symptom management. This study clustered patients by their symptom profiles to identify subgroups that may be at higher risk for poor quality of life (QOL) and that may, therefore, benefit most from targeted interventions. This was a longitudinal study of metastatic cancer patients using the Edmonton Symptom Assessment Scale (ESAS). We generated two-, three-, and four-cluster subgroups and examined the relationship of cluster membership with patient outcomes. To address the problem of missing longitudinal data, we developed a novel outcome variable (QualTime) that measures both QOL and time in study. Two hundred and twenty-one patients with a mean Palliative Performance Scale (PPS) of 59.1 were enrolled. The three-cluster model was chosen for further analysis. The low-burden subgroup had all low severity symptom scores. The intermediate subgroup separates from the low-burden group on the "debility" profile of fatigue, drowsiness, appetite, and well-being. The high-burden group separates from the intermediate-burden group on pain, depression, and anxiety. At baseline, PPS (p=0.0003) and cluster membership (p<0.0001) contributed significantly to global QOL. In univariate analysis, cluster membership was related to the longitudinal outcome, QualTime. In a multivariate model, the relationship of PPS to QualTime was still significant (p=0.0002), but subgroup membership was no longer significant (p=0.1009). PPS is a stronger predictor of the longitudinal variable than cluster subgroups; however, cluster subgroups provide a target for clinical interventions that may improve QOL.
[Analysis of Time-to-onset of Interstitial Lung Disease after the Administration of Small Molecule Molecularly-targeted Drugs].

PubMed

Komada, Fusao

2018-01-01

The aim of this study was to investigate the time-to-onset of drug-induced interstitial lung disease (DILD) following the administration of small molecule molecularly-targeted drugs via the use of the spontaneous adverse reaction reporting system of the Japanese Adverse Drug Event Report database. DILD datasets for afatinib, alectinib, bortezomib, crizotinib, dasatinib, erlotinib, everolimus, gefitinib, imatinib, lapatinib, nilotinib, osimertinib, sorafenib, sunitinib, temsirolimus, and tofacitinib were used to calculate the median onset times of DILD and the Weibull distribution parameters, and to perform the hierarchical cluster analysis. The median onset times of DILD for afatinib, bortezomib, crizotinib, erlotinib, gefitinib, and nilotinib were within one month. The median onset times of DILD for dasatinib, everolimus, lapatinib, osimertinib, and temsirolimus ranged from 1 to 2 months. The median onset times of DILD for alectinib, imatinib, and tofacitinib ranged from 2 to 3 months. The median onset times of DILD for sunitinib and sorafenib ranged from 8 to 9 months. Weibull distributions for these drugs when using the cluster analysis showed that there were 4 clusters. Cluster 1 described a subgroup with early to later onset DILD and early failure type profiles or a random failure type profile. Cluster 2 exhibited early failure type profiles or a random failure type profile with early onset DILD. Cluster 3 exhibited a random failure type profile or wear out failure type profiles with later onset DILD. Cluster 4 exhibited an early failure type profile or a random failure type profile with the latest onset DILD.
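The failure-type vocabulary above follows the usual reading of the Weibull shape parameter: a shape below 1 gives a hazard that falls over time (early failure), a shape of 1 gives a constant hazard (random failure), and a shape above 1 gives a rising hazard (wear-out failure). The sketch below illustrates that reading. The classification rule based on a confidence interval for the shape is an assumption on my part, not something the abstract states, and the parameter values are hypothetical.

```python
import math

def weibull_hazard(t, scale, shape):
    """Weibull hazard h(t) = (shape / scale) * (t / scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def weibull_median(scale, shape):
    """Median time-to-onset of a Weibull distribution: scale * ln(2)**(1/shape)."""
    return scale * math.log(2) ** (1 / shape)

def failure_type(shape_ci_low, shape_ci_high):
    """Classify a profile from a confidence interval for the shape.

    Assumed convention: interval entirely below 1 -> early failure,
    entirely above 1 -> wear-out failure, containing 1 -> random failure.
    """
    if shape_ci_high < 1:
        return "early failure"
    if shape_ci_low > 1:
        return "wear-out failure"
    return "random failure"

# Hypothetical parameters (time in days)
median_days = weibull_median(scale=30.0, shape=1.0)
profile = failure_type(0.6, 0.9)
```

For a shape of exactly 1 the Weibull reduces to the exponential distribution, so the hazard is constant at 1/scale and the median is scale times ln 2.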
Multi-scale visual analysis of time-varying electrocorticography data via clustering of brain regions

DOE PAGES

Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; ...

2017-06-06

There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases requires the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend their time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.
Multivariate time series clustering on geophysical data recorded at Mt. Etna from 1996 to 2003

NASA Astrophysics Data System (ADS)

Di Salvo, Roberto; Montalto, Placido; Nunnari, Giuseppe; Neri, Marco; Puglisi, Giuseppe

2013-02-01

Time series clustering is an important data analysis task for extracting implicit, previously unknown, and potentially useful information from a large collection of data. Finding useful similar trends in multivariate time series represents a challenge in several areas, including geophysics and environmental research. While traditional time series analysis methods deal only with univariate time series, multivariate time series analysis is a more suitable approach in fields of research where different kinds of data are available. Moreover, conventional time series clustering techniques do not provide the desired results for geophysical datasets because of the huge amount of data, whose sampling rate differs according to the nature of the signal. In this paper, a novel approach to geophysical multivariate time series clustering is proposed using dynamic time series segmentation and Self Organizing Maps techniques. This method allows finding coupling among trends of different geophysical data recorded from monitoring networks at Mt. Etna spanning from 1996 to 2003, when the transition from summit eruptions to flank eruptions occurred. This information can be used to carry out a more careful evaluation of the state of the volcano and to define potential hazard assessment at Mt. Etna.

Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability.

PubMed

Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R

2016-11-01

To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P < 0.05). Preliminary work suggested three clusters by retaining the I-NSD and splitting the I-SSD cluster into two: I-SSD A (n = 29): defined by high WASO and I-SSD B (n = 14): a second I-SSD cluster with high SOL and medium WASO. The I-SSD B cluster performed worse than I-SSD A and I-NSD for sustained attention (P ≤ 0.05). In an exploratory analysis, q-EEG revealed reduced spectral power also in I-SSD B before (Delta, Alpha, Beta-1) and after sleep-onset (Beta-2) compared to I-SSD A and I-NSD (P ≤ 0.05). Two insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. © 2016 Associated Professional Sleep Societies, LLC.
Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

PubMed

Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

2013-03-01

Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
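The alignment idea behind temporal clustering can be illustrated with a minimal sketch. Note that this is plain dynamic time warping (DTW) on one-dimensional sequences, not the generalized dynamic time alignment kernel that HACA combines with kernel k-means; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration only.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences (illustrative)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

Two segments that trace the same shape at different speeds, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is why alignment-based similarities suit actions that vary in temporal scale.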
Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel-based "mouse pup syllable classification calculator".

PubMed

Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J

2012-01-01

Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.
Hierarchical Spatio-temporal Visual Analysis of Cluster Evolution in Electrocorticography Data

DOE PAGES

Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; ...

2016-10-02

Here, we present ECoG ClusterFlow, a novel interactive visual analysis tool for the exploration of high-resolution Electrocorticography (ECoG) data. Our system detects and visualizes dynamic high-level structures, such as communities, using the time-varying spatial connectivity network derived from the high-resolution ECoG data. ECoG ClusterFlow provides a multi-scale visualization of the spatio-temporal patterns underlying the time-varying communities using two views: 1) an overview summarizing the evolution of clusters over time and 2) a hierarchical glyph-based technique that uses data aggregation and small multiples techniques to visualize the propagation of clusters in their spatial domain. ECoG ClusterFlow makes it possible 1) to compare the spatio-temporal evolution patterns across various time intervals, 2) to compare the temporal information at varying levels of granularity, and 3) to investigate the evolution of spatial patterns without occluding the spatial context information. Lastly, we present case studies done in collaboration with neuroscientists on our team for both simulated and real epileptic seizure data aimed at evaluating the effectiveness of our approach.
Text mining to decipher free-response consumer complaints: insights from the NHTSA vehicle owner's complaint database.

PubMed

Ghazizadeh, Mahtab; McDonald, Anthony D; Lee, John D

2014-09-01

This study applies text mining to extract clusters of vehicle problems and associated trends from free-response data in the National Highway Traffic Safety Administration's vehicle owner's complaint database. As the automotive industry adopts new technologies, it is important to systematically assess the effect of these changes on traffic safety. Driving simulators, naturalistic driving data, and crash databases all contribute to a better understanding of how drivers respond to changing vehicle technology, but other approaches, such as automated analysis of incident reports, are needed. Free-response data from incidents representing two severity levels (fatal incidents and incidents involving injury) were analyzed using a text mining approach: latent semantic analysis (LSA). LSA and hierarchical clustering identified clusters of complaints for each severity level, which were compared and analyzed across time. Cluster analysis identified eight clusters of fatal incidents and six clusters of incidents involving injury. Comparisons showed that although the airbag clusters across the two severity levels have the same most frequent terms, the circumstances around the incidents differ. The time trends show clear increases in complaints surrounding the Ford/Firestone tire recall and the Toyota unintended acceleration recall. Increases in complaints may be partially driven by these recall announcements and the associated media attention. Text mining can reveal useful information from free-response databases that would otherwise be prohibitively time-consuming and difficult to summarize manually. Text mining can extend human analysis capabilities for large free-response databases to support earlier detection of problems and more timely safety interventions.
Bootstrap-based methods for estimating standard errors in Cox's regression analyses of clustered event times.

PubMed

Xiao, Yongling; Abrahamowicz, Michal

2010-03-30

We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid serious variance under-estimation by conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of cluster event times.
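The two resampling schemes described above can be sketched as follows. This is a minimal illustration of the resampling step only, not the authors' implementation: the function names, the dictionary layout (cluster id mapped to a list of individual records), and the explicit random generator argument are assumptions. Refitting Cox's model on each resample and taking the spread of the estimates as the SE is left out.

```python
import random

def cluster_bootstrap(clusters, rng):
    """Resample whole clusters with replacement; members are kept intact."""
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]  # as many draws as clusters
    return [list(clusters[c]) for c in drawn]

def two_step_bootstrap(clusters, rng):
    """Resample clusters, then resample individuals within each drawn cluster."""
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]
    # Step (ii): within each selected cluster, draw members with replacement,
    # keeping the cluster's original size.
    return [[rng.choice(clusters[c]) for _ in clusters[c]] for c in drawn]

# Hypothetical toy data: three clusters of individual records.
toy = {"a": [1, 2, 3], "b": [4, 5], "c": [6]}
sample = two_step_bootstrap(toy, random.Random(0))
```

The only difference between the two functions is the inner resampling of individuals, which is the step the abstract reports as over-estimating the variance for individual-level covariates.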
The `TTIME' Package: Performance Evaluation in a Cluster Computing Environment

NASA Astrophysics Data System (ADS)

Howe, Marico; Berleant, Daniel; Everett, Albert

2011-06-01

The objective of translating developmental event time across mammalian species is to gain an understanding of the timing of human developmental events based on known time of those events in animals. The potential benefits include improvements to diagnostic and intervention capabilities. The CRAN `ttime' package provides the functionality to infer unknown event timings and investigate phylogenetic proximity utilizing hierarchical clustering of both known and predicted event timings. The original generic mammalian model included nine eutherian mammals: Felis domestica (cat), Mustela putorius furo (ferret), Mesocricetus auratus (hamster), Macaca mulatta (monkey), Homo sapiens (humans), Mus musculus (mouse), Oryctolagus cuniculus (rabbit), Rattus norvegicus (rat), and Acomys cahirinus (spiny mouse). However, the data for this model is expected to grow as more data about developmental events is identified and incorporated into the analysis. Performance evaluation of the `ttime' package across a cluster computing environment versus a comparative analysis in a serial computing environment provides an important computational performance assessment. A theoretical analysis is the first stage of a process in which the second stage, if justified by the theoretical analysis, is to investigate an actual implementation of the `ttime' package in a cluster computing environment and to understand the parallelization process that underlies implementation.
Two worlds collide: Image analysis methods for quantifying structural variation in cluster molecular dynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.

2014-02-14

Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.
Two worlds collide: image analysis methods for quantifying structural variation in cluster molecular dynamics.

PubMed

Steenbergen, K G; Gaston, N

2014-02-14

Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.
Scientific Cluster Deployment and Recovery - Using puppet to simplify cluster management

NASA Astrophysics Data System (ADS)

Hendrix, Val; Benjamin, Doug; Yao, Yushu

2012-12-01

Deployment, maintenance and recovery of a scientific cluster, which has complex, specialized services, can be a time-consuming task requiring the assistance of Linux system administrators, network engineers as well as domain experts. Universities and small institutions that have a part-time FTE with limited time for and knowledge of the administration of such clusters can be strained by such maintenance tasks. This current work is the result of an effort to maintain a data analysis cluster (DAC) with minimal effort by a local system administrator. The realized benefit is that the scientist, who is the local system administrator, is able to focus on the data analysis instead of the intricacies of managing a cluster. Our work provides a cluster deployment and recovery process (CDRP) based on the puppet configuration engine allowing a part-time FTE to easily deploy and recover entire clusters with minimal effort. Puppet is a configuration management system (CMS) used widely in computing centers for the automatic management of resources. Domain experts use Puppet's declarative language to define reusable modules for service configuration and deployment. Our CDRP has three actors: domain experts, a cluster designer and a cluster manager. The domain experts first write the puppet modules for the cluster services. A cluster designer would then define a cluster. This includes the creation of cluster roles, mapping the services to those roles and determining the relationships between the services. Finally, a cluster manager would acquire the resources (machines, networking), enter the cluster input parameters (hostnames, IP addresses) and automatically generate deployment scripts used by puppet to configure each machine to act in its designated role. In the event of a machine failure, the originally generated deployment scripts along with puppet can be used to easily reconfigure a new machine.
The cluster definition produced in our CDRP is an integral part of automating cluster deployment in a cloud environment. Our future cloud efforts will further build on this work.
Aftershock identification problem via the nearest-neighbor analysis for marked point processes

NASA Astrophysics Data System (ADS)

Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.

2007-12-01

The centennial observations on the world seismicity have revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide the most reliable information about the earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on the hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in the space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time-oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from the ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.
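A rough sketch of how an asymmetric space-time-energy distance yields a time-oriented tree is given below. The abstract does not state the form of D, so the product form used here (inter-event time, times epicentral distance raised to a fractal dimension `d`, times a magnitude term `10**(-b*m)` for the earlier event) is an assumption based on how the Baiesi and Paczuski metric is usually written; the default values of `b` and `d`, the planar coordinates, and the function names are likewise illustrative.

```python
import math

def nn_distance(parent, child, b=1.0, d=1.6):
    """Asymmetric proximity from an earlier event to a later one.

    Events are (time, x, y, magnitude) tuples. The distance is infinite
    unless `parent` strictly precedes `child`, which is what makes it asymmetric.
    """
    dt = child[0] - parent[0]
    if dt <= 0:
        return math.inf
    r = math.hypot(child[1] - parent[1], child[2] - parent[2])
    return dt * r ** d * 10 ** (-b * parent[3])

def nearest_parents(events):
    """Index of each event's nearest earlier neighbor (None if it has none)."""
    parents = []
    for child in events:
        best, best_dist = None, math.inf
        for i, candidate in enumerate(events):
            dist = nn_distance(candidate, child)
            if dist < best_dist:
                best, best_dist = i, dist
        parents.append(best)
    return parents
```

Because every event links only to a strictly earlier one, the resulting graph has no cycles and its edges all point forward in time. In an aftershock identification setting, unusually short links would be read as clustered events and long links as background, with the threshold chosen from the data.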
Assessment of cluster yield components by image analysis.

PubMed

Diago, Maria P; Tardaguila, Javier; Aleixos, Nuria; Millan, Borja; Prats-Montalban, Jose M; Cubero, Sergio; Blasco, Jose

2015-04-01

Berry weight, berry number and cluster weight are key parameters for yield estimation for the wine and table grape industries. Current yield prediction methods are destructive, labour-demanding and time-consuming. In this work, a new methodology based on image analysis was developed to determine cluster yield components in a fast and inexpensive way. Clusters of seven different red varieties of grapevine (Vitis vinifera L.) were photographed under laboratory conditions and their cluster yield components manually determined after image acquisition. Two algorithms based on the Canny and the logarithmic image processing approaches were tested to find the contours of the berries in the images prior to berry detection performed by means of the Hough Transform. Results were obtained in two ways: by analysing either a single image of the cluster or using four images per cluster from different orientations. The best results (R² between 69% and 95% in berry detection and between 65% and 97% in cluster weight estimation) were achieved using four images and the Canny algorithm. The capability of the image-analysis-based model to predict berry weight was 84%. The new and low-cost methodology presented here enabled the assessment of cluster yield components, saving time and providing inexpensive information in comparison with current manual methods. © 2014 Society of Chemical Industry.
Towards Effective Clustering Techniques for the Analysis of Electric Power Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hogan, Emilie A.; Cotilla Sanchez, Jose E.; Halappanavar, Mahantesh

2013-11-30

Clustering is an important data analysis technique with numerous applications in the analysis of electric power grids. Standard clustering techniques are oblivious to the rich structural and dynamic information available for power grids. Therefore, by exploiting the inherent topological and electrical structure in the power grid data, we propose new methods for clustering with applications to model reduction, locational marginal pricing, phasor measurement unit (PMU or synchrophasor) placement, and power system protection. We focus our attention on model reduction for analysis based on time-series information from synchrophasor measurement devices, and spectral techniques for clustering. By comparing different clustering techniques on two instances of realistic power grids we show that the solutions are related and therefore one could leverage that relationship for a computational advantage. Thus, by contrasting different clustering techniques we make a case for exploiting structure inherent in the data with implications for several domains including power systems.
Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability

PubMed Central

Miller, Christopher B.; Bartlett, Delwyn J.; Mullins, Anna E.; Dodds, Kirsty L.; Gordon, Christopher J.; Kyle, Simon D.; Kim, Jong Won; D'Rozario, Angela L.; Lee, Rico S.C.; Comas, Maria; Marshall, Nathaniel S.; Yee, Brendon J.; Espie, Colin A.; Grunstein, Ronald R.

2016-01-01

Study Objectives: To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Methods: Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. Results: From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P < 0.05). Preliminary work suggested three clusters by retaining the I-NSD and splitting the I-SSD cluster into two: I-SSD A (n = 29): defined by high WASO and I-SSD B (n = 14): a second I-SSD cluster with high SOL and medium WASO. The I-SSD B cluster performed worse than I-SSD A and I-NSD for sustained attention (P ≤ 0.05). In an exploratory analysis, q-EEG revealed reduced spectral power also in I-SSD B before (Delta, Alpha, Beta-1) and after sleep-onset (Beta-2) compared to I-SSD A and I-NSD (P ≤ 0.05). Conclusions: Two insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Clinical Trial Registration: Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. 
Citation: Miller CB, Bartlett DJ, Mullins AE, Dodds KL, Gordon CJ, Kyle SD, Kim JW, D'Rozario AL, Lee RS, Comas M, Marshall NS, Yee BJ, Espie CA, Grunstein RR. Clusters of Insomnia Disorder: an exploratory cluster analysis of objective sleep parameters reveals differences in neurocognitive functioning, quantitative EEG, and heart rate variability. SLEEP 2016;39(11):1993–2004. PMID:27568796
Comparative study of two protocols for quantitative image-analysis of serotonin transporter clustering in lymphocytes, a putative biomarker of therapeutic efficacy in major depression.

PubMed

Romay-Tallon, Raquel; Rivera-Baltanas, Tania; Allen, Josh; Olivares, Jose M; Kalynchuk, Lisa E; Caruncho, Hector J

2017-01-01

The pattern of serotonin transporter clustering on the plasma membrane of lymphocytes extracted from human whole blood samples has been identified as a putative biomarker of therapeutic efficacy in major depression. Here we evaluated the possibility of performing a similar analysis using blood smears obtained from rats, and from control human subjects and depression patients. We hypothesized that we could optimize a protocol to make the analysis of serotonin protein clustering in blood smears comparable to the analysis of serotonin protein clustering using isolated lymphocytes. Our data indicate that blood smears require a longer fixation time and longer times of incubation with primary and secondary antibodies. In addition, one needs to optimize the image analysis settings for the analysis of smears. When these steps are followed, the quantitative analysis of both the number and size of serotonin transporter clusters on the plasma membrane of lymphocytes is similar using both blood smears and isolated lymphocytes. The development of this novel protocol will greatly facilitate the collection of appropriate samples by eliminating the necessity and cost of specialized personnel for drawing blood samples, and by being a less invasive procedure. Therefore, this protocol will help us advance the validation of membrane protein clustering in lymphocytes as a biomarker of therapeutic efficacy in major depression, and bring it closer to its clinical application.
A cluster merging method for time series microarray with production values.

PubMed

Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

2014-09-01

A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
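The merging idea, counting how often two genes are grouped together across the per-replicate clusterings, can be sketched as below. The abstract does not give the actual merging rule, so the agreement threshold and the union-find grouping are assumptions chosen for illustration, as are the function names and the input layout (one dictionary per replicate mapping gene to cluster label).

```python
from itertools import combinations

def co_association(clusterings):
    """Fraction of clusterings in which each gene pair shares a cluster."""
    counts = {}
    for labels in clusterings:  # labels: dict gene -> cluster label
        for g, h in combinations(sorted(labels), 2):
            if labels[g] == labels[h]:
                counts[(g, h)] = counts.get((g, h), 0) + 1
    return {pair: n / len(clusterings) for pair, n in counts.items()}

def merge_by_agreement(clusterings, threshold):
    """Group genes linked by pairs co-clustered at least `threshold` of the time."""
    freq = co_association(clusterings)
    genes = sorted(set().union(*clusterings))
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), f in freq.items():
        if f >= threshold:
            parent[find(g)] = find(h)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())
```

The co-clustering frequencies also give a natural ranking of the merged groups by level of agreement among replicates, in the spirit of the ranking the abstract describes.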
Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis.

PubMed

Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T

2010-01-01

Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
Space-Time Cluster Analysis to Detect Innovative Clinical Practices: A Case Study of Aripiprazole in the Department of Veterans Affairs.

PubMed

Penfold, Robert B; Burgess, James F; Lee, Austin F; Li, Mingfei; Miller, Christopher J; Nealon Seibert, Marjorie; Semla, Todd P; Mohr, David C; Kazis, Lewis E; Bauer, Mark S

2018-02-01

To identify space-time clusters of changes in prescribing aripiprazole for bipolar disorder among providers in the VA. VA administrative data from 2002 to 2010 were used to identify prescriptions of aripiprazole for bipolar disorder. Prescriber characteristics were obtained using the Personnel and Accounting Integrated Database. We conducted a retrospective space-time cluster analysis using the space-time permutation statistic. All VA service users with a diagnosis of bipolar disorder were included in the patient population. Individuals with any schizophrenia spectrum diagnoses were excluded. We also identified all clinicians who wrote a prescription for any bipolar disorder medication. The study population included 32,630 prescribers. Of these, 8,643 wrote qualifying prescriptions. We identified three clusters of aripiprazole prescribing centered in Massachusetts, Ohio, and the Pacific Northwest. Clusters were associated with prescribing by VA-employed (vs. contracted) prescribers. Nurses with prescribing privileges were more likely to make a prescription for aripiprazole in cluster locations compared with psychiatrists. Primary care physicians were less likely. Early prescribing of aripiprazole for bipolar disorder clustered geographically and was associated with prescriber subgroups. These methods support prospective surveillance of practice changes and identification of associated health system characteristics. © Health Research and Educational Trust.
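The space-time permutation statistic needs only case counts, because the expected count for each location and time period is derived from the margins of the observed data. A minimal sketch of that expectation follows; the function name and the nested-list layout (`cases[z][d]` for zone `z` and period `d`) are assumptions, and the scan over candidate cylinders and the Monte Carlo permutation test that follow in a full analysis are omitted.

```python
def expected_counts(cases):
    """Expected cases per zone and period under space-time independence.

    cases[z][d] is the observed count in zone z during period d.
    mu[z][d] = (total in zone z) * (total in period d) / (grand total).
    """
    total = sum(sum(row) for row in cases)
    zone_totals = [sum(row) for row in cases]
    period_totals = [sum(col) for col in zip(*cases)]
    return [[zt * pt / total for pt in period_totals] for zt in zone_totals]

# Hypothetical counts: two zones, two periods.
mu = expected_counts([[2, 0], [1, 1]])
```

A zone and period whose observed count clearly exceeds its expected value is a candidate cluster; in the study above, counts would be prescriptions by location and time rather than disease cases.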
Detecting space-time disease clusters with arbitrary shapes and sizes using a co-clustering approach.

PubMed

Ullah, Sami; Daud, Hanita; Dass, Sarat C; Khan, Habib Nawaz; Khalil, Alamgir

2017-11-06

The ability to detect potential space-time clusters in spatio-temporal data on disease occurrences is necessary for conducting surveillance and implementing disease prevention policies. Most existing techniques use geometrically shaped (circular, elliptical or square) scanning windows to discover disease clusters. In certain situations, where the disease occurrences tend to cluster in very irregularly shaped areas, these algorithms are not feasible in practice for the detection of space-time clusters. To address this problem, a new algorithm is proposed, which uses a co-clustering strategy to detect prospective and retrospective space-time disease clusters with no restriction on shape and size. The proposed method detects space-time disease clusters by tracking the changes in space-time occurrence structure instead of an in-depth search over space. This method was utilised to detect potential clusters in the annual and monthly malaria data in Khyber Pakhtunkhwa Province, Pakistan from 2012 to 2016, with the results visualised on a heat map. The results of the annual data analysis showed that the most likely hotspot emerged in three sub-regions in the years 2013-2014. The most likely hotspots in the monthly data appeared in the months of July to October each year and showed a strong periodic trend.
Exploring the individual patterns of spiritual well-being in people newly diagnosed with advanced cancer: a cluster analysis.

PubMed

Bai, Mei; Dixon, Jane; Williams, Anna-Leila; Jeon, Sangchoon; Lazenby, Mark; McCorkle, Ruth

2016-11-01

Research shows that spiritual well-being correlates positively with quality of life (QOL) for people with cancer, whereas contradictory findings are frequently reported with respect to the differentiated associations between dimensions of spiritual well-being, namely peace, meaning and faith, and QOL. This study aimed to examine individual patterns of spiritual well-being among patients newly diagnosed with advanced cancer. Cluster analysis was based on the twelve items of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale at Time 1. A combination of hierarchical and k-means (non-hierarchical) clustering methods was employed to jointly determine the number of clusters. Self-rated health, depressive symptoms, peace, meaning and faith, and overall QOL were compared at Time 1 and Time 2. Hierarchical and k-means clustering methods both suggested four clusters. Comparison of the four clusters supported statistically significant and clinically meaningful differences in QOL outcomes among clusters while revealing contrasting relations of faith with QOL. Cluster 1, Cluster 3, and Cluster 4 represented high, medium, and low levels of overall QOL, respectively, with correspondingly high, medium, and low levels of peace, meaning, and faith. Cluster 2 was distinguished from other clusters by its medium levels of overall QOL, peace, and meaning and low level of faith. This study provides empirical support for individual difference in response to a newly diagnosed cancer and brings into focus conceptual and methodological challenges associated with the measure of spiritual well-being, which may partly contribute to the attenuated relation between faith and QOL.

Monitoring of changes in cluster structures in water under AC magnetic field

NASA Astrophysics Data System (ADS)

Usanov, A. D.; Ulyanov, S. S.; Ilyukhina, N. S.; Usanov, D. A.

2016-01-01

A fundamental possibility of visualizing cluster structures formed in distilled water by an optical method based on the analysis of dynamic speckle structures is demonstrated. It is shown for the first time that, in contrast to the existing concepts, water clusters can be rather large (up to 200 μm in size), and their lifetime is several tens of seconds. These clusters are found to have an internal spatially inhomogeneous structure, constantly changing in time. The properties of magnetized and non-magnetized water are found to differ significantly. In particular, the number of clusters formed in magnetized water is several times larger than that formed in the same volume of non-magnetized water.
A Systematic Approach for Determining Vertical Pile Depth of Embedment in Cohensionless Soils to Withstand Lateral Barge Train Impact Loads

DTIC Science & Technology

2017-01-30

dynamic structural time-history response analysis of flexible approach walls founded on clustered pile groups using Impact_Deck. In Preparation, ERDC...research (Ebeling et al. 2012) has developed simplified analysis procedures for flexible approach wall systems founded on clustered groups of vertical...history response analysis of flexible approach walls founded on clustered pile groups using Impact_Deck. In Preparation, ERDC/ITL TR-16-X. Vicksburg, MS
Microforms in gravel bed rivers: Formation, disintegration, and effects on bedload transport

USGS Publications Warehouse

Strom, K.; Papanicolaou, A.N.; Evangelopoulos, N.; Odeh, M.

2004-01-01

This research aims to advance current knowledge on cluster formation and evolution by tackling some of the aspects associated with cluster microtopography and the effects of clusters on bedload transport. The specific objectives of the study are (1) to identify the bed shear stress range in which clusters form and disintegrate, (2) to quantitatively describe the spacing characteristics and orientation of clusters with respect to flow characteristics, (3) to quantify the effects clusters have on the mean bedload rate, and (4) to assess the effects of clusters on the pulsating nature of bedload. In order to meet the objectives of this study, two main experimental scenarios, namely, Test Series A and B (20 experiments overall) are considered in a laboratory flume under well-controlled conditions. Series A tests are performed to address objectives (1) and (2) while Series B is designed to meet objectives (3) and (4). Results show that cluster microforms develop in uniform sediment at 1.25 to 2 times the Shields parameter of an individual particle and start disintegrating at about 2.25 times the Shields parameter. It is found that during an unsteady flow event, effects of clusters on bedload transport rate can be classified in three different phases: a sink phase where clusters absorb incoming sediment, a neutral phase where clusters do not affect bedload, and a source phase where clusters release particles. Clusters also increase the magnitude of the fluctuations in bedload transport rate, showing that clusters amplify the unsteady nature of bedload transport. A fourth-order autoregressive, autoregressive integrated moving average model is employed to describe the time series of bedload and provide a formula for predicting bedload at different periods. Finally, a change-point analysis enhanced with a binary segmentation procedure is performed to identify the abrupt changes in the statistical characteristics of the bedload due to the effects of clusters and to detect the different phases in the bedload time series using probability theory. The analysis verifies the experimental findings that three phases are detected in the bedload rate time series structure, namely, sink, neutral, and source. © ASCE / June 2004.
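The change-point step described in this abstract can be pictured with a minimal binary segmentation sketch. It is not the authors' procedure: it assumes abrupt shifts in the mean only, uses a least-squares cost in place of a probabilistic test, and the stopping threshold `min_gain` and the toy series are hypothetical.

```python
def sse(xs):
    """Sum of squared deviations from the mean."""
    if not xs:
        return 0.0
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_split(xs):
    """Return (index, gain) of the split that most reduces the SSE."""
    base = sse(xs)
    best_i, best_gain = None, 0.0
    for i in range(1, len(xs)):
        gain = base - sse(xs[:i]) - sse(xs[i:])
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain

def binary_segmentation(xs, min_gain, offset=0):
    """Recursively split the series while the SSE reduction is at least min_gain."""
    i, gain = best_split(xs)
    if i is None or gain < min_gain:
        return []
    left = binary_segmentation(xs[:i], min_gain, offset)
    right = binary_segmentation(xs[i:], min_gain, offset + i)
    return left + [offset + i] + right

# Hypothetical bedload-rate series with three levels (e.g. sink, neutral, source)
series = [1.0, 1.2, 0.9, 1.1, 3.0, 3.1, 2.9, 3.2, 5.0, 5.1, 4.9, 5.2]
change_points = binary_segmentation(series, min_gain=1.0)
```

Each returned index is the first sample of a new segment, so two change points split the toy series into three phases.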
Wavelet-based clustering of resting state MRI data in the rat.

PubMed

Medda, Alessio; Hoffmann, Lukas; Magnuson, Matthew; Thompson, Garth; Pan, Wen-Ju; Keilholz, Shella

2016-01-01

While functional connectivity has typically been calculated over the entire length of the scan (5-10 min), interest has been growing in dynamic analysis methods that can detect changes in connectivity on the order of cognitive processes (seconds). Previous work with sliding window correlation has shown that changes in functional connectivity can be observed on these time scales in the awake human and in anesthetized animals. This exciting advance creates a need for improved approaches to characterize dynamic functional networks in the brain. Previous studies were performed using sliding window analysis on regions of interest defined based on anatomy or obtained from traditional steady-state analysis methods. The parcellation of the brain may therefore be suboptimal, and the characteristics of the time-varying connectivity between regions are dependent upon the length of the sliding window chosen. This manuscript describes an algorithm based on wavelet decomposition that allows data-driven clustering of voxels into functional regions based on temporal and spectral properties. Previous work has shown that different networks have characteristic frequency fingerprints, and the use of wavelets ensures that both the frequency and the timing of the BOLD fluctuations are considered during the clustering process. The method was applied to resting state data acquired from anesthetized rats, and the resulting clusters agreed well with known anatomical areas. Clusters were highly reproducible across subjects. Wavelet cross-correlation values between clusters from a single scan were significantly higher than the values from randomly matched clusters that shared no temporal information, indicating that wavelet-based analysis is sensitive to the relationship between areas. Copyright © 2015 Elsevier Inc. All rights reserved.
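The sliding window correlation that this abstract cites as prior work can be sketched as follows. This illustrates only the baseline technique, not the wavelet-based algorithm the paper proposes. The signals, window length, and step are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def sliding_window_correlation(x, y, window, step=1):
    """Correlation of x and y inside each window of the given length."""
    return [pearson(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, step)]

# Hypothetical signals: in phase for the first half, in anti-phase afterwards
t = range(40)
a = [math.sin(0.5 * k) for k in t]
b = [math.sin(0.5 * k) if k < 20 else -math.sin(0.5 * k) for k in t]
r = sliding_window_correlation(a, b, window=10, step=10)
```

A single correlation over all 40 samples would hide the sign flip that the four windowed values reveal. A window that straddled sample 20 would blur the flip, which is the dependence on window length that the abstract points out.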
A time-series approach for clustering farms based on slaughterhouse health aberration data.

PubMed

Hulsegge, B; de Greef, K H

2018-05-01

A large amount of data is collected routinely in meat inspection in pig slaughterhouses. A time series clustering approach is presented and applied that groups farms based on similar statistical characteristics of meat inspection data over time. A three-step characteristic-based clustering approach was used, based on the idea that the data contain more information than the incidence figures alone. A stratified subset containing 511,645 pigs was derived as a study set from 3.5 years of meat inspection data. The monthly averages of incidence of pleuritis and of pneumonia of 44 Dutch farms (delivering 5149 batches to 2 pig slaughterhouses) were subjected to 1) derivation of farm-level data characteristics, 2) factor analysis, and 3) clustering into groups of farms. The characteristic-based clustering was able to cluster farms for both lung aberrations. Three groups of data characteristics were informative, describing incidence, time pattern and degree of autocorrelation. The consistency of clustering similar farms was confirmed by repetition of the analysis in a larger dataset. The robustness of the clustering was tested on a substantially extended dataset. This confirmed the earlier results: three data distribution aspects make up the majority of the distinction between groups of farms, and within these groups (clusters) the majority of the farms were allocated comparably to the earlier allocation (75% and 62% for pleuritis and pneumonia, respectively). The difference between pleuritis and pneumonia in their seasonal dependency was confirmed, supporting the biological relevance of the clustering. Comparison of the identified clusters of statistically comparable farms can be used to detect farm-level risk factors causing the health aberrations beyond comparison on disease incidence and trend alone. Copyright © 2018 Elsevier B.V. All rights reserved.
Clustering Financial Time Series by Network Community Analysis

NASA Astrophysics Data System (ADS)

Piccardi, Carlo; Calatroni, Lisa; Bertoni, Fabio

In this paper, we describe a method for clustering financial time series which is based on community analysis, a recently developed approach for partitioning the nodes of a network (graph). A network with N nodes is associated with the set of N time series. The weight of the link (i, j), which quantifies the similarity between the two corresponding time series, is defined according to a metric based on symbolic time series analysis, which has recently proved effective in the context of financial time series. Then, searching for network communities allows one to identify groups of nodes (and then time series) with strong similarity. A quantitative assessment of the significance of the obtained partition is also provided. The method is applied to two distinct case-studies concerning the US and Italy Stock Exchange, respectively. In the US case, the stability of the partitions over time is also thoroughly investigated. The results favorably compare with those obtained with the standard tools typically used for clustering financial time series, such as the minimal spanning tree and the hierarchical tree.
Symptom clusters and treatment time delay in Korean patients with ST-elevation myocardial infarction on admission.

PubMed

Kim, Hee-Sook; Eun, Sang Jun; Hwang, Jin Yong; Lee, Kun-Sei; Cho, Sung-Il

2018-05-01

Most patients with acute myocardial infarction (AMI) experience more than one symptom at onset. Although symptoms are an important early indicator, patients and physicians may have difficulty interpreting symptoms and detecting AMI at an early stage. This study aimed to identify symptom clusters among Korean patients with ST-elevation myocardial infarction (STEMI), to examine the relationship between symptom clusters and patient-related variables, and to investigate the influence of symptom clusters on treatment time delay (decision time [DT], onset-to-balloon time [OTB]). This was a prospective multicenter study with a descriptive design that used face-to-face interviews. A total of 342 patients with STEMI were included in this study. To identify symptom clusters, two-step cluster analysis was performed using SPSS software. Multinomial logistic regression to explore factors related to each cluster and multiple logistic regression to determine the effect of symptom clusters on treatment time delay were conducted. Three symptom clusters were identified: cluster 1 (classic MI; characterized by chest pain); cluster 2 (stress symptoms; sweating and chest pain); and cluster 3 (multiple symptoms; dizziness, sweating, chest pain, weakness, and dyspnea). Compared with patients in clusters 2 and 3, those in cluster 1 were more likely to have diabetes or prior MI. Patients in clusters 2 and 3, who predominantly showed other symptoms in addition to chest pain, had a significantly shorter DT and OTB than those in cluster 1. In conclusion, to decrease treatment time delay, it seems important that patients and clinicians recognize symptom clusters, rather than relying on chest pain alone. Further research is necessary to translate our findings into clinical practice and to improve patient education and public education campaigns.
Clustering of Health Behaviors and Cardiorespiratory Fitness Among U.S. Adolescents.

PubMed

Hartz, Jacob; Yingling, Leah; Ayers, Colby; Adu-Brimpong, Joel; Rivers, Joshua; Ahuja, Chaarushi; Powell-Wiley, Tiffany M

2018-05-01

Decreased cardiorespiratory fitness (CRF) is associated with an increased risk of cardiovascular disease. However, little is known about how the interaction of diet, physical activity (PA), and sedentary time (ST) affects CRF among adolescents. By using a nationally representative sample of U.S. adolescents, we used cluster analysis to investigate the interactions of these behaviors with CRF. We hypothesized that distinct clustering patterns exist and that less healthy clusters are associated with lower CRF. We used 2003-2004 National Health and Nutrition Examination Survey data for persons aged 12-19 years (N = 1,225). PA and ST were measured objectively by an accelerometer, and the American Heart Association Healthy Diet Score quantified diet quality. Maximal oxygen consumption (V̇O2max) was measured by submaximal treadmill exercise test. We performed cluster analysis to identify sex-specific clustering of diet, PA, and ST. Adjusting for accelerometer wear time, age, body mass index, race/ethnicity, and the poverty-to-income ratio, we performed sex-stratified linear regression analysis to evaluate the association of cluster with V̇O2max. Three clusters were identified for girls and boys. For girls, there was no difference across clusters for age (p = .1), weight (p = .3), and BMI (p = .5), and no relationship between clusters and V̇O2max. For boys, the youngest cluster (p < .01) had three healthy behaviors, weighed less, and was associated with a higher V̇O2max compared with the two older clusters. We observed clustering of diet, PA, and ST in U.S. adolescents. Specific patterns were associated with lower V̇O2max for boys, suggesting that our clusters may help identify adolescent boys most in need of interventions. Published by Elsevier Inc.
An improved clustering algorithm based on reverse learning in intelligent transportation

NASA Astrophysics Data System (ADS)

Qiu, Guoqing; Kou, Qianqian; Niu, Ting

2017-05-01

With the development of artificial intelligence and data mining technology, big data has gradually entered people's field of vision. In the process of dealing with large data, clustering is an important processing method. The reverse learning method is introduced into the clustering process of the PAM clustering algorithm to overcome the limitations of one-time clustering in unsupervised clustering learning and to increase the diversity of clusters, thereby improving the quality of clustering. The algorithm analysis and experimental results show that the algorithm is feasible.
Detection of Functional Change Using Cluster Trend Analysis in Glaucoma.

PubMed

Gardiner, Stuart K; Mansberger, Steven L; Demirel, Shaban

2017-05-01

Global analyses using mean deviation (MD) assess visual field progression, but can miss localized changes. Pointwise analyses are more sensitive to localized progression, but more variable so require confirmation. This study assessed whether cluster trend analysis, averaging information across subsets of locations, could improve progression detection. A total of 133 test-retest eyes were tested 7 to 10 times. Rates of change and P values were calculated for possible re-orderings of these series to generate global analysis ("MD worsening faster than x dB/y with P < y"), pointwise and cluster analyses ("n locations [or clusters] worsening faster than x dB/y with P < y") with specificity exactly 95%. These criteria were applied to 505 eyes tested over a mean of 10.5 years, to find how soon each detected "deterioration," and compared using survival models. This was repeated including two subsequent visual fields to determine whether "deterioration" was confirmed. The best global criterion detected deterioration in 25% of eyes in 5.0 years (95% confidence interval [CI], 4.7-5.3 years), compared with 4.8 years (95% CI, 4.2-5.1) for the best cluster analysis criterion, and 4.1 years (95% CI, 4.0-4.5) for the best pointwise criterion. However, for pointwise analysis, only 38% of these changes were confirmed, compared with 61% for clusters and 76% for MD. The time until 25% of eyes showed subsequently confirmed deterioration was 6.3 years (95% CI, 6.0-7.2) for global, 6.3 years (95% CI, 6.0-7.0) for pointwise, and 6.0 years (95% CI, 5.3-6.6) for cluster analyses. Although the specificity is still suboptimal, cluster trend analysis detects subsequently confirmed deterioration sooner than either global or pointwise analyses.
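The idea of averaging information across subsets of locations before fitting a trend can be sketched as below. This is only a schematic under stated assumptions: the grouping of locations into clusters, the four-location "fields", and the function names are hypothetical, and the P-value and specificity calibration described in the abstract are omitted.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    den = sum((t - mt) ** 2 for t in times)
    return num / den

def cluster_slopes(times, fields, clusters):
    """Rate of change per cluster of test locations.

    fields: one list of per-location values (dB) per visit.
    clusters: dict name -> list of location indices (hypothetical grouping).
    """
    slopes = {}
    for name, idx in clusters.items():
        # Average the locations in the cluster at each visit, then fit a trend
        means = [sum(f[i] for i in idx) / len(idx) for f in fields]
        slopes[name] = ols_slope(times, means)
    return slopes

years = [0.0, 1.0, 2.0, 3.0]
fields = [
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, -1.0, 0.0, 0.0],
    [-2.0, -2.0, 0.0, 0.0],
    [-3.0, -3.0, 0.0, 0.0],
]
slopes = cluster_slopes(years, fields, {"upper": [0, 1], "lower": [2, 3]})
```

In this toy case the "upper" cluster worsens by 1 dB/y while the "lower" cluster is stable. A global mean over all four locations would report only half that rate, which is how localized change can be diluted in a global index.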
Interactive visual exploration and refinement of cluster assignments.

PubMed

Kern, Michael; Lex, Alexander; Gehlenborg, Nils; Johnson, Chris R

2017-09-12

With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.
The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

NASA Astrophysics Data System (ADS)

Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

2018-04-01

The gas cluster ion beam (GCIB) technique was used to smooth the surface of silicon carbide crystals. The effect of processing by two inert cluster ions, argon and xenon, was quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters have not yet been reported. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and surface crystal layer quality. The gas cluster ion beam processing results in surface relief smoothing down to an average roughness of about 1 nm for both elements. It was shown that xenon as the working gas is more effective: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters, respectively.
The contribution of psychological factors to recovery after mild traumatic brain injury: is cluster analysis a useful approach?

PubMed

Snell, Deborah L; Surgenor, Lois J; Hay-Smith, E Jean C; Williman, Jonathan; Siegert, Richard J

2015-01-01

Outcomes after mild traumatic brain injury (MTBI) vary, with slow or incomplete recovery for a significant minority. This study examines whether groups of cases with shared psychological factors but with different injury outcomes could be identified using cluster analysis. This is a prospective observational study following 147 adults presenting to a hospital-based emergency department or concussion services in Christchurch, New Zealand. This study examined associations between baseline demographic, clinical, psychological variables (distress, injury beliefs and symptom burden) and outcome 6 months later. A two-step approach to cluster analysis was applied (Ward's method to identify clusters, K-means to refine results). Three meaningful clusters emerged (high-adapters, medium-adapters, low-adapters). Baseline cluster-group membership was significantly associated with outcomes over time. High-adapters appeared recovered by 6-weeks and medium-adapters revealed improvements by 6-months. The low-adapters continued to endorse many symptoms, negative recovery expectations and distress, being significantly at risk for poor outcome more than 6-months after injury (OR (good outcome) = 0.12; CI = 0.03-0.53; p < 0.01). Cluster analysis supported the notion that groups could be identified early post-injury based on psychological factors, with group membership associated with differing outcomes over time. Implications for clinical care providers regarding therapy targets and cases that may benefit from different intensities of intervention are discussed.
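The two-step approach mentioned in this abstract (Ward's method to identify clusters, K-means to refine them) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's analysis: the data points are hypothetical standardized scores, and the number of clusters is fixed at three rather than chosen from the hierarchy.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward(points, k):
    """Agglomerative clustering with Ward's merge cost; returns k groups."""
    groups = [[p] for p in points]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ni, nj = len(groups[i]), len(groups[j])
                # Increase in within-cluster sum of squares if i and j merge
                cost = ni * nj / (ni + nj) * sqdist(centroid(groups[i]),
                                                    centroid(groups[j]))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm started from the given centers."""
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda c: sqdist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical standardized scores (two variables per person)
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (2.0, 2.1), (2.2, 1.9), (1.9, 2.0),
        (4.0, 0.1), (4.1, 0.0), (3.9, 0.2)]
initial = [centroid(g) for g in ward(data, 3)]
labels, centers = kmeans(data, initial)
```

Seeding K-means with the Ward centroids removes its dependence on random starting points, which is one common motivation for combining the two methods.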
Development of small scale cluster computer for numerical analysis

NASA Astrophysics Data System (ADS)

Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

2017-09-01

In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs the Ubuntu 14.04 Linux environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test, which used a simple MPI Hello program written in C, was done to make sure that the computers are able to pass the required information without any problem. Additionally, a performance test was done to show that the calculation performance of the cluster is much better than that of a single-CPU computer. In this performance test, the same code was run in four configurations: a single node, 2 processors, 4 processors, and 8 processors. The results show that, with additional processors, the time required to solve the problem decreases. The time required for the calculation is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer using common hardware that is capable of higher computing power than a single-CPU processor, which can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
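The scaling behaviour reported in this abstract corresponds to ideal linear speedup, which the short calculation below makes explicit. The wall-clock times are hypothetical, since the abstract gives no figures, and the function names are illustrative.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 means perfect linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs

# Hypothetical wall-clock times in seconds (not reported in the abstract)
timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}
report = {p: (speedup(timings[1], t), efficiency(timings[1], t, p))
          for p, t in timings.items()}
```

Halving the run time with every doubling of processors keeps the efficiency at 1.0. Communication overhead between the two machines would normally show up as an efficiency below 1.0 at 8 processors.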
DOE Office of Scientific and Technical Information (OSTI.GOV)

Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward

There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases require the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend its time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.
Multifractal Approach to Time Clustering of Earthquakes. Application to Mt. Vesuvio Seismicity

NASA Astrophysics Data System (ADS)

Codano, C.; Alonzo, M. L.; Vilardo, G.

The clustering structure of Vesuvian earthquakes is investigated by means of statistical tools: the inter-event time distribution, the running mean and the multifractal analysis. The first cannot clearly distinguish between a Poissonian process and a clustered one because of the difficulty of telling an exponential distribution from a power-law one. The running mean test reveals the clustering of the earthquakes, but loses information about the structure of the distribution at global scales. The multifractal approach can highlight the clustering at small scales, while the global behaviour remains Poissonian. Subsequently the clustering of the events is interpreted in terms of diffusive processes of the stress in the earth's crust.
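The inter-event time analysis mentioned in this abstract starts from the gaps between successive events. The sketch below computes those gaps and summarizes them with the coefficient of variation (CV), a summary statistic chosen here for illustration rather than taken from the paper: a Poisson process has exponentially distributed gaps with CV close to 1, while temporal clustering pushes the CV above 1. The event catalogs are hypothetical.

```python
import math

def inter_event_times(event_times):
    """Gaps between consecutive events, after sorting the catalog in time."""
    ts = sorted(event_times)
    return [b - a for a, b in zip(ts, ts[1:])]

def coefficient_of_variation(xs):
    """Population standard deviation divided by the mean."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return math.sqrt(var) / m

# Hypothetical catalogs (event times in days)
regular = [0, 10, 20, 30, 40, 50]
bursty = [0, 1, 2, 3, 50, 51, 52, 53, 100]
cv_regular = coefficient_of_variation(inter_event_times(regular))
cv_bursty = coefficient_of_variation(inter_event_times(bursty))
```

A single number like the CV cannot separate an exponential from a power-law distribution of gaps on a short catalog, which is consistent with the limitation the abstract reports for the inter-event time distribution.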
Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

PubMed

Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

2014-11-01

Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Detecting synchronization clusters in multivariate time series via coarse-graining of Markov chains.

PubMed

Allefeld, Carsten; Bialonski, Stephan

2007-12-01

Synchronization cluster analysis is an approach to the detection of underlying structures in data sets of multivariate time series, starting from a matrix R of bivariate synchronization indices. A previous method utilized the eigenvectors of R for cluster identification, analogous to several recent attempts at group identification using eigenvectors of the correlation matrix. All of these approaches assumed a one-to-one correspondence of dominant eigenvectors and clusters, which has however been shown to be wrong in important cases. We clarify the usefulness of eigenvalue decomposition for synchronization cluster analysis by translating the problem into the language of stochastic processes, and derive an enhanced clustering method harnessing recent insights from the coarse-graining of finite-state Markov processes. We illustrate the operation of our method using a simulated system of coupled Lorenz oscillators, and we demonstrate its superior performance over the previous approach. Finally we investigate the question of robustness of the algorithm against small sample size, which is important with regard to field applications.
Associations of physical activity and sedentary time with weight and weight status among 10- to 12-year-old boys and girls in Europe: a cluster analysis within the ENERGY project.

PubMed

De Bourdeaudhuij, I; Verloigne, M; Maes, L; Van Lippevelde, W; Chinapaw, M J M; Te Velde, S J; Manios, Y; Androutsos, O; Kovacs, E; Dössegger, A; Brug, J

2013-10-01

Moderate-to-vigorous physical activity (MVPA) plays an important role in childhood overweight prevention. Sedentary time appears to be independently associated with overweight, but most research has been done in adults. The objectives of this study were to identify subgroups of children based on their MVPA and sedentary time, and explore differences in body mass index (BMI), waist circumference and overweight prevalence among these subgroups. A sample of 766 10- to 12-year-old children (52.9% girls, 11.6 ± 0.8 years) were recruited from Hungary (n = 158), Belgium (n = 111), the Netherlands (n = 113), Greece (n = 169) and Switzerland (n = 215). Children wore an accelerometer to measure MVPA and sedentary time. Cluster analysis revealed four clusters in both gender groups showing an unhealthy pattern (low MVPA/high sedentary time), a healthy pattern (high MVPA/low sedentary time), a low mixed pattern (low MVPA/low sedentary time) and a moderate to high mixed pattern (moderate to high MVPA/moderate sedentary time). In girls, the high MVPA/low sedentary time cluster had a significantly lower BMI (P ≤ 0.05), a lower waist circumference (P ≤ 0.01) and the lowest percentage of overweight (P ≤ 0.10) compared with the other three clusters. In boys, both clusters with higher activity levels had a significantly lower BMI (P ≤ 0.001) and waist circumference (P ≤ 0.001) than the two low activity clusters, independent of sedentary time. Engagement in more MVPA and less sedentary time is associated with a more favourable weight status among 10- to 12-year-old girls. Among boys, MVPA seems most important for weight status, while sedentary time appears to be less relevant. © 2012 The Authors. Pediatric Obesity © 2012 International Association for the Study of Obesity.
Statistical detection of geographic clusters of resistant Escherichia coli in a regional network with WHONET and SaTScan.

PubMed

Park, Rachel; O'Brien, Thomas F; Huang, Susan S; Baker, Meghan A; Yokoe, Deborah S; Kulldorff, Martin; Barrett, Craig; Swift, Jamie; Stelling, John

2016-11-01

While antimicrobial resistance threatens the prevention, treatment, and control of infectious diseases, systematic analysis of routine microbiology laboratory test results worldwide can alert to new threats and promote timely response. This study explores statistical algorithms for recognizing geographic clustering of multi-resistant microbes within a healthcare network and monitoring the dissemination of new strains over time. Escherichia coli antimicrobial susceptibility data from a three-year period stored in WHONET were analyzed across ten facilities in a healthcare network utilizing SaTScan's spatial multinomial model with two models for defining geographic proximity. We explored geographic clustering of multi-resistance phenotypes within the network and changes in clustering over time. Geographic clustering results from the latitude/longitude and non-parametric facility-grouping geographic models were similar, while the latter offers greater flexibility and generalizability. Iterative application of the clustering algorithms suggested the possible recognition of the initial appearance of invasive E. coli ST131 in the clinical database of a single hospital and subsequent dissemination to others. Systematic analysis of routine antimicrobial resistance susceptibility test results supports the recognition of geographic clustering of microbial phenotypic subpopulations with WHONET and SaTScan, and iterative application of these algorithms can detect the initial appearance in and dissemination across a region prompting early investigation, response, and containment measures.

X-ray illumination of globular cluster puzzles. [globular cluster X ray sources as clues to Milky Way Galaxy age and evolution]

NASA Technical Reports Server (NTRS)

Lightman, A. P.; Grindlay, J. E.

1982-01-01

Globular clusters are thought to be among the oldest objects in the Galaxy, and provide, in this connection, important clues for determining the age and process of formation of the Galaxy. The present investigation is concerned with puzzles relating to the X-ray emission of globular clusters, taking into account questions regarding the location of X-ray emitting clusters (XEGC) unusually near the galactic plane and/or galactic center. An adopted model is discussed for the nature, formation, and lifetime of X-ray sources in globular clusters. An analysis of the available data is conducted in connection with a search for correlations between binary formation time scales, central relaxation times, galactic locations, and X-ray emission. The positive correlation found between distance from galactic center and two-body binary formation time for globular clusters, explanations for this correlation, and the hypothesis that X-ray sources in globular clusters require binary star systems provide a possible explanation of the considered puzzles.
Time spent on health-related activities by senior Australians with chronic diseases: what is the role of multimorbidity and comorbidity?

PubMed

Islam, M Mofizul; McRae, Ian S; Yen, Laurann; Jowsey, Tanisha; Valderas, Jose M

2015-06-01

To examine the effect of various morbidity clusters of chronic diseases on health-related time use and to explore factors associated with heavy time burden (more than 30 hours/month) of health-related activities. Using a national survey, data were collected from 2,540 senior Australians. Natural clusters were identified using cluster analysis and clinical clusters using clinical expert opinion. We undertook a set of linear regressions to model people's time use, and logistic regressions to model heavy time burden. Time use increases with the number of chronic diseases. Six of the 12 diseases are significantly associated with higher time use, with the highest effect for diabetes followed by depression; 18% reported a heavy time burden, with diabetes again being the most significant disease. Clusters and dominant comorbid groupings do not contribute to predicting time use or time burden. Total number of diseases and specific diseases are useful determinants of time use and heavy time burden. Dominant groupings and disease clusters do not predict time use. In considering time demands on patients and the need for care co-ordination, care providers need to be aware of how many and what specific diseases the patient faces. © 2015 Public Health Association of Australia.
Spatio-Temporal Analysis of Smear-Positive Tuberculosis in the Sidama Zone, Southern Ethiopia

PubMed Central

Dangisso, Mesay Hailu; Datiko, Daniel Gemechu; Lindtjørn, Bernt

2015-01-01

Background: Tuberculosis (TB) is a disease of public health concern, with a varying distribution across settings depending on socio-economic status, HIV burden, availability and performance of the health system. Ethiopia is a country with a high burden of TB, with regional variations in TB case notification rates (CNRs). However, TB program reports are often compiled and reported at higher administrative units that do not show the burden at lower units, so there is limited information about the spatial distribution of the disease. We therefore aim to assess the spatial distribution and presence of the spatio-temporal clustering of the disease in different geographic settings over 10 years in the Sidama Zone in southern Ethiopia. Methods: Retrospective space-time and spatial analyses were carried out at the kebele level (the lowest administrative unit within a district) to identify spatial and space-time clusters of smear-positive pulmonary TB (PTB). Scan statistics, Global Moran’s I, and Getis-Ord (Gi*) statistics were all used to help analyze the spatial distribution and clusters of the disease across settings. Results: A total of 22,545 smear-positive PTB cases notified over 10 years were used for spatial analysis. In a purely spatial analysis, we identified the most likely cluster of smear-positive PTB in 192 kebeles in eight districts (RR = 2, p < 0.001), with 12,155 observed and 8,668 expected cases. The Gi* statistic also identified the clusters in the same areas, and the spatial clusters showed stability in most areas in each year during the study period. The space-time analysis also detected the most likely cluster in 193 kebeles in the same eight districts (RR = 1.92, p < 0.001), with 7,584 observed and 4,738 expected cases in 2003-2012. Conclusion: The study found variations in CNRs and significant spatio-temporal clusters of smear-positive PTB in the Sidama Zone. The findings can be used to guide TB control programs to devise effective TB control strategies for the geographic areas characterized by the highest CNRs. Further studies are required to understand the factors associated with clustering based on individual level locations and investigation of cases. PMID:26030162
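The observed and expected counts quoted in this abstract can be turned into a relative risk with a few lines of arithmetic. The sketch below assumes the usual scan-statistic definition (risk inside the window divided by risk outside it) and assumes that the 22,545 notified cases are the total used in both analyses; neither assumption is stated in the abstract, and the function name `relative_risk` is hypothetical. Under these assumptions both values come out a little under 2, in the same range as the reported figures, though they need not match exactly because the study's expected counts may have been adjusted in ways not described here.

```python
def relative_risk(observed_in, expected_in, total_cases):
    """Relative risk of a scan window: (risk inside) / (risk outside).

    Assumed definition, not taken from the abstract:
    RR = (c / E[c]) / ((C - c) / (C - E[c]))
    where c is the observed count inside the window, E[c] the expected
    count inside it, and C the total number of cases.
    """
    inside = observed_in / expected_in
    outside = (total_cases - observed_in) / (total_cases - expected_in)
    return inside / outside


# Counts quoted in the abstract; the total of 22,545 cases is an assumption.
spatial_rr = relative_risk(12155, 8668, 22545)
space_time_rr = relative_risk(7584, 4738, 22545)
print(round(spatial_rr, 2), round(space_time_rr, 2))
```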
Association between Clustering of Lifestyle Behaviors and Health-Related Physical Fitness in Youth: The UP&DOWN Study.

PubMed

Cabanas-Sánchez, Verónica; Martínez-Gómez, David; Izquierdo-Gómez, Rocío; Segura-Jiménez, Víctor; Castro-Piñero, José; Veiga, Oscar L

2018-05-23

To examine clustering of lifestyle behaviors in Spanish children and adolescents based on screen time, nonscreen sedentary time, moderate-to-vigorous physical activity, Mediterranean diet quality, and sleep time, and to analyze its association with health-related physical fitness. The sample consisted of 1197 children and adolescents (597 boys), aged 8-18 years, included in the baseline cohort of the UP&DOWN study. Moderate-to-vigorous physical activity was assessed by accelerometry. Screen time, nonscreen sedentary time, Mediterranean diet quality, and sleep time were self-reported by participants. Health-related physical fitness was measured following the Assessing Levels of Physical Activity battery for youth. A 2-stage cluster analysis was performed based on the 5 lifestyle behaviors. Associations of clusters with fatness and physical fitness were analyzed by 1-way ANCOVA. Five lifestyle clusters were identified: (1) active (n = 171), (2) sedentary nonscreen sedentary time-high diet quality (n = 250), (3) inactive-high sleep time (n = 249 [20.8%]), (4) sedentary nonscreen sedentary time-low diet quality (n = 273), and (5) sedentary screen time-low sleep time (n = 254). Cluster 1 was the healthiest profile in relation to health-related physical fitness in both boys and girls. In boys, cluster 3 had the worst fatness and fitness levels, whereas in girls the worst scores were found in clusters 4 and 5. Clustering of different lifestyle behaviors was identified and differences in health-related physical fitness were found among clusters, which suggests that special attention should be given to sedentary behaviors in girls and physical activity in boys when developing childhood health prevention strategies focusing on lifestyle patterns. Copyright © 2018 Elsevier Inc. All rights reserved.
A recurrence network approach for the analysis of skin blood flow dynamics in response to loading pressure.

PubMed

Liao, Fuyuan; Jan, Yih-Kuen

2012-06-01

This paper presents a recurrence network approach for the analysis of skin blood flow dynamics in response to loading pressure. Recurrence is a fundamental property of many dynamical systems, which can be explored in phase spaces constructed from observational time series. A visualization tool of recurrence analysis called the recurrence plot (RP) has proved highly effective for detecting transitions in the dynamics of the system. However, it was found that delay embedding can produce spurious structures in RPs. Network-based concepts have been applied for the analysis of nonlinear time series recently. We demonstrate that time series with different types of dynamics exhibit distinct global clustering coefficients and distributions of local clustering coefficients and that the global clustering coefficient is robust to the embedding parameters. We applied the approach to study the response of skin blood flow oscillations (BFO) to loading pressure. The results showed that global clustering coefficients of BFO significantly decreased in response to loading pressure (p < 0.01). Moreover, surrogate tests indicated that such a decrease was associated with a loss of nonlinearity of BFO. Our results suggest that the recurrence network approach can practically quantify the nonlinear dynamics of BFO.
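The recurrence network idea in this abstract can be illustrated with a minimal sketch: delay-embed a time series, link any two embedded points that lie within a distance threshold, and compute clustering coefficients on the resulting graph. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the embedding dimension and delay, the threshold `eps`, the synthetic sine-wave input, and the choice to define the global clustering coefficient as the mean of the local coefficients.

```python
import numpy as np


def delay_embed(x, dim=3, delay=1):
    """Stack delayed copies of x into points of a dim-dimensional phase space."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay: i * delay + n] for i in range(dim)])


def recurrence_adjacency(x, dim=3, delay=1, eps=0.2):
    """Adjacency matrix linking embedded points closer than eps (no self-loops)."""
    pts = delay_embed(x, dim, delay)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    adj = (dist <= eps).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj


def local_clustering(adj):
    """Local clustering coefficient of each node of an undirected graph."""
    degree = adj.sum(axis=1)
    triangles = np.diagonal(adj @ adj @ adj) / 2.0   # triangles through each node
    possible = degree * (degree - 1) / 2.0           # neighbour pairs per node
    return np.divide(triangles, possible,
                     out=np.zeros_like(triangles), where=possible > 0)


# Synthetic periodic signal standing in for a blood flow recording.
series = np.sin(np.linspace(0, 20 * np.pi, 400))
local_c = local_clustering(recurrence_adjacency(series))
global_c = local_c.mean()   # assumed definition of the global coefficient
print(global_c)
```

Comparing `global_c` between two recordings (for example before and during loading) would mirror the kind of contrast the abstract reports, although the parameter choices above would need to be justified for real data.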
A Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

NASA Astrophysics Data System (ADS)

Huang, W.; Li, S.; Xu, S.

2016-06-01

How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activities before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only are the location and time at which people stay collected, but what people "say" at a location at a given time can also be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms have been specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data posted in Toronto, Canada, is used to test the clustering-based method. The results show that approximately 55% of the spatiotemporal clusters distributed in different locations can eventually be grouped as the same type of cluster once the semantic aspect is considered.
The Technical and Biological Reproducibility of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Based Typing: Employment of Bioinformatics in a Multicenter Study.

PubMed

Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian

2016-01-01

The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Technical and biological reproducibility ranged between 96.8-99.4% and 47.6-94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable.
The adiposity of children is associated with their lifestyle behaviours: a cluster analysis of school-aged children from 12 nations.

PubMed

Dumuid, Dorothea; Olds, T; Lewis, L K; Martin-Fernández, J A; Barreira, T; Broyles, S; Chaput, J-P; Fogelholm, M; Hu, G; Kuriyan, R; Kurpad, A; Lambert, E V; Maia, J; Matsudo, V; Onywera, V O; Sarmiento, O L; Standage, M; Tremblay, M S; Tudor-Locke, C; Zhao, P; Katzmarzyk, P; Gillison, F; Maher, C

2018-02-01

The relationship between children's adiposity and lifestyle behaviour patterns is an area of growing interest. The objectives of this study are to identify clusters of children based on lifestyle behaviours and compare children's adiposity among clusters. Cross-sectional data from the International Study of Childhood Obesity, Lifestyle and the Environment were used. The participants were children (9-11 years) from 12 nations (n = 5710). 24-h accelerometry and self-reported diet and screen time were clustering input variables. Objectively measured adiposity indicators were waist-to-height ratio, percent body fat and body mass index z-scores. Sex-stratified analyses were performed on the global sample and repeated on a site-wise basis. Cluster analysis (using isometric log ratios for compositional data) was used to identify common lifestyle behaviour patterns. Site representation and adiposity were compared across clusters using linear models. Four clusters emerged: (1) Junk Food Screenies, (2) Actives, (3) Sitters and (4) All-Rounders. Countries were represented differently among clusters. Chinese children were over-represented in Sitters and Colombian children in Actives. Adiposity varied across clusters, being highest in Sitters and lowest in Actives. Children from different sites clustered into groups of similar lifestyle behaviours. Cluster membership was linked with differing adiposity. Findings support the implementation of activity interventions in all countries, targeting both physical activity and sedentary time. © 2016 World Obesity Federation.
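The abstract mentions isometric log ratios for compositional data. As a rough illustration of what that transform does to a 24-hour time-use composition before clustering, the sketch below computes isometric log-ratio coordinates using a pivot-coordinate basis. The choice of basis, the function name `ilr_pivot`, and the example day (hours of sleep, sedentary time, light activity and MVPA) are assumptions for illustration; the abstract does not say which basis or software the authors used.

```python
import numpy as np


def ilr_pivot(composition):
    """Isometric log-ratio coordinates of a D-part composition (pivot basis).

    Coordinate i contrasts part i with the geometric mean of the parts
    that follow it, so D parts map to D - 1 unconstrained real values.
    """
    x = np.asarray(composition, dtype=float)
    x = x / x.sum()                      # closure: only ratios carry information
    log_x = np.log(x)
    d = len(x)
    coords = np.empty(d - 1)
    for i in range(d - 1):
        remaining = d - i - 1            # number of parts after part i
        scale = np.sqrt(remaining / (remaining + 1))
        coords[i] = scale * (log_x[i] - log_x[i + 1:].mean())
    return coords


# Hypothetical day in hours: sleep, sedentary, light activity, MVPA.
day = [9.5, 8.0, 5.5, 1.0]
print(ilr_pivot(day))
```

The resulting coordinates can be passed to an ordinary clustering routine, which is the practical reason for transforming the data: parts of a day sum to 24 hours and therefore cannot vary independently in their raw form.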
Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

ScienceCinema

Li, Weizhong

2018-02-12

San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
A harmonic linear dynamical system for prominent ECG feature extraction.

PubMed

Thi, Ngoc Anh Nguyen; Yang, Hyung-Jeong; Kim, SunHee; Do, Luu Ngoc

2014-01-01

Unsupervised mining of electrocardiography (ECG) time series is a crucial task in biomedical applications. To have efficiency of the clustering results, the prominent features extracted from preprocessing analysis on multiple ECG time series need to be investigated. In this paper, a Harmonic Linear Dynamical System is applied to discover vital prominent features via mining the evolving hidden dynamics and correlations in ECG time series. The discovery of the comprehensible and interpretable features of the proposed feature extraction methodology effectively represents the accuracy and the reliability of clustering results. Particularly, the empirical evaluation results of the proposed method demonstrate the improved performance of clustering compared to the previous main stream feature extraction approaches for ECG time series clustering tasks. Furthermore, the experimental results on real-world datasets show scalability with linear computation time to the duration of the time series.
Cluster Analysis of Velocity Field Derived from Dense GNSS Network of Japan

NASA Astrophysics Data System (ADS)

Takahashi, A.; Hashimoto, M.

2015-12-01

Dense GNSS networks have been widely used to observe crustal deformation. Simpson et al. (2012) and Savage and Simpson (2013) have conducted cluster analyses of the GNSS velocity field in the San Francisco Bay Area and Mojave Desert, respectively. They have successfully found velocity discontinuities. They also showed an advantage of cluster analysis for classifying the GNSS velocity field. Since strike-slip events are dominant in the western United States, the geometry there is simple. However, the Japanese Islands are tectonically complicated due to subduction of oceanic plates. There are many types of crustal deformation such as slow slip events and large postseismic deformation. We propose a modified clustering method of the GNSS velocity field in Japan to separate time variant and static crustal deformation. Our modification is performing cluster analysis every several months or years, then qualifying cluster member similarity. If a GNSS station moved differently from its neighboring GNSS stations, the station will not belong to the cluster which includes its surrounding stations. With this method, time variant phenomena were distinguished. We applied our method to GNSS data of Japan from 1996 to 2015. According to the analyses, the following conclusions were derived. The first is that the cluster boundaries are consistent with known active faults. For example, the Arima-Takatsuki-Hanaore fault system and the Shimane-Tottori segment proposed by Nishimura (2015) are recognized, though without using prior information. The second is improved detectability of time variable phenomena, such as a slow slip event in the northern part of the Hokkaido region detected by Ohzono et al. (2015). The last one is the classification of postseismic deformation caused by large earthquakes. The result suggested velocity discontinuities in postseismic deformation of the Tohoku-oki earthquake. This result implies that postseismic deformation does not decay continuously in proportion to distance from the epicenter.
[Predicting Incidence of Hepatitis E in China using Fuzzy Time Series Based on Fuzzy C-Means Clustering Analysis].

PubMed

Luo, Yi; Zhang, Tao; Li, Xiao-song

2016-05-01

To explore the application of a fuzzy time series model based on fuzzy c-means clustering in forecasting monthly incidence of Hepatitis E in mainland China. A predictive model (fuzzy time series method based on fuzzy c-means clustering) was developed using Hepatitis E incidence data in mainland China between January 2004 and July 2014. The incidence data from August 2014 to November 2014 were used to test the fitness of the predictive model. The forecasting results were compared with those from traditional fuzzy time series models. The fuzzy time series model based on fuzzy c-means clustering had 0.0011 mean squared error (MSE) of fitting and 6.9775 × 10⁻⁴ MSE of forecasting, compared with 0.0017 and 0.0014 from the traditional forecasting model. The results indicate that the fuzzy time series model based on fuzzy c-means clustering has a better performance in forecasting incidence of Hepatitis E.
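For readers unfamiliar with the two ingredients named in this abstract, the following is a minimal sketch of standard fuzzy c-means clustering together with the mean squared error used to score forecasts. It is not the authors' model: the fuzzy time series forecasting step built on top of the clusters is omitted, the function names and parameters (`c`, `m`, `iters`, `seed`) are illustrative, and the incidence values are invented placeholders rather than the study's data.

```python
import numpy as np


def fuzzy_c_means(x, c=2, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: returns (cluster centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(len(x), -1)      # shape (n, d)
    u = rng.random((len(x), c))
    u /= u.sum(axis=1, keepdims=True)                       # memberships sum to 1
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]        # weighted means
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)            # membership update
    return centers, u


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean((a - p) ** 2))


# Invented monthly incidence values (per 100,000), for illustration only.
incidence = [0.12, 0.15, 0.14, 0.30, 0.33, 0.31, 0.13, 0.32]
centers, memberships = fuzzy_c_means(incidence, c=2)
print(np.sort(centers.ravel()), memberships.round(2))
```

In a fuzzy time series model, the cluster centers would typically define the fuzzy intervals of the universe of discourse, which is where data-driven clustering can improve on equal-width partitions.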
Time-resolved x-ray imaging of a laser-induced nanoplasma and its neutral residuals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fluckiger, L.; Rupp, D.; Adolph, M.

The evolution of individual, large gas-phase xenon clusters, turned into a nanoplasma by a high power infrared laser pulse, is tracked from femtoseconds up to nanoseconds after laser excitation via coherent diffractive imaging, using ultra-short soft x-ray free electron laser pulses. A decline of scattering signal at high detection angles with increasing time delay indicates a softening of the cluster surface. Here we demonstrate, for the first time, a representative speckle pattern of a new stage of cluster expansion for xenon clusters after a nanosecond irradiation. The analysis of the measured average speckle size and the envelope of the intensity distribution reveals a mean cluster size and length scale of internal density fluctuations. Furthermore, the measured diffraction patterns were reproduced by scattering simulations which assumed that the cluster expands with pronounced internal density fluctuations hundreds of picoseconds after excitation.
Time-resolved x-ray imaging of a laser-induced nanoplasma and its neutral residuals

DOE PAGES

Fluckiger, L.; Rupp, D.; Adolph, M.; ...

2016-04-13

The evolution of individual, large gas-phase xenon clusters, turned into a nanoplasma by a high power infrared laser pulse, is tracked from femtoseconds up to nanoseconds after laser excitation via coherent diffractive imaging, using ultra-short soft x-ray free electron laser pulses. A decline of scattering signal at high detection angles with increasing time delay indicates a softening of the cluster surface. Here we demonstrate, for the first time, a representative speckle pattern of a new stage of cluster expansion for xenon clusters after a nanosecond irradiation. The analysis of the measured average speckle size and the envelope of the intensity distribution reveals a mean cluster size and length scale of internal density fluctuations. Furthermore, the measured diffraction patterns were reproduced by scattering simulations which assumed that the cluster expands with pronounced internal density fluctuations hundreds of picoseconds after excitation.
Functional Connectivity Parcellation of the Human Thalamus by Independent Component Analysis.

PubMed

Zhang, Sheng; Li, Chiang-Shan R

2017-11-01

As a key structure to relay and integrate information, the thalamus supports multiple cognitive and affective functions through the connectivity between its subnuclei and cortical and subcortical regions. Although extant studies have largely described thalamic regional functions in anatomical terms, evidence accumulates to suggest a more complex picture of subareal activities and connectivities of the thalamus. In this study, we aimed to parcellate the thalamus and examine whole-brain connectivity of its functional clusters. With resting state functional magnetic resonance imaging data from 96 adults, we used independent component analysis (ICA) to parcellate the thalamus into 10 components. On the basis of the independence assumption, ICA helps to identify how subclusters overlap spatially. Whole brain functional connectivity of each subdivision was computed for independent component's time course (ICtc), which is a unique time series to represent an IC. For comparison, we computed seed-region-based functional connectivity using the averaged time course across all voxels within a thalamic subdivision. The results showed that, at p < 10⁻⁶, corrected, 49% of voxels on average overlapped among subdivisions. Compared with seed-region analysis, ICtc analysis revealed patterns of connectivity that were more distinguished between thalamic clusters. ICtc analysis demonstrated thalamic connectivity to the primary motor cortex, which has eluded the analysis as well as previous studies based on averaged time series, and clarified thalamic connectivity to the hippocampus, caudate nucleus, and precuneus. The new findings elucidate functional organization of the thalamus and suggest that ICA clustering in combination with ICtc rather than seed-region analysis better distinguishes whole-brain connectivities among functional clusters of a brain region.
Temporal Methods to Detect Content-Based Anomalies in Social Media

DOE Office of Scientific and Technical Information (OSTI.GOV)

Skryzalin, Jacek; Field, Jr., Richard; Fisher, Andrew N.

Here, we develop a method for time-dependent topic tracking and meme trending in social media. Our objective is to identify time periods whose content differs significantly from normal, and we utilize two techniques to do so. The first is an information-theoretic analysis of the distributions of terms emitted during different periods of time. In the second, we cluster documents from each time period and analyze the tightness of each clustering. We also discuss a method of combining the scores created by each technique, and we provide ample empirical analysis of our methodology on various Twitter datasets.
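The first technique described here, comparing term distributions across time periods, can be sketched with a standard divergence measure. The abstract does not name the measure the authors used, so the Jensen-Shannon divergence below is an assumption chosen purely for illustration, as are the function names and the toy token lists.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary term in a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocabulary)
    return [counts[t] / total for t in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy token streams for a baseline period and a period under inspection.
baseline = ["coffee", "traffic", "coffee", "weather", "traffic", "coffee"]
current = ["storm", "storm", "weather", "outage", "storm", "traffic"]
vocab = sorted(set(baseline) | set(current))
score = js_divergence(term_distribution(baseline, vocab),
                      term_distribution(current, vocab))
print(score)
```

A period whose score against the baseline is unusually high would be a candidate anomaly; how to set that threshold, and how to combine it with the clustering-tightness score, is left to the method in the paper.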
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

PubMed

Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

2017-08-31

Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks

PubMed Central

Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin

2017-01-01

Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome.

PubMed

Lalonde, Michel; Wells, R Glenn; Birnie, David; Ruddy, Terrence D; Wassenaar, Richard

2014-07-01

Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster analysis results were similar to SPECT RNA phase analysis (ROC AUC = 0.78, p = 0.73 vs cluster AUC; sensitivity/specificity = 59%/89%) and PET scar size analysis (ROC AUC = 0.73, p = 1.0 vs cluster AUC; sensitivity/specificity = 76%/67%). A SPECT RNA cluster analysis algorithm was developed for the prediction of CRT outcome. Cluster analysis produced results equivalent to those obtained from Fourier and scar analysis.
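As a rough illustration of the K-means step described in this abstract, the sketch below groups a few synthetic time-activity curves by Euclidean distance. The curves, the choice of two clusters, the starting centroids, and the function name `kmeans_curves` are all assumptions made for illustration. The study's own input metrics and its "normal average" technique are not reproduced here.

```python
import math

def kmeans_curves(curves, centroids, iterations=20):
    """Plain Lloyd's k-means on equal-length curves; returns (labels, centroids)."""
    labels = [0] * len(curves)
    for _ in range(iterations):
        # Assignment step: each curve joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(c, centroids[k]))
            for c in curves
        ]
        # Update step: each centroid becomes the pointwise mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [c for c, lab in zip(curves, labels) if lab == k]
            if members:
                new_centroids.append([sum(v) / len(members) for v in zip(*members)])
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical TACs: two reach their minimum early, two reach it late.
tacs = [
    [1.0, 0.6, 0.3, 0.6, 1.0, 1.0],
    [1.0, 0.7, 0.4, 0.7, 1.0, 1.0],
    [1.0, 1.0, 0.9, 0.5, 0.3, 0.7],
    [1.0, 1.0, 0.8, 0.4, 0.3, 0.6],
]
labels, centers = kmeans_curves(tacs, centroids=[tacs[0], tacs[2]])

# Relative cluster size, one simple quantity that could serve as a dyssynchrony measure.
late_fraction = labels.count(1) / len(labels)
print(labels, late_fraction)
```

In this toy setting the share of curves falling in the late cluster plays the role of a cluster-size score; how the study turned cluster sizes and scores into a CRT response predictor is not specified beyond the abstract.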
Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lalonde, Michel, E-mail: mlalonde15@rogers.com; Wassenaar, Richard; Wells, R. Glenn

2014-07-15

Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster analysis results were similar to SPECT RNA phase analysis (ROC AUC = 0.78, p = 0.73 vs cluster AUC; sensitivity/specificity = 59%/89%) and PET scar size analysis (ROC AUC = 0.73, p = 1.0 vs cluster AUC; sensitivity/specificity = 76%/67%). Conclusions: A SPECT RNA cluster analysis algorithm was developed for the prediction of CRT outcome. Cluster analysis produced results equivalent to those obtained from Fourier and scar analysis.

Real-time observation of formation and relaxation dynamics of NH4 in (CH3OH)m(NH3)n clusters.

PubMed

Yamada, Yuji; Nishino, Yoko; Fujihara, Akimasa; Ishikawa, Haruki; Fuke, Kiyokazu

2009-03-26

The formation and relaxation dynamics of NH4(CH3OH)m(NH3)n clusters produced by photolysis of ammonia-methanol mixed clusters has been observed by a time-resolved pump-probe method with femtosecond pulse lasers. From the detailed analysis of the time evolutions of the protonated cluster ions, NH4(+)(CH3OH)m(NH3)n, the kinetic model has been constructed, which consists of sequential three-step reaction: ultrafast hydrogen-atom transfer producing the radical pair (NH4-NH2)*, the relaxation process of radical-pair clusters, and dissociation of the solvated NH4 clusters. The initial hydrogen transfer hardly occurs between ammonia and methanol, implying the unfavorable formation of radical pair, (CH3OH2-NH2)*. The remarkable dependence of the time constants in each step on the number and composition of solvents has been explained by the following factors: hydrogen delocalization within the clusters, the internal conversion of the excited-state radical pair, and the stabilization of NH4 by solvation. The dependence of the time profiles on the probe wavelength is attributed to the different ionization efficiency of the NH4(CH3OH)m(NH3)n clusters.
Representation of Tinnitus in the US Newspaper Media and in Facebook Pages: Cross-Sectional Analysis of Secondary Data

PubMed Central

Ratinaud, Pierre; Andersson, Gerhard

2018-01-01

Background When people with health conditions begin to manage their health issues, one important issue that emerges is the question as to what exactly do they do with the information that they have obtained through various sources (eg, news media, social media, health professionals, friends, and family). The information they gather helps form their opinions and, to some degree, influences their attitudes toward managing their condition. Objective This study aimed to understand how tinnitus is represented in the US newspaper media and in Facebook pages (ie, social media) using text pattern analysis. Methods This was a cross-sectional study based upon secondary analyses of publicly available data. The 2 datasets (ie, text corpuses) analyzed in this study were generated from US newspaper media during 1980-2017 (downloaded from the database US Major Dailies by ProQuest) and Facebook pages during 2010-2016. The text corpuses were analyzed using the Iramuteq software using cluster analysis and chi-square tests. Results The newspaper dataset had 432 articles. The cluster analysis resulted in 5 clusters, which were named as follows: (1) brain stimulation (26.2%), (2) symptoms (13.5%), (3) coping (19.8%), (4) social support (24.2%), and (5) treatment innovation (16.4%). A time series analysis of clusters indicated a change in the pattern of information presented in newspaper media during 1980-2017 (eg, more emphasis on cluster 5, focusing on treatment inventions). The Facebook dataset had 1569 texts. The cluster analysis resulted in 7 clusters, which were named as: (1) diagnosis (21.9%), (2) cause (4.1%), (3) research and development (13.6%), (4) social support (18.8%), (5) challenges (11.1%), (6) symptoms (21.4%), and (7) coping (9.2%). A time series analysis of clusters indicated no change in information presented in Facebook pages on tinnitus during 2011-2016. 
Conclusions The study highlights the specific aspects about tinnitus that the US newspaper media and Facebook pages focus on, as well as how these aspects change over time. These findings can help health care providers better understand the presuppositions that tinnitus patients may have. More importantly, the findings can help public health experts and health communication experts in tailoring health information about tinnitus to promote self-management, as well as assisting in appropriate choices of treatment for those living with tinnitus. PMID:29739734
Statistical detection of geographic clusters of resistant Escherichia coli in a regional network with WHONET and SaTScan

PubMed Central

Park, Rachel; O'Brien, Thomas F.; Huang, Susan S.; Baker, Meghan A.; Yokoe, Deborah S.; Kulldorff, Martin; Barrett, Craig; Swift, Jamie; Stelling, John

2016-01-01

Objectives While antimicrobial resistance threatens the prevention, treatment, and control of infectious diseases, systematic analysis of routine microbiology laboratory test results worldwide can alert to new threats and promote timely response. This study explores statistical algorithms for recognizing geographic clustering of multi-resistant microbes within a healthcare network and monitoring the dissemination of new strains over time. Methods Escherichia coli antimicrobial susceptibility data from a three-year period stored in WHONET were analyzed across ten facilities in a healthcare network utilizing SaTScan's spatial multinomial model with two models for defining geographic proximity. We explored geographic clustering of multi-resistance phenotypes within the network and changes in clustering over time. Results The geographic clusters identified by the latitude/longitude and the non-parametric facility groupings geographic models were similar, while the latter offers greater flexibility and generalizability. Iterative application of the clustering algorithms suggested the possible recognition of the initial appearance of invasive E. coli ST131 in the clinical database of a single hospital and subsequent dissemination to others. Conclusion Systematic analysis of routine antimicrobial resistance susceptibility test results supports the recognition of geographic clustering of microbial phenotypic subpopulations with WHONET and SaTScan, and iterative application of these algorithms can detect the initial appearance in and dissemination across a region, prompting early investigation, response, and containment measures. PMID:27530311
Modifiable lifestyle behavior patterns, sedentary time and physical activity contexts: a cluster analysis among middle school boys and girls in the SALTA study.

PubMed

Marques, Elisa A; Pizarro, Andreia N; Figueiredo, Pedro; Mota, Jorge; Santos, Maria P

2013-06-01

To analyze how modifiable health-related variables are clustered and associated with children's participation in play, active travel and structured exercise and sport among boys and girls. Data were collected from 9 middle-schools in Porto (Portugal) area. A total of 636 children in the 6th grade (340 girls and 296 boys) with a mean age of 11.64 years old participated in the study. Cluster analyses were used to identify patterns of lifestyle and healthy/unhealthy behaviors. Multinomial logistic regression analysis was used to estimate associations between cluster allocation, sedentary time and participation in three different physical activity (PA) contexts: play, active travel, and structured exercise/sport. Four distinct clusters were identified based on four lifestyle risk factors. The most disadvantaged cluster was characterized by high body mass index, low high-density lipoprotein cholesterol and cardiorespiratory fitness and a moderate level of moderate to vigorous PA. Everyday outdoor play (OR=1.85, 95%CI 0.318-0.915) and structured exercise/sport (OR=1.85, 95%CI 0.291-0.990) were associated with healthier lifestyle patterns. There were no significant associations between health patterns and sedentary time or travel mode. Outdoor play and sport/exercise participation seem more important than active travel from school in influencing children's healthy cluster profiles. Copyright © 2013 Elsevier Inc. All rights reserved.
Obstructive Sleep Apnea: A Cluster Analysis at Time of Diagnosis

PubMed Central

Grillet, Yves; Richard, Philippe; Stach, Bruno; Vivodtzev, Isabelle; Timsit, Jean-Francois; Lévy, Patrick; Tamisier, Renaud; Pépin, Jean-Louis

2016-01-01

Background The classification of obstructive sleep apnea is on the basis of sleep study criteria that may not adequately capture disease heterogeneity. Improved phenotyping may improve prognosis prediction and help select therapeutic strategies. Objectives: This study used cluster analysis to investigate the clinical clusters of obstructive sleep apnea. Methods An ascending hierarchical cluster analysis was performed on baseline symptoms, physical examination, risk factor exposure and co-morbidities from 18,263 participants in the OSFP (French national registry of sleep apnea). The probability for criteria to be associated with a given cluster was assessed using odds ratios, determined by univariate logistic regression. Results: Six clusters were identified, in which patients varied considerably in age, sex, symptoms, obesity, co-morbidities and environmental risk factors. The main significant differences between clusters were minimally symptomatic versus sleepy obstructive sleep apnea patients, lean versus obese, and among obese patients different combinations of co-morbidities and environmental risk factors. Conclusions Our cluster analysis identified six distinct clusters of obstructive sleep apnea. Our findings underscore the high degree of heterogeneity that exists within obstructive sleep apnea patients regarding clinical presentation, risk factors and consequences. This may help in both research and clinical practice for validating new prevention programs, in diagnosis and in decisions regarding therapeutic strategies. PMID:27314230
A stereoscopic system for viewing the temporal evolution of brain activity clusters in response to linguistic stimuli

NASA Astrophysics Data System (ADS)

Forbes, Angus; Villegas, Javier; Almryde, Kyle R.; Plante, Elena

2014-03-01

In this paper, we present a novel application, 3D+Time Brain View, for the stereoscopic visualization of functional Magnetic Resonance Imaging (fMRI) data gathered from participants exposed to unfamiliar spoken languages. An analysis technique based on Independent Component Analysis (ICA) is used to identify statistically significant clusters of brain activity and their changes over time during different testing sessions. That is, our system illustrates the temporal evolution of participants' brain activity as they are introduced to a foreign language through displaying these clusters as they change over time. The raw fMRI data is presented as a stereoscopic pair in an immersive environment utilizing passive stereo rendering. The clusters are presented using a ray casting technique for volume rendering. Our system incorporates the temporal information and the results of the ICA into the stereoscopic 3D rendering, making it easier for domain experts to explore and analyze the data.
Unsupervised analysis of small animal dynamic Cerenkov luminescence imaging

NASA Astrophysics Data System (ADS)

Spinelli, Antonello E.; Boschi, Federico

2011-12-01

Clustering analysis (CA) and principal component analysis (PCA) were applied to dynamic Cerenkov luminescence images (dCLI). In order to investigate the performance of the proposed approaches, two distinct dynamic data sets obtained by injecting mice with 32P-ATP and 18F-FDG were acquired using the IVIS 200 optical imager. The k-means clustering algorithm has been applied to dCLI and was implemented using Interactive Data Language 8.1. We show that cluster analysis allows us to obtain good agreement between the clustered and the corresponding emission regions like the bladder, the liver, and the tumor. We also show a good correspondence between the time activity curves of the different regions obtained by using CA and manual region of interest analysis on dCLIT and PCA images. We conclude that CA provides an automatic unsupervised method for the analysis of preclinical dynamic Cerenkov luminescence image data.
Passion and intrinsic motivation in digital gaming.

PubMed

Wang, Chee Keng John; Khoo, Angeline; Liu, Woon Chia; Divaharan, Shanti

2008-02-01

Digital gaming is fast becoming a favorite activity all over the world. Yet very few studies have examined the underlying motivational processes involved in digital gaming. One motivational force that receives little attention in psychology is passion, which could help us understand the motivation of gamers. The purpose of the present study was to identify subgroups of young people with distinctive passion profiles on self-determined regulations, flow dispositions, affect, and engagement time in gaming. One hundred fifty-five students from two secondary schools in Singapore participated in the survey. There were 134 males and 8 females (13 unspecified). The participants completed a questionnaire to measure harmonious passion (HP), obsessive passion (OP), perceived locus of causality, disposition flow, positive and negative affects, and engagement time in gaming. Cluster analysis found three clusters with distinct passion profiles. The first cluster had an average HP/OP profile, the second cluster had a low HP/OP profile, and the third cluster had a high HP/OP profile. The three clusters displayed different levels of cognitive, affective, and behavioral outcomes. Cluster analysis, as this study shows, is useful in identifying groups of gamers with different passion profiles. It has helped us gain a deeper understanding of motivation in digital gaming.
Screen media usage, sleep time and academic performance in adolescents: clustering a self-organizing maps analysis.

PubMed

Peiró-Velert, Carmen; Valencia-Peris, Alexandra; González, Luis M; García-Massó, Xavier; Serra-Añó, Pilar; Devís-Devís, José

2014-01-01

Screen media usage, sleep time and socio-demographic features are related to adolescents' academic performance, but interrelations are little explored. This paper describes these interrelations and behavioral profiles clustered in low and high academic performance. A nationally representative sample of 3,095 Spanish adolescents, aged 12 to 18, was surveyed on 15 variables linked to the purpose of the study. A Self-Organizing Maps analysis established non-linear interrelationships among these variables and identified behavior patterns in subsequent cluster analyses. Topological interrelationships established from the 15 emerging maps indicated that boys used more passive videogames and computers for playing than girls, who tended to use mobile phones to communicate with others. Adolescents with the highest academic performance were the youngest. They slept more and spent less time using sedentary screen media when compared to those with the lowest performance, and they also showed topological relationships with higher socioeconomic status adolescents. Cluster 1 grouped boys who spent more than 5.5 hours daily using sedentary screen media. Their academic performance was low and they slept an average of 8 hours daily. Cluster 2 gathered girls with an excellent academic performance, who slept nearly 9 hours per day, and devoted less time daily to sedentary screen media. Academic performance was directly related to sleep time and socioeconomic status, but inversely related to overall sedentary screen media usage. Profiles from the two clusters were strongly differentiated by gender, age, sedentary screen media usage, sleep time and academic achievement. Girls with the highest academic results had a medium socioeconomic status in Cluster 2. 
Findings may contribute to establishing recommendations about the timing and duration of screen media usage in adolescents and the sleep time needed to successfully meet the demands of school academics, and to improving interventions that target behavioral change.
Screen Media Usage, Sleep Time and Academic Performance in Adolescents: Clustering a Self-Organizing Maps Analysis

PubMed Central

Peiró-Velert, Carmen; Valencia-Peris, Alexandra; González, Luis M.; García-Massó, Xavier; Serra-Añó, Pilar; Devís-Devís, José

2014-01-01

Screen media usage, sleep time and socio-demographic features are related to adolescents' academic performance, but interrelations are little explored. This paper describes these interrelations and behavioral profiles clustered in low and high academic performance. A nationally representative sample of 3,095 Spanish adolescents, aged 12 to 18, was surveyed on 15 variables linked to the purpose of the study. A Self-Organizing Maps analysis established non-linear interrelationships among these variables and identified behavior patterns in subsequent cluster analyses. Topological interrelationships established from the 15 emerging maps indicated that boys used more passive videogames and computers for playing than girls, who tended to use mobile phones to communicate with others. Adolescents with the highest academic performance were the youngest. They slept more and spent less time using sedentary screen media when compared to those with the lowest performance, and they also showed topological relationships with higher socioeconomic status adolescents. Cluster 1 grouped boys who spent more than 5.5 hours daily using sedentary screen media. Their academic performance was low and they slept an average of 8 hours daily. Cluster 2 gathered girls with an excellent academic performance, who slept nearly 9 hours per day, and devoted less time daily to sedentary screen media. Academic performance was directly related to sleep time and socioeconomic status, but inversely related to overall sedentary screen media usage. Profiles from the two clusters were strongly differentiated by gender, age, sedentary screen media usage, sleep time and academic achievement. Girls with the highest academic results had a medium socioeconomic status in Cluster 2. 
Findings may contribute to establishing recommendations about the timing and duration of screen media usage in adolescents and the sleep time needed to successfully meet the demands of school academics, and to improving interventions that target behavioral change. PMID:24941009
Retrospective space-time cluster analysis of whooping cough, re-emergence in Barcelona, Spain, 2000-2011.

PubMed

Solano, Rubén; Gómez-Barroso, Diana; Simón, Fernando; Lafuente, Sarah; Simón, Pere; Rius, Cristina; Gorrindo, Pilar; Toledo, Diana; Caylà, Joan A

2014-05-01

A retrospective, space-time study of whooping cough cases reported to the Public Health Agency of Barcelona, Spain between the years 2000 and 2011 is presented. It is based on 633 individual whooping cough cases and the 2006 population census from the Spanish National Statistics Institute, stratified by age and sex at the census tract level. Cluster identification was attempted using the space-time scan statistic, assuming a Poisson distribution and restricting temporal extent to 7 days and spatial distance to 500 m. Statistical calculations were performed with Stata 11 and SaTScan and mapping was performed with ArcGIS 10.0. Only clusters showing statistical significance (P < 0.05) were mapped. The most likely cluster identified included five census tracts located in three neighbourhoods in central Barcelona during the week from 17 to 23 August 2011. This cluster included five cases compared with the expected level of 0.0021 (relative risk = 2436, P < 0.001). In addition, 11 secondary significant space-time clusters were detected, occurring at different times and locations. Spatial statistics appear useful for complementing epidemiological surveillance systems by visualizing excess numbers of cases in space and time, thereby increasing the possibility of identifying outbreaks not reported by the surveillance system.
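To make the Poisson scan statistic mentioned in this abstract more concrete, the sketch below evaluates the log-likelihood ratio commonly used for a single candidate cluster under the Poisson model, using the counts quoted above (5 observed cases, 0.0021 expected, 633 cases in total). The function name `poisson_llr` is hypothetical, and the sketch assumes expected counts are scaled to sum to the total number of cases. It scores one candidate window only; the search over all space-time windows and the Monte Carlo significance testing that scan software performs are not shown.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate cluster under the Poisson model.

    Returns 0.0 when the window has no excess of cases, because only
    high-rate clusters are of interest in this sketch.
    """
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        # Contribution of the cases falling outside the candidate window.
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

# Counts quoted in the abstract for the most likely cluster.
llr = poisson_llr(cases_in=5, expected_in=0.0021, total_cases=633)
print(round(llr, 2))
```

A larger value indicates a window whose case count is harder to explain by the baseline rate; the most likely cluster is the window that maximizes this quantity.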
Clustering of health-related behaviors among early and mid-adolescents in Tuscany: results from a representative cross-sectional study.

PubMed

Lazzeri, Giacomo; Panatto, Donatella; Domnich, Alexander; Arata, Lucia; Pammolli, Andrea; Simi, Rita; Giacchi, Mariano Vincenzo; Amicizia, Daniela; Gasparini, Roberto

2018-03-01

A huge amount of literature suggests that adolescents' health-related behaviors tend to occur in clusters, and the understanding of such behavioral clustering may have direct implications for the effective tailoring of health-promotion interventions. Despite the usefulness of analyzing clustering, Italian data on this topic are scant. This study aimed to evaluate the clustering patterns of health-related behaviors. The present study is based on data from the Health Behaviors in School-aged Children (HBSC) study conducted in Tuscany in 2010, which involved 3291 11-, 13- and 15-year olds. To aggregate students' data on 22 health-related behaviors, factor analysis and subsequent cluster analysis were performed. Factor analysis revealed eight factors, which were dubbed in accordance with their main traits: 'Alcohol drinking', 'Smoking', 'Physical activity', 'Screen time', 'Signs & symptoms', 'Healthy eating', 'Violence' and 'Sweet tooth'. These factors explained 67% of variance and underwent cluster analysis. A six-cluster k-means solution was established with a 93.8% level of classification validity. The between-cluster differences in both mean age and gender distribution were highly statistically significant. Health-compromising behaviors are common among Tuscan teens and occur in distinct clusters. These results may be used by schools, health-promotion authorities and other stakeholders to design and implement tailored preventive interventions in Tuscany.
Visual cluster analysis and pattern recognition methods

DOEpatents

Osbourn, Gordon Cecil; Martinez, Rubel Francisco

2001-01-01

A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.
Bias and inference from misspecified mixed-effect models in stepped wedge trial analysis.

PubMed

Thompson, Jennifer A; Fielding, Katherine L; Davey, Calum; Aiken, Alexander M; Hargreaves, James R; Hayes, Richard J

2017-10-15

Many stepped wedge trials (SWTs) are analysed by using a mixed-effect model with a random intercept and fixed effects for the intervention and time periods (referred to here as the standard model). However, it is not known whether this model is robust to misspecification. We simulated SWTs with three groups of clusters and two time periods; one group received the intervention during the first period and two groups in the second period. We simulated period and intervention effects that were either common-to-all or varied-between clusters. Data were analysed with the standard model or with additional random effects for period effect or intervention effect. In a second simulation study, we explored the weight given to within-cluster comparisons by simulating a larger intervention effect in the group of the trial that experienced both the control and intervention conditions and applying the three analysis models described previously. Across 500 simulations, we computed bias and confidence interval coverage of the estimated intervention effect. We found up to 50% bias in intervention effect estimates when period or intervention effects varied between clusters and were treated as fixed effects in the analysis. All misspecified models showed undercoverage of 95% confidence intervals, particularly the standard model. A large weight was given to within-cluster comparisons in the standard model. In the SWTs simulated here, mixed-effect models were highly sensitive to departures from the model assumptions, which can be explained by the high dependence on within-cluster comparisons. Trialists should consider including a random effect for time period in their SWT analysis model. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Bias and inference from misspecified mixed‐effect models in stepped wedge trial analysis

PubMed Central

Fielding, Katherine L.; Davey, Calum; Aiken, Alexander M.; Hargreaves, James R.; Hayes, Richard J.

2017-01-01

Many stepped wedge trials (SWTs) are analysed by using a mixed‐effect model with a random intercept and fixed effects for the intervention and time periods (referred to here as the standard model). However, it is not known whether this model is robust to misspecification. We simulated SWTs with three groups of clusters and two time periods; one group received the intervention during the first period and two groups in the second period. We simulated period and intervention effects that were either common‐to‐all or varied‐between clusters. Data were analysed with the standard model or with additional random effects for period effect or intervention effect. In a second simulation study, we explored the weight given to within‐cluster comparisons by simulating a larger intervention effect in the group of the trial that experienced both the control and intervention conditions and applying the three analysis models described previously. Across 500 simulations, we computed bias and confidence interval coverage of the estimated intervention effect. We found up to 50% bias in intervention effect estimates when period or intervention effects varied between clusters and were treated as fixed effects in the analysis. All misspecified models showed undercoverage of 95% confidence intervals, particularly the standard model. A large weight was given to within‐cluster comparisons in the standard model. In the SWTs simulated here, mixed‐effect models were highly sensitive to departures from the model assumptions, which can be explained by the high dependence on within‐cluster comparisons. Trialists should consider including a random effect for time period in their SWT analysis model. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28556355
TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis

PubMed Central

Ji, Zhicheng; Ji, Hongkai

2016-01-01

When analyzing single-cell RNA-seq data, constructing a pseudo-temporal path to order cells based on the gradual transition of their transcriptomes is a useful way to study gene expression dynamics in a heterogeneous cell population. Currently, a limited number of computational tools are available for this task, and quantitative methods for comparing different tools are lacking. Tools for Single Cell Analysis (TSCAN) is a software tool developed to better support in silico pseudo-Time reconstruction in Single-Cell RNA-seq ANalysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods. TSCAN is available at https://github.com/zji90/TSCAN and as a Bioconductor package. PMID:27179027
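The cluster-based MST idea in this abstract can be sketched in a few lines. The example below is not TSCAN's code: it assumes the cells have already been reduced to two dimensions and grouped, starts from hypothetical cluster centers, builds a minimum spanning tree with Prim's algorithm, and projects one hypothetical cell onto a tree edge. The function names and all coordinates are illustrative assumptions.

```python
import math

def minimum_spanning_tree(centers):
    """Prim's algorithm on the complete graph of cluster centers (Euclidean weights)."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(centers):
        # Pick the cheapest edge that links the current tree to a new center.
        i, j = min(
            ((a, b) for a in in_tree for b in range(len(centers)) if b not in in_tree),
            key=lambda e: math.dist(centers[e[0]], centers[e[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

def project_onto_edge(cell, start, end):
    """Fractional position of a cell along the segment start -> end, clipped to [0, 1]."""
    seg = [e - s for s, e in zip(start, end)]
    rel = [c - s for s, c in zip(start, cell)]
    t = sum(a * b for a, b in zip(rel, seg)) / sum(a * a for a in seg)
    return max(0.0, min(1.0, t))

# Hypothetical cluster centers in a 2-D reduced expression space.
centers = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.0), (3.0, 0.5)]
tree = minimum_spanning_tree(centers)

# A hypothetical cell lying between the first two centers.
position = project_onto_edge((0.5, 0.4), centers[0], centers[1])
print(tree, position)
```

Working with a handful of cluster centers rather than every cell keeps the tree small, which is the complexity reduction the abstract refers to; ordering cells then amounts to sorting them by edge and by position along that edge.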
TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis.

PubMed

Ji, Zhicheng; Ji, Hongkai

2016-07-27

When analyzing single-cell RNA-seq data, constructing a pseudo-temporal path to order cells based on the gradual transition of their transcriptomes is a useful way to study gene expression dynamics in a heterogeneous cell population. Currently, a limited number of computational tools are available for this task, and quantitative methods for comparing different tools are lacking. Tools for Single Cell Analysis (TSCAN) is a software tool developed to better support in silico pseudo-Time reconstruction in Single-Cell RNA-seq ANalysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods. TSCAN is available at https://github.com/zji90/TSCAN and as a Bioconductor package. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Space-time analysis of Down syndrome: results consistent with transient pre-disposing contagious agent.

PubMed

McNally, Richard J Q; Rankin, Judith; Shirley, Mark D F; Rushton, Stephen P; Pless-Mulloli, Tanja

2008-10-01

Whilst maternal age is an established risk factor for Patau syndrome (trisomy 13), Edwards syndrome (trisomy 18) and Down syndrome (trisomy 21), the aetiology and contribution of genetic and environmental factors remains unclear. We analysed for space-time clustering using high quality fully population-based data from a geographically defined region. The study included all cases of Patau, Edwards and Down syndrome, delivered during 1985-2003 and resident in the former Northern Region of England, including terminations of pregnancy for fetal anomaly. We applied the K-function test for space-time clustering with fixed thresholds of close in space and time using residential addresses at time of delivery. The Knox test was used to indicate the range over which the clustering effect occurred. Tests were repeated using nearest neighbour (NN) thresholds to adjust for variable population density. The study analysed 116 cases of Patau syndrome, 240 cases of Edwards syndrome and 1084 cases of Down syndrome. There was evidence of space-time clustering for Down syndrome (fixed threshold of close in space: P = 0.01, NN threshold: P = 0.02), but little or no clustering for Patau (P = 0.57, P = 0.19) or Edwards (P = 0.37, P = 0.06) syndromes. Clustering of Down syndrome was associated with cases from more densely populated areas and evidence of clustering persisted when cases were restricted to maternal age <40 years. The highly novel space-time clustering for Down syndrome suggests an aetiological role for transient environmental factors, such as infections.
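The Knox test mentioned above counts pairs of cases that are close in both space and time and compares that count with what random re-pairing of locations and times would give. The following is a minimal sketch of the fixed-threshold version only; it does not implement the K-function test or the nearest-neighbour thresholds used in the study, and the coordinates, times and thresholds are invented for illustration.

```python
import math
import random

def knox_statistic(points, times, space_thresh, time_thresh):
    """Number of case pairs that are close in both space and time."""
    count = 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            close_space = math.dist(points[i], points[j]) <= space_thresh
            close_time = abs(times[i] - times[j]) <= time_thresh
            if close_space and close_time:
                count += 1
    return count

def knox_test(points, times, space_thresh, time_thresh, n_perm=999, seed=0):
    """Permutation p-value: shuffle times over locations and recount."""
    rng = random.Random(seed)
    observed = knox_statistic(points, times, space_thresh, time_thresh)
    shuffled = list(times)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(points, shuffled, space_thresh, time_thresh) >= observed:
            at_least += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical cases: two groups, each tight in space and in time.
points = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
times = [1, 2, 3, 100, 101, 102]
observed, p_value = knox_test(points, times, space_thresh=1.0, time_thresh=5)
print(observed, p_value)
```

Shuffling the times while holding the locations fixed preserves the separate spatial and temporal distributions, so only their interaction is tested.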
Rapidly differentiating grape seeds from different sources based on characteristic fingerprints using direct analysis in real time coupled with time-of-flight mass spectrometry combined with chemometrics.

PubMed

Song, Yuqiao; Liao, Jie; Dong, Junxing; Chen, Li

2015-09-01

The seeds of grapevine (Vitis vinifera) are a byproduct of wine production. To examine the potential value of grape seeds, grape seeds from seven sources were subjected to fingerprinting using direct analysis in real time coupled with time-of-flight mass spectrometry combined with chemometrics. Firstly, we listed all reported components (56 components) from grape seeds and calculated the precise m/z values of the deprotonated ions [M-H]⁻. Secondly, the experimental conditions were systematically optimized based on the peak areas of total ion chromatograms of the samples. Thirdly, the seven grape seed samples were examined using the optimized method. Information about 20 grape seed components was utilized to represent characteristic fingerprints. Finally, hierarchical clustering analysis and principal component analysis were performed to analyze the data. Grape seeds from seven different sources were classified into two clusters; hierarchical clustering analysis and principal component analysis yielded similar results. The results of this study lay the foundation for appropriate utilization and exploitation of grape seed samples. Due to the absence of complicated sample preparation methods and chromatographic separation, the method developed in this study represents one of the simplest and least time-consuming methods for grape seed fingerprinting. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Peeking Network States with Clustered Patterns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Jinoh; Sim, Alex

2015-10-20

Network traffic monitoring has long been a core element for effective network management and security. However, it is still a challenging task with a high degree of complexity for comprehensive analysis when considering multiple variables and ever-increasing traffic volumes to monitor. For example, one of the widely considered approaches is to scrutinize probabilistic distributions, but it poses a scalability concern and multivariate analysis is not generally supported due to the exponential increase of the complexity. In this work, we propose a novel method for network traffic monitoring based on clustering, one of the powerful deep-learning techniques. We show that the new approach enables us to recognize clustered results as patterns representing the network states, which can then be utilized to evaluate “similarity” of network states over time. In addition, we define a new quantitative measure for the similarity between two compared network states observed in different time windows, as a supportive means for intuitive analysis. Finally, we demonstrate the clustering-based network monitoring with public traffic traces, and show that the proposed approach using the clustering method has a great opportunity for feasible, cost-effective network monitoring.

A dynamical study of Galactic globular clusters under different relaxation conditions

NASA Astrophysics Data System (ADS)

Zocchi, A.; Bertin, G.; Varri, A. L.

2012-03-01

Aims: We perform a systematic combined photometric and kinematic analysis of a sample of globular clusters under different relaxation conditions, based on their core relaxation time (as listed in available catalogs), by means of two well-known families of spherical stellar dynamical models. Systems characterized by shorter relaxation time scales are expected to be better described by isotropic King models, while less relaxed systems might be interpreted by means of non-truncated, radially-biased anisotropic f(ν) models, originally designed to represent stellar systems produced by a violent relaxation formation process and applied here for the first time to the study of globular clusters. Methods: The comparison between dynamical models and observations is performed by fitting simultaneously surface brightness and velocity dispersion profiles. For each globular cluster, the best-fit model in each family is identified, along with a full error analysis on the relevant parameters. Detailed structural properties and mass-to-light ratios are also explicitly derived. Results: We find that King models usually offer a good representation of the observed photometric profiles, but often lead to less satisfactory fits to the kinematic profiles, independently of the relaxation condition of the systems. For some less relaxed clusters, f(ν) models provide a good description of both observed profiles. Some derived structural characteristics, such as the total mass or the half-mass radius, turn out to be significantly model-dependent. The analysis confirms that, to answer some important dynamical questions that bear on the formation and evolution of globular clusters, it would be highly desirable to acquire larger numbers of accurate kinematic data-points, well distributed over the cluster field. Appendices are available in electronic form at http://www.aanda.org
A New Approach to Identify High Burnout Medical Staffs by Kernel K-Means Cluster Analysis in a Regional Teaching Hospital in Taiwan

PubMed Central

Lee, Yii-Ching; Huang, Shian-Chang; Huang, Chih-Hsuan; Wu, Hsin-Hung

2016-01-01

This study uses kernel k-means cluster analysis to identify medical staffs with high burnout. The data collected in October to November 2014 are from the emotional exhaustion dimension of the Chinese version of Safety Attitudes Questionnaire in a regional teaching hospital in Taiwan. The number of effective questionnaires including the entire staffs such as physicians, nurses, technicians, pharmacists, medical administrators, and respiratory therapists is 680. The results show that 8 clusters are generated by kernel k-means method. Employees in clusters 1, 4, and 5 are relatively in good conditions, whereas employees in clusters 2, 3, 6, 7, and 8 need to be closely monitored from time to time because they have relatively higher degree of burnout. When employees with higher degree of burnout are identified, the hospital management can take actions to improve the resilience, reduce the potential medical errors, and, eventually, enhance the patient safety. This study also suggests that the hospital management needs to keep track of medical staffs’ fatigue conditions and provide timely assistance for burnout recovery through employee assistance programs, mindfulness-based stress reduction programs, positivity currency buildup, and forming appreciative inquiry groups. PMID:27895218
Visual cluster analysis and pattern recognition template and methods

DOEpatents

Osbourn, Gordon Cecil; Martinez, Rubel Francisco

1999-01-01

A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.
MIXOR: a computer program for mixed-effects ordinal regression analysis.

PubMed

Hedeker, D; Gibbons, R D

1996-03-01

MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
High-dimensional cluster analysis with the Masked EM Algorithm

PubMed Central

Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.

2014-01-01

Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster member-ship of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
Cluster analysis based on dimensional information with applications to feature selection and classification

NASA Technical Reports Server (NTRS)

Eigen, D. J.; Fromm, F. R.; Northouse, R. A.

1974-01-01

A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.
Characterization of spatial and temporal variability in hydrochemistry of Johor Straits, Malaysia.

PubMed

Abdullah, Pauzi; Abdullah, Sharifah Mastura Syed; Jaafar, Othman; Mahmud, Mastura; Khalik, Wan Mohd Afiq Wan Mohd

2015-12-15

Characterization of hydrochemistry changes in Johor Straits within 5 years of monitoring works was successfully carried out. Water quality data sets (27 stations and 19 parameters) collected in this area were interpreted using multivariate statistical analysis. Cluster analysis grouped all the stations into four clusters ((Dlink/Dmax) × 100<90) and two clusters ((Dlink/Dmax) × 100<80) for site and period similarities. Principal component analysis rendered six significant components (eigenvalue>1) that explained 82.6% of the total variance of the data set. Classification matrix of discriminant analysis assigned 88.9-92.6% and 83.3-100% correctness in spatial and temporal variability, respectively. Time series analysis then confirmed that only four parameters showed no significant change over time. Therefore, it is imperative that the environmental impact of reclamation and dredging works, municipal or industrial discharge, marine aquaculture and shipping activities in this area be effectively controlled and managed. Copyright © 2015 Elsevier Ltd. All rights reserved.
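The (Dlink/Dmax) × 100 criterion above rescales each linkage distance by the largest linkage distance in the dendrogram and cuts the tree at a chosen percentage. A rough sketch follows, assuming single linkage (the abstract does not state which linkage rule was used) and a toy distance matrix; the function names are hypothetical.

```python
def single_linkage_history(dist):
    """Agglomerate with single linkage; return (distance, member_a, member_b)
    for every merge, in the order the merges happen."""
    clusters = [[i] for i in range(len(dist))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((d, clusters[a][0], clusters[b][0]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

def cut_by_rescaled_distance(dist, percent):
    """Keep merges whose (Dlink / Dmax) * 100 is below `percent`.
    Assumes at least two items and a non-zero maximum linkage distance."""
    history = single_linkage_history(dist)
    d_max = history[-1][0]
    parent = list(range(len(dist)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d, i, j in history:
        if d / d_max * 100 < percent:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(dist)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical one-dimensional "stations"; distance is absolute difference.
positions = [0, 1, 2, 10, 11, 30]
dist = [[abs(a - b) for b in positions] for a in positions]
print(cut_by_rescaled_distance(dist, 90))
print(cut_by_rescaled_distance(dist, 30))
```

A lower percentage keeps only the tightest merges and so yields more, smaller clusters.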
Using Fuzzy Clustering for Real-time Space Flight Safety

NASA Technical Reports Server (NTRS)

Lee, Charles; Haskell, Richard E.; Hanna, Darrin; Alena, Richard L.

2004-01-01

To ensure space flight safety, it is necessary to monitor myriad sensor readings on the ground and in flight. Since a space shuttle has many sensors, monitoring data and drawing conclusions from information contained within the data in real time is challenging. The nature of the information can be critical to the success of the mission and safety of the crew and, therefore, must be processed with minimal data-processing time. Data analysis algorithms could be used to synthesize sensor readings and compare data associated with normal operation with the data obtained that contain fault patterns to draw conclusions. Detecting abnormal operation during early stages in the transition from safe to unsafe operation requires a large amount of historical data that can be categorized into different classes (non-risk, risk). Even though the 40-year shuttle flight program has accumulated volumes of historical data, these data don't comprehensively represent all possible fault patterns since fault patterns are usually unknown before the fault occurs. This paper presents a method that uses a similarity measure between fuzzy clusters to detect possible faults in real time. A clustering technique based on a fuzzy equivalence relation is used to characterize temporal data. Data collected during an initial time period are separated into clusters. These clusters are characterized by their centroids. Clusters formed during subsequent time periods are either merged with an existing cluster or added to the cluster list. The resulting list of cluster centroids, called a cluster group, characterizes the behavior of a particular set of temporal data. The degree to which new clusters formed in a subsequent time period are similar to the cluster group is characterized by a similarity measure, q. This method is applied to downlink data from Columbia flights. The results show that this technique can detect an unexpected fault that has not been present in the training data set.
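The merge-or-add bookkeeping of the cluster group can be illustrated as follows. This sketch is heavily simplified: it does not implement the fuzzy equivalence relation or the paper's similarity measure q. The `merge_radius` parameter and the "fraction of new centroids matched" score are hypothetical stand-ins chosen only to show the flow of the method.

```python
import math

def update_cluster_group(group, new_centroids, merge_radius):
    """Fold the centroids from a new time window into the cluster group.

    `group` is a list of (centroid, weight) pairs and is modified in place.
    A new centroid is merged into its nearest stored centroid if it lies
    within `merge_radius`; otherwise it is appended as a new cluster.
    Returns the fraction of new centroids that matched an existing cluster
    (a stand-in score, not the paper's q).
    """
    matched = 0
    for c in new_centroids:
        if group:
            idx = min(range(len(group)), key=lambda i: math.dist(group[i][0], c))
            centre, weight = group[idx]
            if math.dist(centre, c) <= merge_radius:
                # Running weighted mean of the merged centroids.
                merged = tuple((weight * a + b) / (weight + 1)
                               for a, b in zip(centre, c))
                group[idx] = (merged, weight + 1)
                matched += 1
                continue
        group.append((tuple(c), 1))
    return matched / len(new_centroids) if new_centroids else 1.0

group = []
first = update_cluster_group(group, [(0, 0), (10, 10)], merge_radius=1.0)
second = update_cluster_group(group, [(0.2, 0), (10, 10.2)], merge_radius=1.0)
unseen = update_cluster_group(group, [(50, 50)], merge_radius=1.0)
print(first, second, unseen, len(group))
```

In this toy run the third window scores zero because its centroid resembles nothing seen before, which is the kind of signal a fault detector of this style would flag.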
A novel polyketide biosynthesis gene cluster is involved in fruiting body morphogenesis in the filamentous fungi Sordaria macrospora and Neurospora crassa.

PubMed

Nowrousian, Minou

2009-04-01

During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.
AMOEBA clustering revisited. [cluster analysis, classification, and image display program]

NASA Technical Reports Server (NTRS)

Bryant, Jack

1990-01-01

A description of the clustering, classification, and image display program AMOEBA is presented. Using a difficult high resolution aircraft-acquired MSS image, the steps the program takes in forming clusters are traced. A number of new features are described here for the first time. Usage of the program is discussed. The theoretical foundation (the underlying mathematical model) is briefly presented. The program can handle images of any size and dimensionality.
Visual verification and analysis of cluster detection for molecular dynamics.

PubMed

Grottel, Sebastian; Reina, Guido; Vrabec, Jadran; Ertl, Thomas

2007-01-01

A current research topic in molecular thermodynamics is the condensation of vapor to liquid and the investigation of this process at the molecular level. Condensation is found in many physical phenomena, e.g. the formation of atmospheric clouds or the processes inside steam turbines, where a detailed knowledge of the dynamics of condensation processes will help to optimize energy efficiency and avoid problems with droplets of macroscopic size. The key properties of these processes are the nucleation rate and the critical cluster size. For the calculation of these properties it is essential to make use of a meaningful definition of molecular clusters, which is currently not a completely resolved issue. In this paper a framework capable of interactively visualizing molecular datasets of such nucleation simulations is presented, with an emphasis on the detected molecular clusters. To check the quality of the results of the cluster detection, our framework introduces the concept of flow groups to highlight potential cluster evolution over time which is not detected by the employed algorithm. To confirm the findings of the visual analysis, we coupled the rendering view with a schematic view of the clusters' evolution. This allows one to rapidly assess the quality of the molecular cluster detection algorithm and to identify locations in the simulation data, in space as well as in time, where the cluster detection fails. Thus, thermodynamics researchers can eliminate weaknesses in their cluster detection algorithms. Several examples for the effective and efficient usage of our tool are presented.
Academic Performance and Lifestyle Behaviors in Australian School Children: A Cluster Analysis.

PubMed

Dumuid, Dorothea; Olds, Timothy; Martín-Fernández, Josep-Antoni; Lewis, Lucy K; Cassidy, Leah; Maher, Carol

2017-12-01

Poor academic performance has been linked with particular lifestyle behaviors, such as unhealthy diet, short sleep duration, high screen time, and low physical activity. However, little is known about how lifestyle behavior patterns (or combinations of behaviors) contribute to children's academic performance. We aimed to compare academic performance across clusters of children with common lifestyle behavior patterns. We clustered participants (Australian children aged 9-11 years, n = 284) into four mutually exclusive groups of distinct lifestyle behavior patterns, using the following lifestyle behaviors as cluster inputs: light, moderate, and vigorous physical activity; sedentary behavior and sleep, derived from 24-hour accelerometry; self-reported screen time and diet. Differences in academic performance (measured by a nationally administered standardized test) were detected across the clusters, with scores being lowest in the Junk Food Screenies cluster (unhealthy diet/high screen time) and highest in the Sitters cluster (high nonscreen sedentary behavior/low physical activity). These findings suggest that reduction in screen time and an improved diet may contribute positively to academic performance. While children with high nonscreen sedentary time performed better academically in this study, they also accumulated low levels of physical activity. This warrants further investigation, given the known physical and mental benefits of physical activity.
The Technical and Biological Reproducibility of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Based Typing: Employment of Bioinformatics in a Multicenter Study

PubMed Central

Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P.; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian

2016-01-01

Background The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Material/Methods Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Results Technical and biological reproducibility ranged between 96.8–99.4% and 47.6–94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Conclusions Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable. PMID:27798637
An investigation about the structures, thermodynamics and kinetics of the formic acid involved molecular clusters

NASA Astrophysics Data System (ADS)

Zhang, Rui; Jiang, Shuai; Liu, Yi-Rong; Wen, Hui; Feng, Ya-Juan; Huang, Teng; Huang, Wei

2018-05-01

Despite the very important role of atmospheric aerosol nucleation in climate change and air quality, the detailed aerosol nucleation mechanism is still unclear. Here we investigated the formic acid (FA) involved multicomponent nucleation molecular clusters including sulfuric acid (SA), dimethylamine (DMA) and water (W) through a quantum chemical method. The thermodynamics and kinetics analysis was based on the global minima given by the Basin-Hopping (BH) algorithm coupled with Density Functional Theory (DFT) and subsequent benchmarked calculations. Then the interaction analysis based on ElectroStatic Potential (ESP), Topological and Atomic Charges analysis was made to characterize the binding features of the clusters. The results show that FA binds weakly with the other molecules in the cluster while W binds more weakly. Further kinetic analysis of the time evolution of the clusters shows that, despite formic acid's weak interaction with other nucleation precursors, its effect on the sulfuric acid dimer steady-state concentration cannot be neglected due to its high concentration in the atmosphere.
The Gap Procedure: for the identification of phylogenetic clusters in HIV-1 sequence data.

PubMed

Vrbik, Irene; Stephens, David A; Roger, Michel; Brenner, Bluma G

2015-11-04

In the context of infectious disease, sequence clustering can be used to provide important insights into the dynamics of transmission. Cluster analysis is usually performed using a phylogenetic approach whereby clusters are assigned on the basis of sufficiently small genetic distances and high bootstrap support (or posterior probabilities). The computational burden involved in this phylogenetic threshold approach is a major drawback, especially when a large number of sequences are being considered. In addition, this method requires a skilled user to specify the appropriate threshold values which may vary widely depending on the application. This paper presents the Gap Procedure, a distance-based clustering algorithm for the classification of DNA sequences sampled from individuals infected with the human immunodeficiency virus type 1 (HIV-1). Our heuristic algorithm bypasses the need for phylogenetic reconstruction, thereby supporting the quick analysis of large genetic data sets. Moreover, this fully automated procedure relies on data-driven gaps in sorted pairwise distances to infer clusters, thus no user-specified threshold values are required. The clustering results obtained by the Gap Procedure on both real and simulated data, closely agree with those found using the threshold approach, while only requiring a fraction of the time to complete the analysis. Apart from the dramatic gains in computational time, the Gap Procedure is highly effective in finding distinct groups of genetically similar sequences and obviates the need for subjective user-specified values. The clusters of genetically similar sequences returned by this procedure can be used to detect patterns in HIV-1 transmission and thereby aid in the prevention, treatment and containment of the disease.
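The idea of letting a gap in the sorted pairwise distances set the cutoff, instead of a user-specified threshold, can be sketched as below. This is only an illustration of that idea under simplifying assumptions (one global cutoff at the single largest gap, clusters taken as connected components); it should not be read as the published Gap Procedure, and the function names and toy distances are hypothetical.

```python
def gap_threshold(distances):
    """Distance just below the largest jump in the sorted distances.
    Assumes at least two distances are supplied."""
    ordered = sorted(distances)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, i = max(gaps)
    return ordered[i]

def gap_clusters(dist_matrix):
    """Link every pair at or below the data-driven cutoff and return the
    connected components as sorted lists of indices."""
    n = len(dist_matrix)
    pairs = [(dist_matrix[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    cutoff = gap_threshold([d for d, _, _ in pairs])
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d, i, j in pairs:
        if d <= cutoff:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy stand-in for genetic distances: positions on a line.
positions = [0.0, 0.01, 0.02, 0.5, 0.51]
matrix = [[abs(a - b) for b in positions] for a in positions]
print(gap_clusters(matrix))
```

Because only a sort and a pass over the pairs are needed, no tree reconstruction or bootstrap is involved, which is where the time savings described in the abstract come from.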
Analyzing gene expression time-courses based on multi-resolution shape mixture model.

PubMed

Li, Ying; He, Ye; Zhang, Yu

2016-11-01

Biological processes are dynamic molecular processes that unfold over time. Time-course gene expression experiments provide opportunities to explore patterns of gene expression change over time and to understand the dynamic behavior of gene expression, which is crucial for studying the development and progression of biology and disease. Analysis of gene expression time-course profiles has not been fully exploited so far and remains a challenging problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore significant gene groups. Based on multi-resolution fractal features and a mixture clustering model, we propose a multi-resolution shape mixture model algorithm. Multi-resolution fractal features are computed by wavelet decomposition, which explores patterns of change over time in gene expression at different resolutions. Our proposed multi-resolution shape mixture model algorithm is a probabilistic framework that offers a more natural and robust way of clustering time-course gene expression. We assessed the performance of our proposed algorithm using yeast time-course gene expression profiles, compared with several popular clustering methods for gene expression profiles. The grouped genes identified by different methods were evaluated by enrichment analysis of biological pathways and known protein-protein interactions from experimental evidence. The grouped genes identified by our proposed algorithm have stronger biological significance. A novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed. Our proposed model provides a new perspective and an alternative tool for visualization and analysis of time-course gene expression profiles. The R and Matlab programs are available upon request. Copyright © 2016 Elsevier Inc. All rights reserved.
Who attends a Children's Hospital Emergency Department for dental reasons? A two-step cluster analysis approach.

PubMed

Marshman, Z; Broomhead, T; Rodd, H D; Jones, K; Burke, D; Baker, S R

2016-09-28

Emergency departments (EDs) have been identified as key providers of dental care although few studies have examined patterns of attendance or clusters of characteristics. The aim was to identify the reasons for visits to an ED, whether these remained stable over time, and characterize clusters of patients by socio-demographic and attendance variables. Pseudonymized data were obtained for children who attended the ED in 2003-2004, 2004-2005 and 2012-2013. Presenting complaint was categorized as attending for dental or nondental reasons. Other variables analysed included patient (age, sex, ethnicity and deprivation) and attendance characteristics (distance travelled, season, nature of complaint, time elapsed since onset of symptoms, day of week and hours of attendance), together with treatment outcome (advice, antibiotics and referral). To assess trends over time, analyses were conducted on patient, attendance and treatment outcome variables. To examine whether patients could be characterized by socio-demographic and attendance variables, a two-step cluster analysis was undertaken on 2003-2004 data set and validated on 2004-2005 and 2012-2013 data sets. In 2003-2004, 550 children attended the ED for dental reasons rising to 687 in 2012-2013. The most important predictors of dental attendance were as follows: nature of complaint, ethnicity, time elapsed, sex and deprivation of the area in which children lived. The analysis showed two clusters: cluster 1 was comprised of children who attended the ED for dental injury, were of White ethnicity and attended within 24 h of onset of symptoms. Children in this cluster were likely to be from the least or less deprived areas (compared to Cluster 2) and were more likely to be males. Cluster 2 comprised of children attending the ED for caries, oral mucosal lesions or other complaints, were likely to be of other (non-White) ethnicities and were likely to attend more than 24 h after symptoms began. 
Children in this cluster were more likely to come from the most deprived areas and were both males and females. The clusters varied according to treatment outcome; those patients in Cluster 2 were more likely to be prescribed medication, whilst those children in Cluster 1 were more likely to be referred to another specialty. A significant number of visits to the ED were for dental reasons with two clusters of children. The results have identified groups of patients for whom appropriate dental provision is lacking and where targeted services are needed to improve outcomes for children and reduce the burden on EDs. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Cluster randomised trials in the medical literature: two bibliometric surveys

PubMed Central

Bland, J Martin

2004-01-01

Background Several reviews of published cluster randomised trials have reported that about half did not take clustering into account in the analysis, which was thus incorrect and potentially misleading. In this paper I ask whether cluster randomised trials are increasing in both number and quality of reporting. Methods Computer search for papers on cluster randomised trials since 1980, hand search of trial reports published in selected volumes of the British Medical Journal over 20 years. Results There has been a large increase in the numbers of methodological papers and of trial reports using the term 'cluster random' in recent years, with about equal numbers of each type of paper. The British Medical Journal contained more such reports than any other journal. In this journal there was a corresponding increase over time in the number of trials where subjects were randomised in clusters. In 2003 all reports showed awareness of the need to allow for clustering in the analysis. In 1993 and before clustering was ignored in most such trials. Conclusion Cluster trials are becoming more frequent and reporting is of higher quality. Perhaps statistician pressure works. PMID:15310402
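As a rough illustration of why ignoring clustering can mislead, the sketch below uses the standard design-effect formula for equal-sized clusters, 1 + (m - 1) × ICC. The formula is not quoted in the abstract, and the trial dimensions (20 clusters of 20 subjects, intracluster correlation of 0.05) are hypothetical values chosen only for illustration.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_subjects: int, cluster_size: float, icc: float) -> float:
    """Number of independent subjects carrying the same information."""
    return n_subjects / design_effect(cluster_size, icc)


# Hypothetical trial (illustrative values, not from the paper):
# 20 clusters of 20 subjects each, intracluster correlation 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(400, 20, 0.05)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.0f}")
```

Under these assumed values the 400 randomised subjects are worth roughly 205 independent ones, so an analysis that treats them as 400 independent observations would report confidence intervals that are too narrow.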
The dynamics of cyclone clustering in re-analysis and a high-resolution climate model

NASA Astrophysics Data System (ADS)

Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len

2017-04-01

Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.
An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.

PubMed

Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei

2013-05-01

Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.

National Differences in Regional Emergency Department Boarding Times: Are US Emergency Departments Prepared for a Public Health Emergency?

PubMed

Love, Jennifer S; Karp, David; Delgado, M Kit; Margolis, Gregg; Wiebe, Douglas J; Carr, Brendan G

2016-08-01

Boarding admitted patients decreases emergency department (ED) capacity to accommodate daily patient surge. Boarding in regional hospitals may decrease the ability to meet community needs during a public health emergency. This study examined differences in regional patient boarding times across the United States and in regions at risk for public health emergencies. A retrospective cross-sectional analysis was performed by using 2012 ED visit data from the American Hospital Association (AHA) database and 2012 hospital ED boarding data from the Centers for Medicare and Medicaid Services Hospital Compare database. Hospitals were grouped into hospital referral regions (HRRs). The primary outcome was mean ED boarding time per HRR. Spatial hot spot analysis examined boarding time spatial clustering. A total of 3317 of 4671 (71%) hospitals were included in the study cohort. A total of 45 high-boarding-time HRRs clustered along the East/West coasts and 67 low-boarding-time HRRs clustered in the Midwest/Northern Plains regions. A total of 86% of HRRs at risk for a terrorist event had high boarding times and 36% of HRRs with frequent natural disasters had high boarding times. Urban, coastal areas have the longest boarding times and are clustered with other high-boarding-time HRRs. Longer boarding times suggest a heightened level of vulnerability and a need to enhance surge capacity because these regions have difficulty meeting daily emergency care demands and are at increased risk for disasters. (Disaster Med Public Health Preparedness. 2016;10:576-582).
A Bayesian cluster analysis method for single-molecule localization microscopy data.

PubMed

Griffié, Juliette; Shannon, Michael; Bromley, Claire L; Boelen, Lies; Burn, Garth L; Williamson, David J; Heard, Nicholas A; Cope, Andrew P; Owen, Dylan M; Rubin-Delanchy, Patrick

2016-12-01

Cell function is regulated by the spatiotemporal organization of the signaling machinery, and a key facet of this is molecular clustering. Here, we present a protocol for the analysis of clustering in data generated by 2D single-molecule localization microscopy (SMLM), for example photoactivated localization microscopy (PALM) or stochastic optical reconstruction microscopy (STORM). Three features of such data can cause standard cluster analysis approaches to be ineffective: (i) the data take the form of a list of points rather than a pixel array; (ii) there is a non-negligible unclustered background density of points that must be accounted for; and (iii) each localization has an associated uncertainty in regard to its position. These issues are overcome using a Bayesian, model-based approach. Many possible cluster configurations are proposed and scored against a generative model, which assumes Gaussian clusters overlaid on a completely spatially random (CSR) background, before every point is scrambled by its localization precision. We present the process of generating simulated and experimental data that are suitable to our algorithm, the analysis itself, and the extraction and interpretation of key cluster descriptors such as the number of clusters, cluster radii and the number of localizations per cluster. Variations in these descriptors can be interpreted as arising from changes in the organization of the cellular nanoarchitecture. The protocol requires no specific programming ability, and the processing time for one data set, typically containing 30 regions of interest, is ∼18 h; user input takes ∼1 h.
Fuzzy cluster analysis of high-field functional MRI data.

PubMed

Windischberger, Christian; Barth, Markus; Lamm, Claus; Schroeder, Lee; Bauer, Herbert; Gur, Ruben C; Moser, Ewald

2003-11-01

Functional magnetic resonance imaging (fMRI) based on blood-oxygen level dependent (BOLD) contrast is today an established brain research method and is quickly gaining acceptance for complementary clinical diagnosis. However, the basic mechanisms, such as the coupling between neuronal activation and haemodynamic response, are not known exactly, nor can the various artifacts be predicted or controlled. Thus, modeling functional signal changes is non-trivial and exploratory data analysis (EDA) may be rather useful. In particular, identification and separation of artifacts as well as quantification of expected, i.e. stimulus correlated, and novel information on brain activity is important both for new insights in neuroscience and for future developments in functional MRI of the human brain. After an introduction on fuzzy clustering and very high-field fMRI we present several examples where fuzzy cluster analysis (FCA) of fMRI time series helps to identify and locally separate various artifacts. We also present and discuss applications and limitations of fuzzy cluster analysis in very high-field functional MRI: differentiating temporal patterns in MRI using (a) a test object with static and dynamic parts, (b) artifacts due to gross head motion. Using a synthetic fMRI data set we quantitatively examine the influences of relevant FCA parameters on clustering results in terms of receiver-operator characteristics (ROC) and compare them with a commonly used model-based correlation analysis (CA) approach. The application of FCA in analyzing in vivo fMRI data is shown for (a) a motor paradigm, (b) data from multi-echo imaging, and (c) an fMRI study using mental rotation of three-dimensional cubes. We found that differentiation of true "neural" from false "vascular" activation is possible based on echo time dependence and specific activation levels, as well as based on their signal time-course. 
Exploratory data analysis methods in general and fuzzy cluster analysis in particular may help to identify artifacts and add novel and unexpected information valuable for interpretation, classification and characterization of functional MRI data which can be used to design new data acquisition schemes, stimulus presentations, neuro(physio)logical paradigms, as well as to improve quantitative biophysical models.
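To make the idea of fuzzy (rather than hard) cluster assignment concrete, here is a minimal sketch of the membership step of the standard fuzzy c-means algorithm applied to a single time course. It is not the authors' implementation: the fuzziness exponent `m = 2`, the Euclidean distance, and the two example centroids are assumptions made for illustration.

```python
import math


def fuzzy_memberships(series, centroids, m=2.0):
    """Fuzzy c-means membership degrees of one time series to each centroid.

    Uses the standard update u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)),
    with Euclidean distance between time courses.
    """
    dists = [math.dist(series, c) for c in centroids]
    # A series that coincides with a centroid belongs fully to it.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


# Hypothetical centroids: a flat "baseline" course and an on/off "activation" course.
centroids = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
voxel = [0.0, 0.9, 0.8, 0.1]
print(fuzzy_memberships(voxel, centroids))
```

Each voxel receives a degree of membership in every cluster, and the degrees sum to one. A full fuzzy c-means run would alternate this step with a membership-weighted centroid update until the memberships stop changing.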
Identification and characterization of earthquake clusters: a comparative analysis for selected sequences in Italy

NASA Astrophysics Data System (ADS)

Peresan, Antonella; Gentili, Stefania

2017-04-01

Identification and statistical characterization of seismic clusters may provide useful insights about the features of seismic energy release and their relation to physical properties of the crust within a given region. Moreover, a number of studies based on spatio-temporal analysis of main-shocks occurrence require preliminary declustering of the earthquake catalogs. Since various methods, relying on different physical/statistical assumptions, may lead to diverse classifications of earthquakes into main events and related events, we aim to investigate the classification differences among different declustering techniques. Accordingly, a formal selection and comparative analysis of earthquake clusters is carried out for the most relevant earthquakes in North-Eastern Italy, as reported in the local OGS-CRS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. The comparison is then extended to selected earthquake sequences associated with a different seismotectonic setting, namely to events that occurred in the region struck by the recent Central Italy destructive earthquakes, making use of INGV data. Various techniques, ranging from classical space-time windows methods to ad hoc manual identification of aftershocks, are applied for detection of earthquake clusters. In particular, a statistical method based on nearest-neighbor distances of events in space-time-energy domain, is considered. Results from clusters identification by the nearest-neighbor method turn out quite robust with respect to the time span of the input catalogue, as well as to minimum magnitude cutoff. The identified clusters for the largest events reported in North-Eastern Italy since 1977 are well consistent with those reported in earlier studies, which were aimed at detailed manual aftershocks identification. 
The study shows that the data-driven approach, based on the nearest-neighbor distances, can be satisfactorily applied to decompose the seismic catalog into background seismicity and individual sequences of earthquake clusters, also in areas characterized by moderate seismic activity, where the standard declustering techniques may turn out to be rather gross approximations. With these results acquired, the main statistical features of seismic clusters are explored, including the complex interdependence of related events, with the aim of characterizing the space-time patterns of earthquake occurrence in North-Eastern Italy and capturing their basic differences from the Central Italy sequences.
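The nearest-neighbor idea can be sketched as follows. The abstract does not give the distance definition, so this sketch assumes a commonly used space-time-magnitude proximity, eta = dt × r^d_f × 10^(-b × m_parent), with illustrative parameter values (`b = 1.0`, `d_f = 1.6`), flat x/y coordinates in km, and a made-up three-event catalogue. None of these come from the study itself.

```python
import math


def nearest_neighbor_parents(events, b=1.0, d_f=1.6):
    """For each event, find the earlier event minimising the proximity
    eta = dt * r**d_f * 10**(-b * m_parent)  (assumed formulation).

    events: list of (time_days, x_km, y_km, magnitude), in any order.
    Returns a list of (parent_index or None, eta) aligned with `events`.
    """
    result = []
    for j, (tj, xj, yj, _mj) in enumerate(events):
        best_parent, best_eta = None, math.inf
        for i, (ti, xi, yi, mi) in enumerate(events):
            dt = tj - ti
            if dt <= 0:  # only earlier events can be parents
                continue
            r = max(math.hypot(xj - xi, yj - yi), 1e-3)  # avoid r = 0
            eta = dt * r ** d_f * 10 ** (-b * mi)
            if eta < best_eta:
                best_parent, best_eta = i, eta
        result.append((best_parent, best_eta))
    return result


# Hypothetical catalogue: an M5.5 mainshock, a nearby aftershock,
# and a distant event 200 days later.
catalogue = [
    (0.0, 0.0, 0.0, 5.5),
    (0.5, 1.0, 1.0, 3.0),
    (200.0, 150.0, 80.0, 3.2),
]
for idx, (parent, eta) in enumerate(nearest_neighbor_parents(catalogue)):
    print(idx, parent, eta)
```

In this toy catalogue the aftershock has a proximity several orders of magnitude smaller than the distant event, which is the kind of separation a threshold on eta would exploit to split clustered events from background seismicity. How that threshold is chosen is outside the scope of this sketch.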
Software system for data management and distributed processing of multichannel biomedical signals.

PubMed

Franaszczuk, P J; Jouny, C C

2004-01-01

The presented software is designed for efficient utilization of a cluster of PC computers for signal analysis of multichannel physiological data. The system consists of three main components: 1) a library of input and output procedures, 2) a database storing additional information about location in a storage system, 3) a user interface for selecting data for analysis, choosing programs for analysis, and distributing computing and output data on cluster nodes. The system allows for processing multichannel time series data in multiple binary formats. The descriptions of data format, channels and time of recording are included in separate text files. Definition and selection of multiple channel montages is possible. Epochs for analysis can be selected both manually and automatically. Implementation of new signal processing procedures is possible with minimal programming overhead for the input/output processing and user interface. The number of nodes in the cluster used for computations and the amount of storage can be changed with no major modification to the software. Current implementations include the time-frequency analysis of multiday, multichannel recordings of intracranial EEG of epileptic patients as well as evoked response analyses of repeated cognitive tasks.
Dengue Fever Occurrence and Vector Detection by Larval Survey, Ovitrap and MosquiTRAP: A Space-Time Clusters Analysis

PubMed Central

de Melo, Diogo Portella Ornelas; Scherrer, Luciano Rios; Eiras, Álvaro Eduardo

2012-01-01

The use of vector surveillance tools for preventing dengue disease requires fine assessment of risk, in order to improve vector control activities. Nevertheless, the thresholds between vector detection and dengue fever occurrence are currently not well established. In Belo Horizonte (Minas Gerais, Brazil), dengue has been endemic for several years. From January 2007 to June 2008, the dengue vector Aedes (Stegomyia) aegypti was monitored by ovitrap, the sticky-trap MosquiTRAP™ and larval surveys in a study area in Belo Horizonte. Using a space-time scan for cluster detection implemented in SaTScan software, the vector presence recorded by the different monitoring methods was evaluated. Clusters of vectors and dengue fever were detected. It was verified that ovitrap and MosquiTRAP vector detection methods predicted dengue occurrence better than larval survey, both spatially and temporally. MosquiTRAP and ovitrap presented similar results of space-time intersections to dengue fever clusters. Nevertheless, ovitrap clusters presented longer duration periods than MosquiTRAP ones, less accurately signaling the dengue risk areas, since the detection of vector clusters during most of the study period was not necessarily correlated to dengue fever occurrence. It was verified that ovitrap clusters occurred more than 200 days (values ranged from 97.0±35.35 to 283.0±168.4 days) before dengue fever clusters, whereas MosquiTRAP clusters preceded dengue fever clusters by approximately 80 days (values ranged from 65.5±58.7 to 94.0±14.3 days), the latter proving to be more temporally precise. Thus, in the present cluster analysis study MosquiTRAP presented superior results for signaling dengue transmission risks both geographically and temporally. Since early detection is crucial for planning and deploying effective preventions, MosquiTRAP proved to be a reliable tool and this method provides groundwork for the development of even more precise tools. PMID:22848729
Representation of Tinnitus in the US Newspaper Media and in Facebook Pages: Cross-Sectional Analysis of Secondary Data.

PubMed

Manchaiah, Vinaya; Ratinaud, Pierre; Andersson, Gerhard

2018-05-08

When people with health conditions begin to manage their health issues, one important issue that emerges is the question as to what exactly do they do with the information that they have obtained through various sources (eg, news media, social media, health professionals, friends, and family). The information they gather helps form their opinions and, to some degree, influences their attitudes toward managing their condition. This study aimed to understand how tinnitus is represented in the US newspaper media and in Facebook pages (ie, social media) using text pattern analysis. This was a cross-sectional study based upon secondary analyses of publicly available data. The 2 datasets (ie, text corpuses) analyzed in this study were generated from US newspaper media during 1980-2017 (downloaded from the database US Major Dailies by ProQuest) and Facebook pages during 2010-2016. The text corpuses were analyzed using the Iramuteq software using cluster analysis and chi-square tests. The newspaper dataset had 432 articles. The cluster analysis resulted in 5 clusters, which were named as follows: (1) brain stimulation (26.2%), (2) symptoms (13.5%), (3) coping (19.8%), (4) social support (24.2%), and (5) treatment innovation (16.4%). A time series analysis of clusters indicated a change in the pattern of information presented in newspaper media during 1980-2017 (eg, more emphasis on cluster 5, focusing on treatment inventions). The Facebook dataset had 1569 texts. The cluster analysis resulted in 7 clusters, which were named as: (1) diagnosis (21.9%), (2) cause (4.1%), (3) research and development (13.6%), (4) social support (18.8%), (5) challenges (11.1%), (6) symptoms (21.4%), and (7) coping (9.2%). A time series analysis of clusters indicated no change in information presented in Facebook pages on tinnitus during 2011-2016. 
The study highlights the specific aspects about tinnitus that the US newspaper media and Facebook pages focus on, as well as how these aspects change over time. These findings can help health care providers better understand the presuppositions that tinnitus patients may have. More importantly, the findings can help public health experts and health communication experts in tailoring health information about tinnitus to promote self-management, as well as assisting in appropriate choices of treatment for those living with tinnitus. ©Vinaya Manchaiah, Pierre Ratinaud, Gerhard Andersson. Originally published in the Interactive Journal of Medical Research (http://www.i-jmr.org/), 08.05.2018.
Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis

DTIC Science & Technology

2015-01-01

...algorithms we proposed improve the time efficiency significantly for large scale datasets. In the last chapter, we also propose an incremental reseeding...plume detection in hyper-spectral video data. These graph based clustering algorithms we proposed improve the time efficiency significantly for large
Visual cluster analysis and pattern recognition template and methods

DOEpatents

Osbourn, G.C.; Martinez, R.F.

1999-05-04

A method of clustering using a novel template to define a region of influence is disclosed. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques. 30 figs.
A Real-Time PCR with Melting Curve Analysis for Molecular Typing of Vibrio parahaemolyticus.

PubMed

He, Peiyan; Wang, Henghui; Luo, Jianyong; Yan, Yong; Chen, Zhongwen

2018-05-23

Foodborne disease caused by Vibrio parahaemolyticus is a serious public health problem in many countries. Molecular typing has great scientific significance and application value for epidemiological research on V. parahaemolyticus. In this study, a real-time PCR with melting curve analysis was established for molecular typing of V. parahaemolyticus. Eighteen large variably presented gene clusters (LVPCs) of V. parahaemolyticus, which have different distributions in the genomes of different strains, were selected as targets. Primer pairs for the 18 LVPCs were distributed into three tubes. To validate this newly developed assay, we tested 53 V. parahaemolyticus strains, which were classified into 13 different types. Furthermore, cluster analysis using NTSYS PC 2.02 software could divide the 53 V. parahaemolyticus strains into six clusters at a relative similarity coefficient of 0.85. This method is fast, simple, and convenient for molecular typing of V. parahaemolyticus.
Lifestyle Patterns and Weight Status in Spanish Adults: The ANIBES Study.

PubMed

Pérez-Rodrigo, Carmen; Gianzo-Citores, Marta; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier

2017-06-14

Limited knowledge is available on lifestyle patterns in Spanish adults. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, sleep time, and smoking in Spanish adults aged 18-64 years and their association with obesity. Analysis was based on a subsample (n = 1617) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, sleep time, and smoking. Logistic regression analysis was used to explore the association between the cluster solutions and obesity. Factor analysis identified four dietary patterns: "Traditional DP", "Mediterranean DP", "Snack DP" and "Dairy-sweet DP". Dietary patterns, physical activity behaviors, sedentary behaviors, sleep time, and smoking in Spanish adults aggregated into three different clusters of lifestyle patterns: "Mixed diet-physically active-low sedentary lifestyle pattern", "Not poor diet-low physical activity-low sedentary lifestyle pattern" and "Poor diet-low physical activity-sedentary lifestyle pattern". A higher proportion of people aged 18-30 years was classified into the "Poor diet-low physical activity-sedentary lifestyle pattern". The prevalence odds ratio for obesity in men in the "Mixed diet-physically active-low sedentary lifestyle pattern" was significantly lower compared to those in the "Poor diet-low physical activity-sedentary lifestyle pattern". Those behavior patterns are helpful to identify specific issues in population subgroups and inform intervention strategies. The findings in this study underline the importance of designing and implementing interventions that address multiple health risk practices, considering lifestyle patterns and associated determinants.
Analysis of the convective evaporation of nondilute clusters of drops

NASA Technical Reports Server (NTRS)

Bellan, J.; Harstad, K.

1987-01-01

The penetration distance of an outer flow into a drop cluster volume is the critical, evaporation mode-controlling parameter in the present model for nondilute drop clusters' convective evaporation. The model is found to perform well for such low penetration distances as those obtained for dense clusters in hot environments and low relative velocities between the outer gases and the cluster. For large penetration distances, however, the predictive power of the model deteriorates; in addition, the evaporation time is found to be a weak function of the initial relative velocity and a strong function of the initial drop temperature. The results generally show that the interior drop temperature was transient throughout the drop lifetime, although temperature nonuniformities persisted up to the first third of the total evaporation time at most.
Whole-Volume Clustering of Time Series Data from Zebrafish Brain Calcium Images via Mixture Modeling.

PubMed

Nguyen, Hien D; Ullmann, Jeremy F P; McLachlan, Geoffrey J; Voleti, Venkatakaushik; Li, Wenze; Hillman, Elizabeth M C; Reutens, David C; Janke, Andrew L

2018-02-01

Calcium is a ubiquitous messenger in neural signaling events. An increasing number of techniques are enabling visualization of neurological activity in animal models via luminescent proteins that bind to calcium ions. These techniques generate large volumes of spatially correlated time series. A model-based functional data analysis methodology via Gaussian mixtures is proposed for the clustering of data from such visualizations. The methodology is theoretically justified and a computationally efficient approach to estimation is suggested. An example analysis of a zebrafish imaging experiment is presented.
Time-dependent risks of cancer clustering among couples: a nationwide population-based cohort study in Taiwan.

PubMed

Wang, Jong-Yi; Liang, Yia-Wen; Yeh, Chun-Chen; Liu, Chiu-Shong; Wang, Chen-Yu

2018-02-21

Spousal clustering of cancer warrants attention. Whether the common environment or high-age vulnerability determines cancer clustering is unclear. The risk of clustering in couples versus non-couples is undetermined. The time to cancer clustering after the first cancer diagnosis is yet to be reported. This study investigated cancer clustering over time among couples by using nationwide data. A cohort of 5643 married couples in the 2002-2013 Taiwan National Health Insurance Research Database was identified and randomly matched with 5643 non-couple pairs through dual propensity score matching. Factors associated with clustering (both spouses with tumours) were analysed by using the Cox proportional hazard model. Propensity-matched analysis revealed that the risk of clustering of all tumours among couples (13.70%) was significantly higher than that among non-couples (11.84%) (OR=1.182, 95% CI 1.058 to 1.321, P=0.0031). The median time to clustering of all tumours and of malignant tumours was 2.92 and 2.32 years, respectively. Risk characteristics associated with clustering included high age and comorbidity. Shared environmental factors among spouses might be linked to a high incidence of cancer clustering. Cancer incidence in one spouse may signal cancer vulnerability in the other spouse. Promoting family-oriented cancer care in vulnerable families and preventing shared lifestyle risk factors for cancer are suggested. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
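The reported odds ratio can be checked against the two proportions quoted in the abstract. The sketch below computes the crude odds ratio from 13.70% and 11.84%; it assumes nothing beyond those two figures and does not reproduce the propensity-matched model or the confidence interval.

```python
def odds(p: float) -> float:
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Ratio of the odds in two groups given as proportions."""
    return odds(p_group) / odds(p_reference)


# Proportions quoted in the abstract: 13.70% of couples, 11.84% of non-couples.
or_couples = odds_ratio(0.1370, 0.1184)
print(round(or_couples, 3))  # → 1.182
```

The crude value agrees with the OR of 1.182 given in the abstract to three decimal places.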
Research on retailer data clustering algorithm based on Spark

NASA Astrophysics Data System (ADS)

Huang, Qiuman; Zhou, Feng

2017-03-01

Big data analysis is currently a prominent topic in the IT field. Spark is a high-reliability, high-performance distributed parallel computing framework for big data sets. The k-means algorithm is one of the classical partitioning methods among clustering algorithms. In this paper, we study the k-means clustering algorithm on Spark. First, the principle of the algorithm is analyzed, and then a clustering analysis is carried out on supermarket customers in an experiment to identify different shopping patterns. The paper also proposes a parallelization of the k-means algorithm on Spark's distributed computing framework and gives a concrete design and implementation scheme. Two years of sales data from a supermarket are used to validate the proposed clustering algorithm and to segment customers; the clustering results are then analyzed to help enterprises adopt different marketing strategies for different customer groups and improve sales performance.
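The k-means iteration lends itself to parallelization because it splits into a per-point assignment step and a per-cluster averaging step. The single-machine sketch below illustrates that split in plain Python. It is not Spark code and not the authors' design: the function names, the deterministic initialisation from the first k points, and the customer features are all assumptions made for illustration.

```python
import math


def assign(point, centroids):
    """Map-like step: index of the nearest centroid for one point."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def update(points, labels, centroids):
    """Reduce-like step: average the points assigned to each centroid."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:  # keep an empty cluster's centroid unchanged
            new_centroids.append(old)
            continue
        dims = len(old)
        new_centroids.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(dims))
        )
    return new_centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm, using the first k points as initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        labels = [assign(p, centroids) for p in points]
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical customers: (visits per month, average basket value).
customers = [(2, 15.0), (20, 12.0), (3, 18.0), (22, 10.0), (1, 20.0), (18, 14.0)]
centroids, labels = kmeans(customers, k=2)
print(centroids, labels)
```

In a distributed setting, the assignment step can run independently on each data partition, and the averaging step only needs per-cluster sums and counts, which is why the algorithm maps naturally onto a framework such as Spark. Real features would normally be scaled before clustering so that no single variable dominates the distance.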
Clustering of financial time series with application to index and enhanced index tracking portfolio

NASA Astrophysics Data System (ADS)

Dose, Christian; Cincotti, Silvano

2005-09-01

A stochastic-optimization technique based on time series cluster analysis is described for index tracking and enhanced index tracking problems. Our methodology solves the problem in two steps, i.e., by first selecting a subset of stocks and then setting the weight of each stock as a result of an optimization process (asset allocation). Present formulation takes into account constraints on the number of stocks and on the fraction of capital invested in each of them, whilst not including transaction costs. Computational results based on clustering selection are compared to those of random techniques and show the importance of clustering in noise reduction and robust forecasting applications, in particular for enhanced index tracking.
Mixed Pattern Matching-Based Traffic Abnormal Behavior Recognition

PubMed Central

Cui, Zhiming; Zhao, Pengpeng

2014-01-01

A motion trajectory is an intuitive representation in the time-space domain of the micromotion behavior of a moving target. Trajectory analysis is an important approach to recognizing abnormal behaviors of moving targets. To address the complexity of vehicle trajectories, this paper first proposes a trajectory pattern learning method based on dynamic time warping (DTW) and spectral clustering. It introduces the DTW distance to measure the distances between vehicle trajectories and determines the number of clusters automatically by a spectral clustering algorithm based on the distance matrix. Then, it groups the sample data points into different clusters. After the spatial patterns and direction patterns are learned from the clusters, a recognition method for detecting abnormal vehicle behaviors based on mixed pattern matching is proposed. The experimental results show that the proposed scheme can recognize the main types of abnormal traffic behaviors effectively and has good robustness. A real-world application verified its feasibility and validity. PMID:24605045
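The DTW distance mentioned above can be sketched with the textbook dynamic-programming recurrence. This is a minimal illustration, not the paper's implementation: the Euclidean local cost, the absence of a warping window, and the three example trajectories are assumptions.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 2-D trajectories.

    a, b: sequences of (x, y) points, possibly of different lengths.
    The local cost is the Euclidean distance between matched points.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def distance_matrix(trajectories):
    """Pairwise DTW distances between all trajectories."""
    return [[dtw_distance(p, q) for q in trajectories] for p in trajectories]


# Hypothetical trajectories: two straight paths at different speeds, one turning path.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
straight_slow = [(0, 0), (1, 0), (1, 0), (2, 0), (3, 0)]
turning = [(0, 0), (1, 0), (1, 1), (1, 2)]
print(distance_matrix([straight, straight_slow, turning]))
```

The two straight paths have a DTW distance of zero even though they differ in length, because warping absorbs the repeated point. The turning path stays clearly separated. A spectral clustering step would typically convert such a distance matrix into similarities before partitioning the trajectories.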
Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

PubMed

Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

2017-10-01

Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. This study aimed to identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P < .0001 for all comparisons). We identified 3 clinical clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Floating Droplet Array: An Ultrahigh-Throughput Device for Droplet Trapping, Real-time Analysis and Recovery

PubMed Central

Labanieh, Louai; Nguyen, Thi N.; Zhao, Weian; Kang, Dong-Ku

2016-01-01

We describe the design, fabrication and use of a dual-layered microfluidic device for ultrahigh-throughput droplet trapping, analysis, and recovery using droplet buoyancy. To demonstrate the utility of this device for digital quantification of analytes, we quantify the number of droplets, which contain a β-galactosidase-conjugated bead among more than 100,000 immobilized droplets. In addition, we demonstrate that this device can be used for droplet clustering and real-time analysis by clustering several droplets together into microwells and monitoring diffusion of fluorescein, a product of the enzymatic reaction of β-galactosidase and its fluorogenic substrate FDG, between droplets. PMID:27134760
Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.

PubMed

Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz

2016-07-01

Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than that observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because the etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to defining its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc.

Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

PubMed

Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

2018-06-07

Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.

PubMed

Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A

2018-01-01

Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
Suzaku observations of low surface brightness cluster Abell 1631

NASA Astrophysics Data System (ADS)

Babazaki, Yasunori; Mitsuishi, Ikuyuki; Ota, Naomi; Sasaki, Shin; Böhringer, Hans; Chon, Gayoung; Pratt, Gabriel W.; Matsumoto, Hironori

2018-04-01

We present analysis results for a nearby galaxy cluster Abell 1631 at z = 0.046 using the X-ray observatory Suzaku. This cluster is categorized as a low X-ray surface brightness cluster. To study the dynamical state of the cluster, we conduct four-pointed Suzaku observations and investigate physical properties of the Mpc-scale hot gas associated with the A 1631 cluster for the first time. Unlike relaxed clusters, the X-ray image shows no strong peak at the center and an irregular morphology. We perform spectral analysis and investigate the radial profiles of the gas temperature, density, and entropy out to approximately 1.5 Mpc in the east, north, west, and south directions by combining with the XMM-Newton data archive. The measured gas density in the central region is relatively low (a few ×10⁻⁴ cm⁻³) at the given temperature (~2.9 keV) compared with X-ray-selected clusters. The entropy profile and value within the central region (r < 0.1 r₂₀₀) are found to be flatter and higher (≳400 keV cm²). The observed bolometric luminosity is approximately three times lower than that expected from the luminosity-temperature relation in previous studies of relaxed clusters. These features are also observed in another low surface brightness cluster, Abell 76. The spatial distributions of galaxies and the hot gas appear to be different. The X-ray luminosity is relatively lower than that expected from the velocity dispersion. A post-merger scenario may explain the observed results.
A New Approach to Identify High Burnout Medical Staffs by Kernel K-Means Cluster Analysis in a Regional Teaching Hospital in Taiwan.

PubMed

Lee, Yii-Ching; Huang, Shian-Chang; Huang, Chih-Hsuan; Wu, Hsin-Hung

2016-01-01

This study uses kernel k-means cluster analysis to identify medical staff with high burnout. The data, collected in October and November 2014, are from the emotional exhaustion dimension of the Chinese version of the Safety Attitudes Questionnaire in a regional teaching hospital in Taiwan. The number of valid questionnaires, covering all staff categories such as physicians, nurses, technicians, pharmacists, medical administrators, and respiratory therapists, is 680. The results show that 8 clusters are generated by the kernel k-means method. Employees in clusters 1, 4, and 5 are in relatively good condition, whereas employees in clusters 2, 3, 6, 7, and 8 need to be closely monitored from time to time because they have a relatively higher degree of burnout. When employees with a higher degree of burnout are identified, the hospital management can take action to improve resilience, reduce potential medical errors, and, eventually, enhance patient safety. This study also suggests that the hospital management needs to keep track of medical staff's fatigue conditions and provide timely assistance for burnout recovery through employee assistance programs, mindfulness-based stress reduction programs, positivity currency buildup, and forming appreciative inquiry groups. © The Author(s) 2016.
Competing Effects Between Screen Media Time and Physical Activity in Adolescent Girls: Clustering a Self-Organizing Maps Analysis.

PubMed

Valencia-Peris, Alexandra; Devís-Devís, José; García-Massó, Xavier; Lizandra, Jorge; Pérez-Gimeno, Esther; Peiró-Velert, Carmen

2016-06-01

Previous research shows contradictory findings on potential competing effects between sedentary screen media usage (SMU) and physical activity (PA). This study examined these effects on adolescent girls via self-organizing maps analysis focusing on 3 target profiles. A sample of 1,516 girls aged 12 to 18 years self-reported daily time engagement in PA (moderate and vigorous intensity) and in screen media activities (TV/video/DVD, computer, and videogames), separately and combined. Topological interrelationships from the 13 emerging maps indicated a moderate competing effect between physically active and sedentary SMU patterns. Higher SES and overweight status were linked to either active or inactive behaviors. Three target clusters were explored in more detail. Cluster 1, named temperate-media actives, showed capabilities of being active while engaging in a moderate level of SMU (TV/video/DVD mainly). In Cluster 2, named prudent-media inactives, and Cluster 3, compulsive-media inactives, a competing effect between SMU and PA emerged, with sedentary SMU behaviors being responsible for a low involvement in active pursuits. SMU and PA emerge as both related and independent behaviors in girls, resulting in a moderate competing effect. Findings support the case for recommending the timing of PA and SMU for recreational purposes considering different profiles, sociodemographic factors and types of SMU.
Cluster Adjusted Regression for Displaced Subject Data (CARDS): Marginal Inference under Potentially Informative Temporal Cluster Size Profiles

PubMed Central

Bible, Joe; Beck, James D.; Datta, Somnath

2016-01-01

Ignorance of the mechanisms responsible for the availability of information presents an unusual problem for analysts. It is often the case that the availability of information is dependent on the outcome. In the analysis of cluster data we say that a condition for informative cluster size (ICS) exists when the inference drawn from analysis of hypothetical balanced data varies from that of inference drawn on observed data. Much work has been done in order to address the analysis of clustered data with informative cluster size; examples include Inverse Probability Weighting (IPW), Cluster Weighted Generalized Estimating Equations (CWGEE), and Doubly Weighted Generalized Estimating Equations (DWGEE). When cluster size changes with time, i.e., the data set possesses temporally varying cluster sizes (TVCS), these methods may produce biased inference for the underlying marginal distribution of interest. We propose a new marginalization that may be appropriate for addressing clustered longitudinal data with TVCS. The principal motivation for our present work is to analyze the periodontal data collected by Beck et al. (1997, Journal of Periodontal Research 6, 497–505). Longitudinal periodontal data often exhibit both ICS and TVCS as the number of teeth possessed by participants at the onset of study is not constant and teeth as well as individuals may be displaced throughout the study. PMID:26682911
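A small numerical sketch may help show why informative cluster size matters. It contrasts a pooled mean, where every observation counts equally, with an inverse-cluster-size weighted mean, the weighting idea behind cluster-weighted estimating equations. It does not implement the CARDS marginalization proposed in the abstract, and the patients and tooth outcomes are hypothetical.

```python
# Hypothetical data: each patient is a cluster, each entry is one tooth
# (1 = diseased, 0 = healthy). The patient with fewer remaining teeth has
# more disease, so cluster size is informative about the outcome.
clusters = {
    "patient_a": [0, 0, 0, 0, 0, 0, 0, 1],  # 8 teeth, 1 diseased
    "patient_b": [1, 1],                    # 2 teeth, both diseased
}

def pooled_mean(clusters):
    """Every tooth counts equally, so large clusters dominate."""
    values = [v for members in clusters.values() for v in members]
    return sum(values) / len(values)

def cluster_weighted_mean(clusters):
    """Each tooth is weighted by 1 / (its cluster size).

    The weights within a cluster sum to 1, so every patient contributes
    equally and the result equals the mean of the per-patient means.
    """
    weighted_sum = sum(v / len(members)
                       for members in clusters.values()
                       for v in members)
    return weighted_sum / len(clusters)

print(pooled_mean(clusters))            # 3 diseased teeth out of 10
print(cluster_weighted_mean(clusters))  # mean of 1/8 and 2/2
```

In this toy case the pooled estimate is 0.3 while the cluster-weighted estimate is 0.5625, which illustrates how ignoring informative cluster size can understate disease for a typical patient.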
Stability and change in adolescent spirituality/religiosity: a person-centered approach.

PubMed

Good, Marie; Willoughby, Teena; Busseri, Michael A

2011-03-01

Although there has been a substantial increase over the past decade in studies that have examined the psychosocial correlates of spirituality/religiosity in adolescence, very little is known about spirituality/religiosity as a domain of development in its own right. To address this limitation, the authors identified configurations of multiple dimensions of spirituality/religiosity across 2 time points with an empirical classification procedure (cluster analysis) and assessed development in these configurations at the sample and individual level. Participants included 756 predominately Canadian-born adolescents (53% female, 47% male) from southern Ontario, Canada, who completed a survey in Grade 11 (M age = 16.41 years) and Grade 12 (M age = 17.36 years). Measures included religious activity involvement, enjoyment of religious activities, the Spiritual Transcendence Index, wondering about spiritual issues, frequency of prayer, and frequency of meditation. Sample-level development (structural stability and change) was assessed by examining whether the structural configurations of the clusters were consistent over time. Individual-level development was assessed by examining intraindividual stability and change in cluster membership over time. Results revealed that a five-cluster solution was optimal at both grades. Clusters were identified as aspiritual/irreligious, disconnected wonderers, high institutional and personal, primarily personal, and meditators. With the exception of the high institutional and personal cluster, the cluster structures were stable over time. There also was significant intraindividual stability in all clusters over time; however, a significant proportion of individuals classified as high institutional and personal in Grade 11 moved into the primarily personal cluster in Grade 12. PsycINFO Database Record (c) 2011 APA, all rights reserved.
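Individual-level stability and change in cluster membership, as described above, amounts to cross-tabulating each person's cluster at the two time points. The sketch below shows that bookkeeping only. The five individuals and their assignments are hypothetical, the cluster names are taken from the abstract, and the significance testing reported by the authors is not reproduced.

```python
from collections import Counter

# Hypothetical cluster assignments for five individuals at two time points.
grade11 = [
    "high institutional and personal",
    "high institutional and personal",
    "primarily personal",
    "meditators",
    "aspiritual/irreligious",
]
grade12 = [
    "primarily personal",
    "high institutional and personal",
    "primarily personal",
    "meditators",
    "aspiritual/irreligious",
]

def transition_table(labels_t1, labels_t2):
    """Count how many individuals moved from each cluster to each cluster."""
    return Counter(zip(labels_t1, labels_t2))

def stability(labels_t1, labels_t2):
    """Fraction of individuals who stayed in the same cluster."""
    same = sum(a == b for a, b in zip(labels_t1, labels_t2))
    return same / len(labels_t1)

table = transition_table(grade11, grade12)
for (source, target), count in sorted(table.items()):
    print(f"{source} -> {target}: {count}")
print("stability:", stability(grade11, grade12))
```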
Pinpointing clusters of apparently sporadic cases of Legionnaires' disease.

PubMed Central

Bhopal, R. S.; Diggle, P.; Rowlingson, B.

1992-01-01

OBJECTIVES--To test the hypothesis that many non-outbreak cases of legionnaires' disease are not sporadic and to attempt to pinpoint cases clustering in space and time. DESIGN--Descriptive study of a case series, 1978-86. SETTING--15 health boards in Scotland. PATIENTS--203 probable cases of non-outbreak, non-travel, community acquired legionnaires' disease in patients resident in Scotland. MAIN MEASURES--Date of onset of disease and postcode and health board of residence of cases. RESULTS--Space-time clustering was present and numerous groups of cases were identified, all but two being newly recognised. Nine cases occurred during three months within two postcodes in Edinburgh, and an outbreak was probably missed. In several places cases occurred in one area over a prolonged period--for example, nine cases in postcode districts G11.5 and G12.8 in Glasgow during five years (estimated mean annual incidence of community acquired, non-outbreak, non-travel legionnaires' disease of 146 per million residents v 4.8 per million for Scotland). Statistical analysis showed that the space time clustering of cases in the Glasgow and Edinburgh areas was unusual (p = 0.036, p = 0.068 respectively). CONCLUSION--Future surveillance requires greater awareness that clusters can be overlooked; case searching whenever a case is identified; collection of complete information particularly of date of onset of the disease and address or postcode; ongoing analysis for space-time clustering; and an accurate yet workable definition of sporadic cases. Other researchers should re-examine their data on apparently sporadic infection. PMID:1586784
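The abstract does not name the statistical test used. As a generic illustration of what testing for space-time clustering can look like, the following sketch uses a Knox-type pair count with a permutation test. This is a swapped-in textbook technique, not necessarily the authors' method, and the coordinates, onset days, and thresholds are hypothetical.

```python
import math
import random
from itertools import combinations

def knox_statistic(cases, max_km, max_days):
    """Number of case pairs that are close in both space and time.

    Each case is (x_km, y_km, onset_day).
    """
    close = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        near_in_space = math.hypot(x1 - x2, y1 - y2) <= max_km
        near_in_time = abs(t1 - t2) <= max_days
        if near_in_space and near_in_time:
            close += 1
    return close

def knox_permutation_test(cases, max_km, max_days, n_perm=999, seed=0):
    """Permutation p-value: shuffle onset times across fixed locations."""
    rng = random.Random(seed)
    observed = knox_statistic(cases, max_km, max_days)
    times = [t for _, _, t in cases]
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(times)
        permuted = [(x, y, t) for (x, y, _), t in zip(cases, times)]
        if knox_statistic(permuted, max_km, max_days) >= observed:
            at_least_as_extreme += 1
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)

# Hypothetical cases: three close together in space and time, three scattered.
cases = [
    (0.0, 0.0, 0), (0.5, 0.0, 10), (0.2, 0.3, 20),
    (10.0, 10.0, 400), (20.0, 5.0, 800), (30.0, 30.0, 1200),
]
observed, p_value = knox_permutation_test(cases, max_km=1.0, max_days=30)
```

Shuffling the onset times while keeping locations fixed preserves the separate spatial and temporal distributions, so a small p-value points to an interaction between the two rather than to clustering in space or in time alone.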
Cluster analysis of dynamic contrast enhanced MRI reveals tumor subregions related to locoregional relapse for cervical cancer patients.

PubMed

Torheim, Turid; Groendahl, Aurora R; Andersen, Erlend K F; Lyng, Heidi; Malinen, Eirik; Kvaal, Knut; Futsaether, Cecilia M

2016-11-01

Solid tumors are known to be spatially heterogeneous. Detection of treatment-resistant tumor regions can improve clinical outcome, by enabling implementation of strategies targeting such regions. In this study, K-means clustering was used to group voxels in dynamic contrast enhanced magnetic resonance images (DCE-MRI) of cervical cancers. The aim was to identify clusters reflecting treatment resistance that could be used for targeted radiotherapy with a dose-painting approach. Eighty-one patients with locally advanced cervical cancer underwent DCE-MRI prior to chemoradiotherapy. The resulting image time series were fitted to two pharmacokinetic models, the Tofts model (yielding parameters K^trans and ν_e) and the Brix model (A_Brix, k_ep and k_el). K-means clustering was used to group similar voxels based on either the pharmacokinetic parameter maps or the relative signal increase (RSI) time series. The associations between voxel clusters and treatment outcome (measured as locoregional control) were evaluated using the volume fraction or the spatial distribution of each cluster. One voxel cluster based on the RSI time series was significantly related to locoregional control (adjusted p-value 0.048). This cluster consisted of low-enhancing voxels. We found that tumors with poor prognosis had this RSI-based cluster gathered into few patches, making this cluster a potential candidate for targeted radiotherapy. None of the voxel clusters based on Tofts or Brix parameter maps were significantly related to treatment outcome. We identified one group of tumor voxels significantly associated with locoregional relapse that could potentially be used for dose painting. This tumor voxel cluster was identified using the raw MRI time series rather than the pharmacokinetic maps.
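The RSI-based grouping can be sketched as follows. Several points are assumptions rather than details from the abstract: RSI is taken to be (S(t) - S(0)) / S(0) relative to the first time point, the k-means routine is a plain Lloyd's algorithm with a deterministic farthest-first initialisation, and the four voxel signals are hypothetical.

```python
import math

def relative_signal_increase(signal):
    """RSI time series, assuming RSI(t) = (S(t) - S(0)) / S(0)."""
    baseline = signal[0]
    return [(s - baseline) / baseline for s in signal]

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def farthest_first(points, k):
    """Deterministic initial centroids: start at points[0], then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical voxel signals: two low-enhancing and two high-enhancing voxels
# with different baselines.
voxels = [
    [100, 105, 108, 110],
    [200, 212, 214, 220],
    [100, 180, 220, 210],
    [150, 280, 320, 300],
]
rsi = [relative_signal_increase(v) for v in voxels]
labels, centroids = kmeans(rsi, k=2)

# Volume fraction of the cluster containing the first (low-enhancing) voxel.
low_fraction = labels.count(labels[0]) / len(labels)
```

Normalising to the baseline puts voxels with different absolute intensities on a common scale, so here the two low-enhancing voxels fall into one cluster even though their raw signals differ by a factor of two.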
Prospective associations between socio-economic status and dietary patterns in European children: the Identification and Prevention of Dietary- and Lifestyle-induced Health Effects in Children and Infants (IDEFICS) Study.

PubMed

Fernández-Alvira, Juan Miguel; Börnhorst, Claudia; Bammann, Karin; Gwozdz, Wencke; Krogh, Vittorio; Hebestreit, Antje; Barba, Gianvincenzo; Reisch, Lucia; Eiben, Gabriele; Iglesia, Iris; Veidebaum, Tomas; Kourides, Yannis A; Kovacs, Eva; Huybrechts, Inge; Pigeot, Iris; Moreno, Luis A

2015-02-14

Exploring changes in children's diet over time and the relationship between these changes and socio-economic status (SES) may help to understand the impact of social inequalities on dietary patterns. The aim of the present study was to describe dietary patterns by applying a cluster analysis to 9301 children participating in the baseline (2-9 years old) and follow-up (4-11 years old) surveys of the Identification and Prevention of Dietary- and Lifestyle-induced Health Effects in Children and Infants Study, and to describe the cluster memberships of these children over time and their association with SES. We applied the K-means clustering algorithm based on the similarities between the relative frequencies of consumption of forty-two food items. The following three consistent clusters were obtained at baseline and follow-up: processed (higher frequency of consumption of snacks and fast food); sweet (higher frequency of consumption of sweet foods and sweetened drinks); healthy (higher frequency of consumption of fruits, vegetables and wholemeal products). Children with higher-educated mothers and fathers and the highest household income were more likely to be allocated to the healthy cluster at baseline and follow-up and less likely to be allocated to the sweet cluster. Migrants were more likely to be allocated to the processed cluster at baseline and follow-up. Applying the cluster analysis to derive dietary patterns at the two time points allowed us to identify groups of children from a lower socio-economic background presenting persistently unhealthier dietary profiles. This finding reflects the need for healthy eating interventions specifically targeting children from lower socio-economic backgrounds.
Clustering ENTLN sferics to improve TGF temporal analysis

NASA Astrophysics Data System (ADS)

Pradhan, E.; Briggs, M. S.; Stanbro, M.; Cramer, E.; Heckman, S.; Roberts, O.

2017-12-01

Using TGFs detected with the Fermi Gamma-ray Burst Monitor (GBM) and simultaneous radio sferics detected by the Earth Networks Total Lightning Network (ENTLN), we establish a temporal correlation between them. The first step is to find ENTLN strokes that are closely associated with GBM TGFs. We then identify all the related strokes in the lightning flash to which the TGF-associated stroke belongs. After trying several algorithms, we found that the DBSCAN clustering algorithm was best for clustering related ENTLN strokes into flashes. The operation of DBSCAN was optimized using a single separation measure that combined time and distance separation. Previous analysis found that these strokes show three timescales with respect to the gamma-ray time. We will use the improved identification of flashes to investigate this.
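A minimal sketch of DBSCAN driven by a single combined separation measure is given below. The abstract does not give the form of that measure, so the scaled Euclidean combination, the scale constants, `eps`, `min_pts`, and the stroke data are all hypothetical.

```python
import math

NOISE = -1

def separation(a, b, time_scale_s=0.5, dist_scale_km=10.0):
    """Hypothetical combined separation between two strokes (t_s, x_km, y_km).

    Time and distance differences are each divided by a scale so that they
    become dimensionless, then combined in quadrature.
    """
    dt = (a[0] - b[0]) / time_scale_s
    dd = math.hypot(a[1] - b[1], a[2] - b[2]) / dist_scale_km
    return math.hypot(dt, dd)

def dbscan(points, eps, min_pts, metric):
    """Basic DBSCAN; returns one label per point (NOISE = -1)."""
    labels = [None] * len(points)

    def neighbours(i):
        # Includes the point itself, as in the standard definition.
        return [j for j in range(len(points)) if metric(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: join, but do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:   # core point: keep expanding the cluster
                queue.extend(j_nbrs)
    return labels

# Hypothetical strokes: (time in s, x in km, y in km).
strokes = [
    (0.00, 0.0, 0.0), (0.10, 1.0, 0.0), (0.25, 2.0, 1.0),   # one flash
    (5.00, 50.0, 50.0), (5.20, 51.0, 49.0),                 # another flash
    (20.0, 200.0, 0.0),                                     # isolated stroke
]
flash_labels = dbscan(strokes, eps=1.0, min_pts=2, metric=separation)
print(flash_labels)  # [0, 0, 0, 1, 1, -1]
```

Folding time and distance into one measure means only a single `eps` has to be tuned, at the cost of choosing the two scale constants.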
Effect of functionalization of boron nitride flakes by main group metal clusters on their optoelectronic properties

NASA Astrophysics Data System (ADS)

Chakraborty, Debdutta; Chattaraj, Pratim Kumar

2017-10-01

The possibility of functionalizing boron nitride flakes (BNFs) with some selected main group metal clusters, viz. OLi4, NLi5, CLi6, BLi7 and Al12Be, has been analyzed with the aid of density functional theory (DFT) based computations. Thermochemical as well as energetic considerations suggest that all the metal clusters interact with the BNF moiety in a favorable fashion. As a result of functionalization, the static (first) hyperpolarizability (β) values of the metal cluster supported BNF moieties increase quite significantly as compared to that in the case of pristine BNF. Time dependent DFT analysis reveals that the metal clusters can lower the transition energies associated with the dominant electronic transitions quite significantly thereby enabling the metal cluster supported BNF moieties to exhibit significant non-linear optical activity. Moreover, the studied systems demonstrate broad band absorption capability spanning the UV-visible as well as infra-red domains. Energy decomposition analysis reveals that the electrostatic interactions principally stabilize the metal cluster supported BNF moieties.
Effect of functionalization of boron nitride flakes by main group metal clusters on their optoelectronic properties.

PubMed

Chakraborty, Debdutta; Chattaraj, Pratim Kumar

2017-10-25

The possibility of functionalizing boron nitride flakes (BNFs) with some selected main group metal clusters, viz. OLi4, NLi5, CLi6, BLi7 and Al12Be, has been analyzed with the aid of density functional theory (DFT) based computations. Thermochemical as well as energetic considerations suggest that all the metal clusters interact with the BNF moiety in a favorable fashion. As a result of functionalization, the static (first) hyperpolarizability (β) values of the metal cluster supported BNF moieties increase quite significantly as compared to that in the case of pristine BNF. Time dependent DFT analysis reveals that the metal clusters can lower the transition energies associated with the dominant electronic transitions quite significantly thereby enabling the metal cluster supported BNF moieties to exhibit significant non-linear optical activity. Moreover, the studied systems demonstrate broad band absorption capability spanning the UV-visible as well as infra-red domains. Energy decomposition analysis reveals that the electrostatic interactions principally stabilize the metal cluster supported BNF moieties.
Spatiotemporal analysis of dengue fever in Nepal from 2010 to 2014.

PubMed

Acharya, Bipin Kumar; Cao, ChunXiang; Lakes, Tobia; Chen, Wei; Naeem, Shahid

2016-08-22

Due to recent emergence, dengue is becoming one of the major public health problems in Nepal. The numbers of reported dengue cases in general and the area with reported dengue cases are both continuously increasing in recent years. However, spatiotemporal patterns and clusters of dengue have not been investigated yet. This study aims to fill this gap by analyzing spatiotemporal patterns based on monthly surveillance data aggregated at district. Dengue cases from 2010 to 2014 at district level were collected from the Nepal government's health and mapping agencies respectively. GeoDa software was used to map crude incidence, excess hazard and spatially smoothed incidence. Cluster analysis was performed in SaTScan software to explore spatiotemporal clusters of dengue during the above-mentioned time period. Spatiotemporal distribution of dengue fever in Nepal from 2010 to 2014 was mapped at district level in terms of crude incidence, excess risk and spatially smoothed incidence. Results show that the distribution of dengue fever was not random but clustered in space and time. Chitwan district was identified as the most likely cluster and Jhapa district was the first secondary cluster in both spatial and spatiotemporal scan. July to September of 2010 was identified as a significant temporal cluster. This study assessed and mapped for the first time the spatiotemporal pattern of dengue fever in Nepal. Two districts namely Chitwan and Jhapa were found highly affected by dengue fever. The current study also demonstrated the importance of geospatial approach in epidemiological research. The initial result on dengue patterns and risk of this study may assist institutions and policy makers to develop better preventive strategies.
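The two simplest quantities mapped in the study, crude incidence and excess risk, can be computed as in the sketch below. It assumes the common definition of excess risk as observed cases divided by the cases expected if every district shared the overall rate. The district names, case counts, and populations are hypothetical, and spatial smoothing and the SaTScan scan statistic are not reproduced.

```python
def crude_incidence(cases, population, per=100_000):
    """Cases per `per` inhabitants."""
    return cases / population * per

def excess_risk(districts):
    """Observed / expected cases for each district.

    `districts` maps a name to (cases, population). Expected cases assume
    every district shares the overall rate across all districts.
    """
    total_cases = sum(c for c, _ in districts.values())
    total_pop = sum(p for _, p in districts.values())
    overall_rate = total_cases / total_pop
    return {name: c / (p * overall_rate) for name, (c, p) in districts.items()}

# Hypothetical districts: (dengue cases, population).
districts = {
    "district_a": (30, 100_000),
    "district_b": (10, 300_000),
}

incidence = {name: crude_incidence(c, p) for name, (c, p) in districts.items()}
risk = excess_risk(districts)
```

With these toy numbers the overall rate is 10 per 100,000, so district_a has three times the expected number of cases and district_b one third. A value above 1 marks a district with more cases than its population alone would predict.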
Spike sorting using locality preserving projection with gap statistics and landmark-based spectral clustering.

PubMed

Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

2014-12-30

Understanding neural functions requires knowledge from analysing electrophysiological data. The process of assigning spikes of a multichannel signal into clusters, called spike sorting, is one of the important problems in such analysis. There have been various automated spike sorting techniques with both advantages and disadvantages regarding accuracy and computational costs. Therefore, developing spike sorting methods that are highly accurate and computationally inexpensive is always a challenge in biomedical engineering practice. An automatic unsupervised spike sorting method is proposed in this paper. The method uses features extracted by the locality preserving projection (LPP) algorithm. These features afterwards serve as inputs for the landmark-based spectral clustering (LSC) method. Gap statistics (GS) is employed to evaluate the number of clusters before the LSC can be performed. The proposed LPP-LSC is a highly accurate and computationally inexpensive spike sorting approach. LPP spike features are very discriminative and thereby boost the performance of clustering methods. Furthermore, the LSC method exhibits its efficiency when integrated with the cluster evaluator GS. The proposed method's accuracy is approximately 13% superior to that of the benchmark combination of wavelet transformation and superparamagnetic clustering (WT-SPC). Additionally, LPP-LSC computing time is six times less than that of the WT-SPC. LPP-LSC thus offers a spike sorting solution that meets both accuracy and computational cost criteria. LPP and LSC are linear algorithms that help reduce computational burden, and thus their combination can be applied to real-time spike analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

PubMed

Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

2015-08-07

Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
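The compression step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the readings of one cluster are already collected at the cluster head as a (time x sensors) matrix, and it uses a retained-variance threshold (the hypothetical parameter `var_keep`) in place of the paper's error-bound guarantee. All names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pca_compress(readings, var_keep=0.99):
    """Keep the leading principal components of a (time x sensors) matrix."""
    mean = readings.mean(axis=0)
    centered = readings - mean
    # SVD of the centered data: the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    # Smallest k whose cumulative explained variance reaches var_keep.
    k = min(int(np.searchsorted(explained, var_keep)) + 1, len(s))
    components = vt[:k]
    scores = centered @ components.T  # what the cluster head would transmit
    return mean, components, scores

def pca_reconstruct(mean, components, scores):
    """Rebuild an approximation of the readings at the receiving side."""
    return scores @ components + mean

# Synthetic, hypothetical data: 8 strongly correlated sensors plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
gains = np.linspace(0.8, 1.2, 8)
readings = np.outer(np.sin(t), gains) + 0.01 * rng.standard_normal((200, 8))

mean, components, scores = pca_compress(readings)
approx = pca_reconstruct(mean, components, scores)
mse = float(np.mean((readings - approx) ** 2))
print(components.shape[0], mse)
```

In this toy case the head node would send one score per time step instead of eight raw readings, at the cost of a small mean square error.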
Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes.

PubMed

Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F

2017-02-01

Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.
A hierarchical clustering scheme approach to assessment of IP-network traffic using detrended fluctuation analysis

NASA Astrophysics Data System (ADS)

Takuma, Takehisa; Masugi, Masao

2009-03-01

This paper presents an approach to the assessment of IP-network traffic in terms of the time variation of self-similarity. To get a comprehensive view in analyzing the degree of long-range dependence (LRD) of IP-network traffic, we use a hierarchical clustering scheme, which provides a way to classify high-dimensional data with a tree-like structure. Also, in the LRD-based analysis, we employ detrended fluctuation analysis (DFA), which is applicable to the analysis of long-range power-law correlations or LRD in non-stationary time-series signals. Based on sequential measurements of IP-network traffic at two locations, this paper derives corresponding values for the LRD-related parameter α that reflects the degree of LRD of measured data. In performing the hierarchical clustering scheme, we use three parameters: the α value, average throughput, and the proportion of network traffic that exceeds 80% of network bandwidth for each measured data set. We visually confirm that the traffic data can be classified in accordance with the network traffic properties, showing that the combined depiction of the LRD and other factors can give an effective assessment of network conditions at different times.
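For readers unfamiliar with DFA, the following is a minimal sketch of how a scaling exponent α can be estimated from a time series. It is a generic first-order DFA written for illustration, not the authors' implementation; the function name `dfa_alpha`, the window sizes, and the synthetic inputs are assumptions.

```python
import numpy as np

def dfa_alpha(x, scales):
    """Estimate the DFA scaling exponent using linear detrending per window."""
    profile = np.cumsum(x - np.mean(x))  # integrated, mean-removed series
    fluct = []
    for n in scales:
        m = len(profile) // n  # number of non-overlapping windows
        segments = profile[: m * n].reshape(m, n).T  # shape (n, m)
        t = np.arange(n)
        # Fit a straight line to every window at once (one column per window).
        slope, intercept = np.polyfit(t, segments, 1)
        trend = np.outer(t, slope) + intercept
        fluct.append(np.sqrt(np.mean((segments - trend) ** 2)))
    # alpha is the slope of log F(n) against log n.
    alpha, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return float(alpha)

rng = np.random.default_rng(0)
white = rng.standard_normal(8192)  # uncorrelated noise, alpha near 0.5
walk = np.cumsum(white)            # random walk, alpha near 1.5
scales = [16, 32, 64, 128, 256]
alpha_white = dfa_alpha(white, scales)
alpha_walk = dfa_alpha(walk, scales)
print(alpha_white, alpha_walk)
```

An α value obtained this way could then serve as one coordinate, next to throughput statistics, in a feature vector passed to a hierarchical clustering routine.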
EventThread: Visual Summarization and Stage Analysis of Event Sequence Data.

PubMed

Guo, Shunan; Xu, Ke; Zhao, Rongwen; Gotz, David; Zha, Hongyuan; Cao, Nan

2018-01-01

Event sequence data such as electronic health records, a person's academic records, or car service records, are ordered series of events which have occurred over a period of time. Analyzing collections of event sequences can reveal common or semantically important sequential patterns. For example, event sequence analysis might reveal frequently used care plans for treating a disease, typical publishing patterns of professors, and the patterns of service that result in a well-maintained car. It is challenging, however, to visually explore large numbers of event sequences, or sequences with large numbers of event types. Existing methods focus on extracting explicitly matching patterns of events using statistical analysis to create stages of event progression over time. However, these methods fail to capture latent clusters of similar but not identical evolutions of event sequences. In this paper, we introduce a novel visualization system named EventThread which clusters event sequences into threads based on tensor analysis and visualizes the latent stage categories and evolution patterns by interactively grouping the threads by similarity into time-specific clusters. We demonstrate the effectiveness of EventThread through usage scenarios in three different application domains and via interviews with an expert user.

Spatio-Temporal Epidemiology of Viral Hepatitis in China (2003-2015): Implications for Prevention and Control Policies.

PubMed

Zhu, Bin; Liu, Jinlin; Fu, Yang; Zhang, Bo; Mao, Ying

2018-04-02

Viral hepatitis, as one of the most serious notifiable infectious diseases in China, takes a heavy toll on the infected and causes a severe economic burden to society, yet few studies have systematically explored the spatio-temporal epidemiology of viral hepatitis in China. This study aims to explore, visualize and compare the epidemiologic trends and spatial changing patterns of different types of viral hepatitis (A, B, C, E and unspecified, based on the classification of CDC) at the provincial level in China. The growth rates of incidence are used and converted to box plots to visualize the epidemiologic trends, with the linear trend being tested by chi-square linear by linear association test. Two complementary spatial cluster methods are used to explore the overall agglomeration level and identify spatial clusters: spatial autocorrelation analysis (measured by global and local Moran's I) and space-time scan analysis. Based on the spatial autocorrelation analysis, the hotspots of hepatitis A remained relatively stable and gradually shrank, with Yunnan and Sichuan successively moving out of the high-high (HH) cluster area. The HH clustering feature of hepatitis B in China gradually disappeared with time. However, the HH cluster area of hepatitis C has gradually moved towards the west, while for hepatitis E, the provincial units around the Yangtze River Delta region have been revealing HH cluster features since 2005. The space-time scan analysis also indicates the distinct spatial changing patterns of different types of viral hepatitis in China. It is easy to conclude that there is no one-size-fits-all plan for the prevention and control of viral hepatitis in all the provincial units. An effective response requires a package of coordinated actions, which should vary across localities regarding the spatial-temporal epidemic dynamics of each type of virus and the specific conditions of each provincial unit.
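The global Moran's I statistic mentioned in this abstract has a compact closed form, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The sketch below computes it for a toy example; the four-region chain and its binary adjacency weights are hypothetical and are not taken from the study.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values on regions with a spatial weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()  # deviations from the mean
    return float(len(x) / w.sum() * (z @ w @ z) / (z @ z))

# Hypothetical layout: four regions in a row, neighbours share a border.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

smooth = morans_i([1, 2, 3, 4], chain)         # similar neighbours, positive I
alternating = morans_i([1, -1, 1, -1], chain)  # dissimilar neighbours, negative I
print(smooth, alternating)
```

Positive values indicate that neighbouring units tend to have similar incidence (the high-high or low-low pattern), while negative values indicate a checkerboard-like pattern.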
Disease clusters, exact distributions of maxima, and P-values.

PubMed

Grimson, R C

1993-10-01

This paper presents combinatorial (exact) methods that are useful in the analysis of disease cluster data obtained from small environments, such as buildings and neighbourhoods. Maxwell-Boltzmann and Fermi-Dirac occupancy models are compared in terms of appropriateness of representation of disease incidence patterns (space and/or time) in these environments. The methods are illustrated by a statistical analysis of the incidence pattern of bone fractures in a setting wherein fracture clustering was alleged to be occurring. One of the methodological results derived in this paper is the exact distribution of the maximum cell frequency in occupancy models.
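As an illustration of the kind of exact calculation described here, the sketch below computes the distribution of the maximum cell count when cases fall independently and uniformly into cells (the Maxwell-Boltzmann occupancy model). It is a simple dynamic-programming count written for this listing, not the paper's derivation; the function names and the example numbers are assumptions.

```python
from fractions import Fraction
from math import comb

def prob_max_at_most(n_cases, n_cells, cap):
    """Exact P(max cell count <= cap) for n_cases placed uniformly in n_cells."""
    # ways[t] = number of assignments of t labelled cases to the cells
    # processed so far with no cell holding more than cap cases.
    ways = [1] + [0] * n_cases
    for _ in range(n_cells):
        ways = [
            sum(comb(t, j) * ways[t - j] for j in range(min(cap, t) + 1))
            for t in range(n_cases + 1)
        ]
    return Fraction(ways[n_cases], n_cells ** n_cases)

def p_value_max(n_cases, n_cells, observed_max):
    """Exact P(max cell count >= observed_max) under the same model."""
    return 1 - prob_max_at_most(n_cases, n_cells, observed_max - 1)

# Hypothetical example: 10 cases over 20 units, largest unit count is 4.
print(float(p_value_max(10, 20, 4)))
```

Because the result is a `Fraction`, the P-value is exact rather than simulated, which matters for the small environments the abstract has in mind.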
Optimizing R with SparkR on a commodity cluster for biomedical research.

PubMed

Sedlmayr, Martin; Würfl, Tobias; Maier, Christian; Häberle, Lothar; Fasching, Peter; Prokosch, Hans-Ulrich; Christoph, Jan

2016-12-01

Medical researchers are challenged today by the enormous amount of data collected in healthcare. Analysis methods such as genome-wide association studies (GWAS) are often computationally intensive and thus require enormous resources to be performed in a reasonable amount of time. While dedicated clusters and public clouds may deliver the desired performance, their use requires upfront financial efforts or anonymous data, which is often not possible for preliminary or occasional tasks. We explored the possibilities to build a private, flexible cluster for processing scripts in R based on commodity, non-dedicated hardware of our department. For this, a GWAS-calculation in R on a single desktop computer, a Message Passing Interface (MPI)-cluster, and a SparkR-cluster were compared with regards to the performance, scalability, quality, and simplicity. The original script had a projected runtime of three years on a single desktop computer. Optimizing the script in R already yielded a significant reduction in computing time (2 weeks). By using R-MPI and SparkR, we were able to parallelize the computation and reduce the time to less than three hours (2.6 h) on already available, standard office computers. While MPI is a proven approach in high-performance clusters, it requires rather static, dedicated nodes. SparkR and its Hadoop siblings allow for a dynamic, elastic environment with automated failure handling. SparkR also scales better with the number of nodes in the cluster than MPI due to optimized data communication. R is a popular environment for clinical data analysis. The new SparkR solution offers elastic resources and allows supporting big data analysis using R even on non-dedicated resources with minimal change to the original code. To unleash the full potential, additional efforts should be invested to customize and improve the algorithms, especially with regards to data distribution. Copyright © 2016 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Health-Related Quality of Life and Lifestyle Behavior Clusters in School-Aged Children from 12 Countries.

PubMed

Dumuid, Dorothea; Olds, Timothy; Lewis, Lucy K; Martin-Fernández, Josep Antoni; Katzmarzyk, Peter T; Barreira, Tiago; Broyles, Stephanie T; Chaput, Jean-Philippe; Fogelholm, Mikael; Hu, Gang; Kuriyan, Rebecca; Kurpad, Anura; Lambert, Estelle V; Maia, José; Matsudo, Victor; Onywera, Vincent O; Sarmiento, Olga L; Standage, Martyn; Tremblay, Mark S; Tudor-Locke, Catrine; Zhao, Pei; Gillison, Fiona; Maher, Carol

2017-04-01

To evaluate the relationship between children's lifestyles and health-related quality of life and to explore whether this relationship varies among children from different world regions. This study used cross-sectional data from the International Study of Childhood Obesity, Lifestyle and the Environment. Children (9-11 years) were recruited from sites in 12 nations (n = 5759). Clustering input variables were 24-hour accelerometry and self-reported diet and screen time. Health-related quality of life was self-reported with KIDSCREEN-10. Cluster analyses (using compositional analysis techniques) were performed on a site-wise basis. Lifestyle behavior cluster characteristics were compared between sites. The relationship between cluster membership and health-related quality of life was assessed with the use of linear models. Lifestyle behavior clusters were similar across the 12 sites, with clusters commonly characterized by (1) high physical activity (actives); (2) high sedentary behavior (sitters); (3) high screen time/unhealthy eating pattern (junk-food screenies); and (4) low screen time/healthy eating pattern and moderate physical activity/sedentary behavior (all-rounders). Health-related quality of life was greatest in the all-rounders cluster. Children from different world regions clustered into groups of similar lifestyle behaviors. Cluster membership was related to differing health-related quality of life, with children from the all-rounders cluster consistently reporting greatest health-related quality of life at sites around the world. Findings support the importance of a healthy combination of lifestyle behaviors in childhood: low screen time, healthy eating pattern, and balanced daily activity behaviors (physical activity and sedentary behavior). ClinicalTrials.gov: NCT01722500. Copyright © 2016 Elsevier Inc. All rights reserved.
Atlas-guided cluster analysis of large tractography datasets.

PubMed

Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

2013-01-01

Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.
A stellar census in globular clusters with MUSE: The contribution of rotation to cluster dynamics studied with 200 000 stars

NASA Astrophysics Data System (ADS)

Kamann, S.; Husser, T.-O.; Dreizler, S.; Emsellem, E.; Weilbacher, P. M.; Martens, S.; Bacon, R.; den Brok, M.; Giesers, B.; Krajnović, D.; Roth, M. M.; Wendt, M.; Wisotzki, L.

2018-02-01

This is the first of a series of papers presenting the results from our survey of 25 Galactic globular clusters with the MUSE integral-field spectrograph. In combination with our dedicated algorithm for source deblending, MUSE provides unique multiplex capabilities in crowded stellar fields and allows us to acquire samples of up to 20 000 stars within the half-light radius of each cluster. The present paper focuses on the analysis of the internal dynamics of 22 out of the 25 clusters, using about 500 000 spectra of 200 000 individual stars. Thanks to the large stellar samples per cluster, we are able to perform a detailed analysis of the central rotation and dispersion fields using both radial profiles and two-dimensional maps. The velocity dispersion profiles we derive show a good general agreement with existing radial velocity studies but typically reach closer to the cluster centres. By comparison with proper motion data, we derive or update the dynamical distance estimates to 14 clusters. Compared to previous dynamical distance estimates for 47 Tuc, our value is in much better agreement with other methods. We further find significant (>3σ) rotation in the majority (13/22) of our clusters. Our analysis seems to confirm earlier findings of a link between rotation and the ellipticities of globular clusters. In addition, we find a correlation between the strengths of internal rotation and the relaxation times of the clusters, suggesting that the central rotation fields are relics of the cluster formation that are gradually dissipated via two-body relaxation.
The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

NASA Astrophysics Data System (ADS)

Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

2017-07-01

Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues play a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.
Modeling Uncertainties in EEG Microstates: Analysis of Real and Imagined Motor Movements Using Probabilistic Clustering-Driven Training of Probabilistic Neural Networks.

PubMed

Dinov, Martin; Leech, Robert

2017-01-01

Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means (KM) approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and in the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform a fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM called Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty. Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses.
Modeling Uncertainties in EEG Microstates: Analysis of Real and Imagined Motor Movements Using Probabilistic Clustering-Driven Training of Probabilistic Neural Networks

PubMed Central

Dinov, Martin; Leech, Robert

2017-01-01

Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means (KM) approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and in the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform a fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM called Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty. Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses. PMID:29163110
Lifestyle Patterns and Weight Status in Spanish Adults: The ANIBES Study

PubMed Central

Pérez-Rodrigo, Carmen; Gianzo-Citores, Marta; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M.; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier

2017-01-01

Limited knowledge is available on lifestyle patterns in Spanish adults. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, sleep time, and smoking in Spanish adults aged 18–64 years and their association with obesity. Analysis was based on a subsample (n = 1617) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, sleep time, and smoking. Logistic regression analysis was used to explore the association between the cluster solutions and obesity. Factor analysis identified four dietary patterns, “Traditional DP”, “Mediterranean DP”, “Snack DP” and “Dairy-sweet DP”. Dietary patterns, physical activity behaviors, sedentary behaviors, sleep time, and smoking in Spanish adults aggregated into three different clusters of lifestyle patterns: “Mixed diet-physically active-low sedentary lifestyle pattern”, “Not poor diet-low physical activity-low sedentary lifestyle pattern” and “Poor diet-low physical activity-sedentary lifestyle pattern”. A higher proportion of people aged 18–30 years was classified into the “Poor diet-low physical activity-sedentary lifestyle pattern”. The prevalence odds ratio for obesity in men in the “Mixed diet-physically active-low sedentary lifestyle pattern” was significantly lower compared to those in the “Poor diet-low physical activity-sedentary lifestyle pattern”. Those behavior patterns are helpful to identify specific issues in population subgroups and inform intervention strategies. The findings in this study underline the importance of designing and implementing interventions that address multiple health risk practices, considering lifestyle patterns and associated determinants. PMID:28613259
Waiting-time distributions of magnetic discontinuities: clustering or Poisson process?

PubMed

Greco, A; Matthaeus, W H; Servidio, S; Dmitruk, P

2009-10-01

Using solar wind data from the Advanced Composition Explorer spacecraft, with the support of Hall magnetohydrodynamic simulations, the waiting-time distributions of magnetic discontinuities have been analyzed. A possible phenomenon of clusterization of these discontinuities is studied in detail. We perform a local Poisson's analysis in order to establish if these intermittent events are randomly distributed or not. Possible implications about the nature of solar wind discontinuities are discussed.
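The question posed in this record, whether events are randomly distributed or clustered, can be illustrated with a simple waiting-time check. For a homogeneous Poisson process the waiting times are exponential, so their coefficient of variation (standard deviation divided by mean) is close to 1, while bursty sequences give larger values. This is a generic sketch on synthetic data, not the local Poisson analysis used by the authors; all names and parameters are assumptions.

```python
import numpy as np

def waiting_time_cv(event_times):
    """Coefficient of variation of the waiting times between sorted events."""
    waits = np.diff(np.sort(np.asarray(event_times, dtype=float)))
    return float(waits.std() / waits.mean())

rng = np.random.default_rng(1)
n_events = 20000

# Poisson process: independent exponential waiting times with a single rate.
poisson_times = np.cumsum(rng.exponential(1.0, size=n_events))

# Bursty process: mostly short gaps, occasionally a long quiet interval.
short_gaps = rng.exponential(0.1, size=n_events)
long_gaps = rng.exponential(10.0, size=n_events)
bursty_waits = np.where(rng.random(n_events) < 0.9, short_gaps, long_gaps)
bursty_times = np.cumsum(bursty_waits)

cv_poisson = waiting_time_cv(poisson_times)
cv_bursty = waiting_time_cv(bursty_times)
print(cv_poisson, cv_bursty)
```

A value well above 1, as in the bursty case, is one simple indication that events arrive in clusters rather than independently.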
Waiting-time distributions of magnetic discontinuities: Clustering or Poisson process?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greco, A.; Matthaeus, W. H.; Servidio, S.

2009-10-15

Using solar wind data from the Advanced Composition Explorer spacecraft, with the support of Hall magnetohydrodynamic simulations, the waiting-time distributions of magnetic discontinuities have been analyzed. A possible phenomenon of clusterization of these discontinuities is studied in detail. We perform a local Poisson's analysis in order to establish if these intermittent events are randomly distributed or not. Possible implications about the nature of solar wind discontinuities are discussed.
Conveyor Performance based on Motor DC 12 Volt Eg-530ad-2f using K-Means Clustering

NASA Astrophysics Data System (ADS)

Arifin, Zaenal; Artini, Sri DP; Much Ibnu Subroto, Imam

2017-04-01

To produce goods in industry, a controlled tool to improve production is required. The separation process has become a part of the production process and is carried out based on certain criteria to get an optimum result. Knowing the performance characteristics of a controlled tool in the separation process makes it possible to obtain optimum results. Clustering analysis is a popular method for grouping data into smaller segments. It is useful for dividing a group of objects into k groups in which the members of each group are homogeneous or similar. Similarity within a group is defined by certain criteria. The work in this paper uses the K-Means method to cluster the loading performance of a conveyor driven by a 12 volt DC motor eg-530-2f. This technique gives complete clustering data for a prototype conveyor driven by a DC motor that separates goods in terms of height. The parameters involved are voltage, current, and travelling time. These parameters give two clusters: an optimal cluster with cluster center 10.50 volt, 0.3 ampere, and 10.58 second, and a non-optimal cluster with cluster center 10.88 volt, 0.28 ampere, and 40.43 second.
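A two-cluster K-Means run on (voltage, current, travel time) measurements can be sketched as follows. This is a plain Lloyd's algorithm with a deterministic farthest-point start, written for illustration only; the six measurement rows are hypothetical and are not the paper's data, and the z-score scaling step is an added assumption so that the three units contribute comparably to the distance.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Lloyd's K-Means with a deterministic farthest-point initialisation."""
    x = np.asarray(points, dtype=float)
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centres chosen so far.
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assignment step: nearest centre for every point.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Update step: each centre moves to the mean of its members.
        updated = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers

# Hypothetical runs: columns are voltage (V), current (A), travel time (s).
runs = np.array([
    [10.5, 0.30, 10.2],
    [10.4, 0.31, 10.9],
    [10.6, 0.29, 10.6],
    [10.9, 0.28, 40.1],
    [10.8, 0.27, 40.9],
    [10.9, 0.29, 40.3],
])
scaled = (runs - runs.mean(axis=0)) / runs.std(axis=0)  # z-score each column
labels, centers = kmeans(scaled, 2)
print(labels)
```

Cluster centres in the original units can be recovered by averaging the rows of `runs` that share a label.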
Patterns of time use among low-income urban minority adolescents and associations with academic outcomes and problem behaviors.

PubMed

Wolf, Sharon; Aber, J Lawrence; Morris, Pamela A

2015-06-01

Time budgets represent key opportunities for developmental support and contribute to an understanding of achievement gaps and adjustment across populations of youth. This study assessed the connection between out-of-school time use patterns and academic performance outcomes, academic motivations and goals, and problem behaviors for 504 low-income urban African American and Latino adolescents (54% female; M = 16.6 years). Time use patterns were measured across eight activity types using cluster analysis. Four groups of adolescents were identified, based on their different profiles of time use: (1) Academic: those with most time in academic activities; (2) Social: those with most time in social activities; (3) Maintenance/work: those with most time in maintenance and work activities; and (4) TV/computer: those with most time in TV or computer activities. Time use patterns were meaningfully associated with variation in outcomes in this population. Adolescents in the Academic cluster had the highest levels of adjustment across all domains; adolescents in the Social cluster had the lowest academic performance and highest problem behaviors; and adolescents in the TV/computer cluster had the lowest levels of intrinsic motivation. Females were more likely to be in the Academic cluster, and less likely to be in the other three clusters compared to males. No differences by race or gender were found in assessing the relationship between time use and outcomes. The study's results indicate that time use patterns are meaningfully associated with within-group variation in adjustment for low-income minority adolescents, and that shared contexts may shape time use more than individual differences in race/ethnicity for this population.
Genome-wide DNA methylation analysis reveals estrogen-mediated epigenetic repression of metallothionein-1 gene cluster in breast cancer.

PubMed

Jadhav, Rohit R; Ye, Zhenqing; Huang, Rui-Lan; Liu, Joseph; Hsu, Pei-Yin; Huang, Yi-Wen; Rangel, Leticia B; Lai, Hung-Cheng; Roa, Juan Carlos; Kirma, Nameer B; Huang, Tim Hui-Ming; Jin, Victor X

2015-01-01

Recent genome-wide analysis has shown that DNA methylation spans long stretches of chromosome regions consisting of clusters of contiguous CpG islands or gene families. Hypermethylation of various gene clusters has been reported in many types of cancer. In this study, we conducted methyl-binding domain capture (MBDCap) sequencing (MBD-seq) analysis on a breast cancer cohort consisting of 77 patients and 10 normal controls, as well as a panel of 38 breast cancer cell lines. Bioinformatics analysis determined seven gene clusters with a significant difference in overall survival (OS) and further revealed a distinct feature that the conservation of a large gene cluster (approximately 70 kb), metallothionein-1 (MT1), among 45 species is much lower than the average of all RefSeq genes. Furthermore, we found that DNA methylation is an important epigenetic regulator contributing to gene repression of the MT1 gene cluster in both ERα positive (ERα+) and ERα negative (ERα-) breast tumors. In silico analysis revealed much lower gene expression of this cluster in The Cancer Genome Atlas (TCGA) cohort for ERα+ tumors. To further investigate the role of estrogen, we conducted 17β-estradiol (E2) and demethylating agent 5-aza-2'-deoxycytidine (DAC) treatment in various breast cancer cell types. Cell proliferation and invasion assays suggested MT1F and MT1M may play an anti-oncogenic role in breast cancer. Our data suggest that DNA methylation in large contiguous gene clusters can be potential prognostic markers of breast cancer. Further investigation of these clusters revealed that estrogen mediates epigenetic repression of the MT1 cluster in ERα+ breast cancer cell lines. In all, our studies identify thousands of breast tumor hypermethylated regions for the first time, in particular, discovering seven large contiguous hypermethylated gene clusters.
Generating a Magellanic star cluster catalog with ASteCA

NASA Astrophysics Data System (ADS)

Perren, G. I.; Piatti, A. E.; Vázquez, R. A.

2016-08-01

An increasing number of software tools have been employed in recent years for the automated or semi-automated processing of astronomical data. The main advantages of using these tools over a standard by-eye analysis include: speed (particularly for large databases), homogeneity, reproducibility, and precision. At the same time, they enable a statistically correct study of the uncertainties associated with the analysis, in contrast with manually set errors, or the still widespread practice of simply not assigning errors. We present a catalog comprising 210 star clusters located in the Large and Small Magellanic Clouds, observed with Washington photometry. Their fundamental parameters were estimated through a homogeneous, automated and completely unassisted process, via the Automated Stellar Cluster Analysis package (ASteCA). Our results are compared with two types of studies on these clusters: one where the photometry is the same, and another where the photometric system is different from that employed by ASteCA.
Delineation of Stenotrophomonas maltophilia isolates from cystic fibrosis patients by fatty acid methyl ester profiles and matrix-assisted laser desorption/ionization time-of-flight mass spectra using hierarchical cluster analysis and principal component analysis.

PubMed

Vidigal, Pedrina Gonçalves; Mosel, Frank; Koehling, Hedda Luise; Mueller, Karl Dieter; Buer, Jan; Rath, Peter Michael; Steinmann, Joerg

2014-12-01

Stenotrophomonas maltophilia is an opportunist multidrug-resistant pathogen that causes a wide range of nosocomial infections. Various cystic fibrosis (CF) centres have reported an increasing prevalence of S. maltophilia colonization/infection among patients with this disease. The purpose of this study was to assess specific fingerprints of S. maltophilia isolates from CF patients (n = 71) by investigating fatty acid methyl esters (FAMEs) through gas chromatography (GC) and highly abundant proteins by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), and to compare them with isolates obtained from intensive care unit (ICU) patients (n = 20) and the environment (n = 11). Principal component analysis (PCA) of GC-FAME patterns did not reveal a clustering corresponding to distinct CF, ICU or environmental types. Based on the peak area index, it was observed that S. maltophilia isolates from CF patients produced significantly higher amounts of fatty acids in comparison with ICU patients and the environmental isolates. Hierarchical cluster analysis (HCA) based on the MALDI-TOF MS peak profiles of S. maltophilia revealed the presence of five large clusters, suggesting a high phenotypic diversity. Although HCA of MALDI-TOF mass spectra did not result in distinct clusters predominantly composed of CF isolates, PCA revealed the presence of a distinct cluster composed of S. maltophilia isolates from CF patients. Our data suggest that S. maltophilia colonizing CF patients tend to modify not only their fatty acid patterns but also their protein patterns as a response to adaptation in the unfavourable environment of the CF lung. © 2014 The Authors.
A hybrid clustering approach for multivariate time series - A case study applied to failure analysis in a gas turbine.

PubMed

Fontes, Cristiano Hora; Budman, Hector

2017-11-01

A clustering problem involving multivariate time series (MTS) requires the selection of similarity metrics. This paper shows the limitations of the PCA similarity factor (SPCA) as a single metric in nonlinear problems where there are differences in magnitude of the same process variables due to expected changes in operation conditions. A novel method for clustering MTS based on a combination between SPCA and the average-based Euclidean distance (AED) within a fuzzy clustering approach is proposed. Case studies involving either simulated or real industrial data collected from a large scale gas turbine are used to illustrate that the hybrid approach enhances the ability to recognize normal and fault operating patterns. This paper also proposes an oversampling procedure to create synthetic multivariate time series that can be useful in commonly occurring situations involving unbalanced data sets. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
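For orientation, the PCA similarity factor mentioned in this abstract is usually written in the following form (after Krzanowski). The symbols are assumptions introduced here, not taken from the abstract: L and M are matrices whose orthonormal columns are the first k principal components of the two multivariate series, and θ_ij is the angle between the i-th component of one and the j-th component of the other.

```latex
S_{\mathrm{PCA}}
  = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{k}\cos^{2}\theta_{ij}
  = \frac{\operatorname{tr}\!\left(L^{\top} M M^{\top} L\right)}{k},
\qquad 0 \le S_{\mathrm{PCA}} \le 1 .
```

Because the factor depends only on the directions of the principal components, two series with the same correlation structure but different magnitudes can score as highly similar, which is consistent with the limitation the abstract describes and with pairing the factor with a distance-based metric.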
Dietary patterns by cluster analysis in pregnant women: relationship with nutrient intakes and dietary patterns in 7-year-old offspring.

PubMed

Freitas-Vilela, Ana Amélia; Smith, Andrew D A C; Kac, Gilberto; Pearson, Rebecca M; Heron, Jon; Emond, Alan; Hibbeln, Joseph R; Castro, Maria Beatriz Trindade; Emmett, Pauline M

2017-04-01

Little is known about how dietary patterns of mothers and their children track over time. The objectives of this study are to obtain dietary patterns in pregnancy using cluster analysis, to examine women's mean nutrient intakes in each cluster and to compare the dietary patterns of mothers to those of their children. Pregnant women (n = 12 195) from the Avon Longitudinal Study of Parents and Children reported their frequency of consumption of 47 foods and food groups. These data were used to obtain dietary patterns during pregnancy by cluster analysis. The absolute and energy-adjusted nutrient intakes were compared between clusters. Women's dietary patterns were compared with previously derived clusters of their children at 7 years of age. Multinomial logistic regression was performed to evaluate relationships comparing maternal and offspring clusters. Three maternal clusters were identified: 'fruit and vegetables', 'meat and potatoes' and 'white bread and coffee'. After energy adjustment women in the 'fruit and vegetables' cluster had the highest mean nutrient intakes. Mothers in the 'fruit and vegetables' cluster were more likely than mothers in 'meat and potatoes' (adjusted odds ratio [OR]: 2.00; 95% Confidence Interval [CI]: 1.69-2.36) or 'white bread and coffee' (OR: 2.18; 95% CI: 1.87-2.53) clusters to have children in a 'plant-based' cluster. However, the majority of children were in clusters unrelated to their mother's dietary pattern. Three distinct dietary patterns were obtained in pregnancy; the 'fruit and vegetables' pattern being the most nutrient dense. Mothers' dietary patterns were associated with but did not dominate offspring dietary patterns. © 2016 The Authors. Maternal & Child Nutrition published by John Wiley & Sons Ltd.
Principal component analysis vs. self-organizing maps combined with hierarchical clustering for pattern recognition in volcano seismic spectra

NASA Astrophysics Data System (ADS)

Unglert, K.; Radić, V.; Jellinek, A. M.

2016-06-01

Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Kīlauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.

Clustering of energy balance-related behaviors and parental education in European children: the ENERGY-project.

PubMed

Fernández-Alvira, Juan M; De Bourdeaudhuij, Ilse; Singh, Amika S; Vik, Frøydis N; Manios, Yannis; Kovacs, Eva; Jan, Natasa; Brug, Johannes; Moreno, Luis A

2013-01-15

Recent research and literature reviews show that, among schoolchildren, some specific energy balance-related behaviors (EBRBs) are relevant for overweight and obesity prevention. It is also well known that the prevalence of overweight and obesity is considerably higher among schoolchildren from lower socio-economic backgrounds. This study examines whether sugared drinks intake, physical activity, screen time and usual sleep duration cluster in reliable and meaningful ways among European children, and whether the identified clusters could be characterized by parental education. The cross-sectional study comprised a total of 5284 children (46% male), from seven European countries participating in the ENERGY-project ("EuropeaN Energy balance Research to prevent excessive weight Gain among Youth"). Information on sugared drinks intake, physical activity, screen time and usual sleep duration was obtained using validated self-report questionnaires. Based on these behaviors, gender-specific cluster analysis was performed. Associations with parental education were identified using chi-square tests and odds ratios. Five meaningful and stable clusters were found for both genders. The cluster with high physical activity level showed the highest proportion of participants with highly educated parents, while clusters with high sugared drinks consumption, high screen time and low sleep duration were more prevalent in the group with lower educated parents. Odds ratio showed that children with lower educated parents were less likely to be allocated in the active cluster and more likely to be allocated in the low activity/sedentary pattern cluster. Children with lower educated parents seemed to be more likely to present unhealthier EBRBs clustering, mainly characterized by their self-reported time spent on physical activity and screen viewing. 
Therefore, special focus should be given to lower educated parents and their children in order to develop effective primary prevention strategies.
Gaussian mixture clustering and imputation of microarray data.

PubMed

Ouyang, Ming; Welsh, William J; Georgopoulos, Panos

2004-04-12

In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries and more than 90% of the genes are affected. Many analysis methods require a full set of data. Either those genes with missing entries are excluded, or the missing entries are filled with estimates prior to the analyses. This study compares methods of missing value estimation. Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of mis-clustered genes measures the difference between clustering with true values and that with imputed values; it examines the bias introduced by imputation to clustering. The Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods, according to both evaluation metrics, on both time-series (correlated) and non-time series (uncorrelated) data sets.
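The first evaluation metric named in this abstract, root mean squared error between true and imputed values, can be sketched in a few lines. The function name and the flat-list inputs are assumptions made for illustration; the study's own evaluation code is not described in the abstract.

```python
import math

def rmse(true_values, imputed_values):
    """Root mean squared error over the entries that were imputed."""
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - i) ** 2 for t, i in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))
```

A lower value means the imputed entries sit closer to the held-out true values; the second metric in the abstract (mis-clustered genes) additionally requires re-running the clustering and is not covered by this sketch.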
SAR image change detection using watershed and spectral clustering

NASA Astrophysics Data System (ADS)

Niu, Ruican; Jiao, L. C.; Wang, Guiting; Feng, Jie

2011-12-01

A new method of change detection in SAR images based on spectral clustering is presented in this paper. Spectral clustering is employed to extract change information from a pair of images acquired over the same geographical area at different times. A watershed transform is first applied to segment the large image into non-overlapping local regions, which reduces the complexity. Experimental results and system analysis confirm the effectiveness of the proposed algorithm.
Clustering change patterns using Fourier transformation with time-course gene expression data.

PubMed

Kim, Jaehee

2011-01-01

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
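As a reminder of what "derivative Fourier coefficients" refers to, the derivative of a truncated Fourier series is again a Fourier series whose coefficients are rescaled and swapped. The notation below (period T, truncation order K, coefficients a_k and b_k) is assumed for illustration and may differ from the paper's own parameterization.

```latex
f(t) = \frac{a_0}{2} + \sum_{k=1}^{K}\left[a_k \cos\!\left(\frac{2\pi k t}{T}\right)
      + b_k \sin\!\left(\frac{2\pi k t}{T}\right)\right],
\qquad
f'(t) = \sum_{k=1}^{K}\frac{2\pi k}{T}\left[b_k \cos\!\left(\frac{2\pi k t}{T}\right)
      - a_k \sin\!\left(\frac{2\pi k t}{T}\right)\right].
```

The constant term drops out of the derivative, so two expression profiles that differ only by a baseline shift share the same derivative coefficients and would be grouped by their change pattern rather than by their level.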
Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.

PubMed

Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J

2017-12-01

Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. 
Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the limitations, trade-offs, and considerations for the sensitivities of variables and the biological interpretations of results. The Cluster app is available as a free download for Apple computers at iTunes, with a link to a user guide website.
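The randomization idea in this abstract, comparing the observed mean inter-cluster distance with values from randomly simulated maps, can be sketched as follows. This is a simplified illustration under stated assumptions, not the Cluster app's procedure: the frame is taken to be a rectangle, cluster size ranking and orientation are ignored, and a standardized difference stands in for the two-tailed t test. All function names are hypothetical.

```python
import itertools
import math
import random
import statistics

def mean_pairwise_distance(centroids):
    """Mean Euclidean distance over all unordered pairs of centroids."""
    pairs = list(itertools.combinations(centroids, 2))
    if not pairs:
        raise ValueError("need at least two centroids")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def randomization_check(centroids, width, height, n_sims=1000, seed=0):
    """Compare the observed mean distance with uniformly random placements
    of the same number of centroids in a width x height rectangle."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(centroids)
    simulated = [
        mean_pairwise_distance(
            [(rng.uniform(0, width), rng.uniform(0, height)) for _ in centroids]
        )
        for _ in range(n_sims)
    ]
    # Standardized difference: negative values mean the observed centroids
    # are closer together than random placement would suggest.
    z = (observed - statistics.mean(simulated)) / statistics.stdev(simulated)
    return observed, simulated, z
```

A strongly negative standardized difference points toward aggregation and a strongly positive one toward regular spacing; values near zero are compatible with the null hypothesis of random placement.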
Effects of Dexamethasone and Placebo on Symptom Clusters in Advanced Cancer Patients: A Preliminary Report.

PubMed

Yennurajalingam, Sriram; Williams, Janet L; Chisholm, Gary; Bruera, Eduardo

2016-03-01

Advanced cancer patients frequently experience debilitating symptoms that occur in clusters, but few pharmacological studies have targeted symptom clusters. Our objective was to examine the effects of dexamethasone on symptom clusters in patients with advanced cancer. We reviewed the data from a previous randomized clinical trial to determine the effects of dexamethasone on cancer symptoms. Symptom clusters were identified according to baseline symptoms by using principal component analysis. Correlations and change in the severity of symptom clusters were analyzed after study treatment. A total of 114 participants were included in this study. Three clusters were identified: fatigue/anorexia-cachexia/depression (FAD), sleep/anxiety/drowsiness (SAD), and pain/dyspnea (PD). Changes in severity of FAD and PD significantly correlated over time (at baseline, day 8, and day 15). The FAD cluster was associated with significant improvement in severity at day 8 and day 15, whereas no significant change was observed with the SAD cluster or PD cluster after dexamethasone treatment. The results of this preliminary study suggest significant correlation over time and improvement in the FAD cluster at day 8 and day 15 after treatment with dexamethasone. These findings suggest that fatigue, anorexia-cachexia, and depression may share a common pathophysiologic basis. Further studies are needed to investigate this cluster and target anti-inflammatory therapies. ©AlphaMed Press.
Advanced analysis of forest fire clustering

NASA Astrophysics Data System (ADS)

Kanevski, Mikhail; Pereira, Mario; Golay, Jean

2017-04-01

Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index. 
Pattern Recognition, 48, 4070-4081.
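The grid-based estimate described in this abstract can be sketched as below. The formula used is the commonly stated form of the m-point Morisita index, Q^(m-1) times the sum over cells of n_i(n_i-1)...(n_i-m+1), divided by N(N-1)...(N-m+1), where Q is the number of cells, n_i the count in cell i, and N the total number of points. Rescaling the points to the unit square and the function name are assumptions made for this illustration.

```python
from collections import Counter

def multipoint_morisita(points, cells_per_side, m=2):
    """m-point Morisita index for 2D points scaled to [0, 1] x [0, 1]."""
    n_total = len(points)
    if m < 2 or n_total < m:
        raise ValueError("need m >= 2 and at least m points")
    q = cells_per_side * cells_per_side  # number of grid cells Q

    # Count points per occupied cell; points on the upper edge go to the last cell.
    counts = Counter(
        (min(int(x * cells_per_side), cells_per_side - 1),
         min(int(y * cells_per_side), cells_per_side - 1))
        for x, y in points
    )

    def falling(n):
        # n (n-1) ... (n-m+1): ordered draws of m distinct points out of n.
        out = 1
        for k in range(m):
            out *= n - k
        return out

    same_cell = sum(falling(n) for n in counts.values())
    return q ** (m - 1) * same_cell / falling(n_total)
```

Values near 1 are what a completely random pattern would give on average, values above 1 indicate clustering, and values below 1 indicate a more regular pattern. Repeating the computation for several values of `cells_per_side` gives the scaling behaviour the abstract refers to.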
Sleep stages identification in patients with sleep disorder using k-means clustering

NASA Astrophysics Data System (ADS)

Fadhlullah, M. U.; Resahya, A.; Nugraha, D. F.; Yulita, I. N.

2018-05-01

Data mining is a computational intelligence discipline in which a large dataset is processed with a given method to find patterns in it. These patterns are then used in real-time applications or to develop new knowledge. Data mining is a valuable tool for solving complex problems, discovering new knowledge, analyzing data, and making decisions. Clustering is used to extract the patterns that lie inside a large dataset: it groups similar data so that patterns become visible. Several algorithms exist for assigning data to clusters. This research used data from patients who suffer from sleep disorders and aims to help medical practitioners reduce the time required to classify the sleep stages of such patients. The study used the K-Means algorithm with silhouette evaluation and found that three clusters are optimal for this dataset, which means the data can be divided into three sleep stages.
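The combination named in this abstract, K-Means with silhouette evaluation to choose the number of clusters, can be sketched in plain Python. Everything below is an illustrative assumption rather than the study's implementation: the deterministic farthest-first seeding, the candidate range for k, and the function names are all hypothetical, and the sleep data are not reproduced here.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # cohesion
        b = min(sum(math.dist(p, q) for q in other) / len(other)  # separation
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, candidates=(2, 3, 4, 5)):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores
```

Calling `best_k(points)` on a list of equal-length numeric tuples returns the selected k and the score for each candidate. The silhouette rewards tight, well-separated groups, so it tends to peak at the number of groups that are actually distinct in the feature space.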
Clustering analysis of moving target signatures

NASA Astrophysics Data System (ADS)

Martone, Anthony; Ranney, Kenneth; Innocenti, Roberto

2010-04-01

Previously, we developed a moving target indication (MTI) processing approach to detect and track slow-moving targets inside buildings, which successfully detected moving targets (MTs) from data collected by a low-frequency, ultra-wideband radar. Our MTI algorithms include change detection, automatic target detection (ATD), clustering, and tracking. The MTI algorithms can be implemented in a real-time or near-real-time system; however, a person-in-the-loop is needed to select input parameters for the clustering algorithm. Specifically, the number of clusters to input into the cluster algorithm is unknown and requires manual selection. A critical need exists to automate all aspects of the MTI processing formulation. In this paper, we investigate two techniques that automatically determine the number of clusters: the adaptive knee-point (KP) algorithm and the recursive pixel finding (RPF) algorithm. The KP algorithm is based on a well-known heuristic approach for determining the number of clusters. The RPF algorithm is analogous to the image processing, pixel labeling procedure. Both algorithms are used to analyze the false alarm and detection rates of three operational scenarios of personnel walking inside wood and cinderblock buildings.
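The abstract describes the KP algorithm only as based on a well-known heuristic, so the following is a sketch of one common knee-point rule, not the authors' adaptive version: given a clustering cost evaluated for several candidate cluster counts, pick the count whose point lies farthest from the straight line joining the first and last points of the curve. The function name and the unnormalized axes are assumptions.

```python
import math

def knee_point(ks, costs):
    """Return the k whose (k, cost) point is farthest from the chord that
    joins the first and last points of the cost curve."""
    if len(ks) != len(costs) or len(ks) < 3:
        raise ValueError("need at least three (k, cost) pairs of equal length")
    x0, y0, x1, y1 = ks[0], costs[0], ks[-1], costs[-1]
    norm = math.hypot(x1 - x0, y1 - y0)

    def distance_to_chord(x, y):
        # Perpendicular distance from (x, y) to the line through the endpoints.
        return abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0)) / norm

    best = max(zip(ks, costs), key=lambda kc: distance_to_chord(*kc))
    return best[0]
```

Because the rule is geometric, its answer depends on how the two axes are scaled; rescaling both to a common range before calling the function is a frequent refinement.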
Application of Geostatistical Methods and Machine Learning for spatio-temporal Earthquake Cluster Analysis

NASA Astrophysics Data System (ADS)

Schaefer, A. M.; Daniell, J. E.; Wenzel, F.

2014-12-01

Earthquake clustering tends to be an increasingly important part of general earthquake research especially in terms of seismic hazard assessment and earthquake forecasting and prediction approaches. The distinct identification and definition of foreshocks, aftershocks, mainshocks and secondary mainshocks is taken into account using a point based spatio-temporal clustering algorithm originating from the field of classic machine learning. This can be further applied for declustering purposes to separate background seismicity from triggered seismicity. The results are interpreted and processed to assemble 3D-(x,y,t) earthquake clustering maps which are based on smoothed seismicity records in space and time. In addition, multi-dimensional Gaussian functions are used to capture clustering parameters for spatial distribution and dominant orientations. Clusters are further processed using methodologies originating from geostatistics, which have been mostly applied and developed in mining projects during the last decades. A 2.5D variogram analysis is applied to identify spatio-temporal homogeneity in terms of earthquake density and energy output. The results are mitigated using Kriging to provide an accurate mapping solution for clustering features. As a case study, seismic data of New Zealand and the United States is used, covering events since the 1950s, from which an earthquake cluster catalogue is assembled for most of the major events, including a detailed analysis of the Landers and Christchurch sequences.
Graph analysis of cell clusters forming vascular networks

NASA Astrophysics Data System (ADS)

Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.

2018-03-01

This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca, endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequences of images of the vascular growth were thresholded and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessel density, the number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics such as the degree distribution, average clustering coefficient, average shortest path length and assortativity, among others. This analysis makes it possible to monitor how the largest connected cluster of the vascular network evolves in time and provides quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.
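Two of the network metrics listed in this abstract, the degree distribution and the average clustering coefficient, can be computed from an edge list as sketched below. The edge-list input and the function names are assumptions for illustration; how the authors extracted the graph from the thresholded images is not described in the abstract.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        raise ValueError("graph has no nodes")
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue
        # Number of neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)
```

For a tree-like network the average clustering coefficient stays close to zero, since a tree contains no triangles, so tracking it over time is one simple way to see departures from a tree structure.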
HRLSim: a high performance spiking neural network simulator for GPGPU clusters.

PubMed

Minkovich, Kirill; Thibeault, Corey M; O'Brien, Michael John; Nogin, Aleksey; Cho, Youngkwan; Srinivasa, Narayan

2014-02-01

Modeling of large-scale spiking neural models is an important tool in the quest to understand brain function and subsequently create real-world applications. This paper describes a spiking neural network simulator environment called HRL Spiking Simulator (HRLSim). This simulator is suitable for implementation on a cluster of general purpose graphical processing units (GPGPUs). Novel aspects of HRLSim are described and an analysis of its performance is provided for various configurations of the cluster. With the advent of inexpensive GPGPU cards and compute power, HRLSim offers an affordable and scalable tool for design, real-time simulation, and analysis of large-scale spiking neural networks.
Microarray characterization of gene expression changes in blood during acute ethanol exposure

PubMed Central

2013-01-01

Background As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples from a five-time point study of subjects administered ethanol orally, followed by breathalyzer analysis, to monitor blood alcohol concentration (BAC) to discover significant gene expression changes in response to the ethanol exposure. Methods Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data was analyzed in a pipeline fashion to summarize and normalize and the results evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm relative transcript levels across time to those detected in microarrays. Results Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by qRT-PCR. The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters. 
These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway. Conclusions The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance. PMID:23883607
Atlas-Guided Cluster Analysis of Large Tractography Datasets

PubMed Central

Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

2013-01-01

Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292
The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

PubMed

Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

2014-01-01

Meta-analysis of previous studies evaluating associations between content of elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples from 45 Greenlandic Inuit, median age 60 years and from 71 Danes, median age 61 years. Cirrhotic liver samples from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis, creating a dual dendrogram, one clustering element contents according to calculated similarities, one clustering elements according to correlation coefficients between the element contents, both using Euclidean distance and the Ward procedure. One dendrogram separated subjects in 7 clusters showing no differences in ethnicity, gender or age. The analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram clustered elements in four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse the content of elements separately one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.
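The dual clustering described in this abstract can be illustrated with a minimal sketch. It assumes SciPy's Ward linkage on Euclidean distances; the synthetic matrix `content`, the standardisation step and the way elements are represented by their rows of the correlation matrix are illustrative assumptions, not the study's data or exact procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
elements = ["S", "Cl", "K", "Fe", "Cu", "Zn", "Br"]
# Hypothetical data: 40 subjects x 7 elements (synthetic values only).
content = rng.normal(size=(40, len(elements)))

# Standardise each element so that no single element dominates the distance.
z = (content - content.mean(axis=0)) / content.std(axis=0)

# Dendrogram 1: cluster subjects by their element-content profiles.
subject_tree = linkage(z, method="ward", metric="euclidean")
subject_labels = fcluster(subject_tree, t=7, criterion="maxclust")

# Dendrogram 2: cluster elements, each described by its row of
# correlation coefficients with all elements.
corr = np.corrcoef(z, rowvar=False)
element_tree = linkage(corr, method="ward", metric="euclidean")
element_labels = fcluster(element_tree, t=4, criterion="maxclust")

print(dict(zip(elements, element_labels.tolist())))
```

The cut levels (7 subject clusters, 4 element clusters) mirror the counts reported in the abstract; with random data the resulting groups carry no meaning.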
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacobson, Heather R.; Pilachowski, Catherine A.; Friel, Eileen D., E-mail: jacob189@msu.edu, E-mail: catyp@astro.indiana.edu, E-mail: edfriel@mac.com

We present a detailed chemical abundance study of evolved stars in 10 open clusters based on Hydra multi-object echelle spectra obtained with the WIYN 3.5 m telescope. From an analysis of both equivalent widths and spectrum synthesis, abundances have been determined for the elements Fe, Na, O, Mg, Si, Ca, Ti, Ni, Zr, and for two of the 10 clusters, Al and Cr. To our knowledge, this is the first detailed abundance analysis for clusters NGC 1245, NGC 2194, NGC 2355, and NGC 2425. These 10 clusters were selected for analysis because they span a Galactocentric distance range R_gc ~ 9-13 kpc, the approximate location of the transition between the inner and outer disks. Combined with cluster samples from our previous work and those of other studies in the literature, we explore abundance trends as a function of cluster R_gc, age, and [Fe/H]. As found previously by us and other studies, the [Fe/H] distribution appears to decrease with increasing R_gc to a distance of ~12 kpc and then flattens to a roughly constant value in the outer disk. Cluster average element [X/Fe] ratios appear to be independent of R_gc, although the picture for [O/Fe] is more complicated with a clear trend of [O/Fe] with [Fe/H] and sample incompleteness. Other than oxygen, no other element [X/Fe] exhibits a clear trend with [Fe/H]; likewise, there does not appear to be any strong correlation between abundance and cluster age. We divided clusters into different age bins to explore temporal variations in the radial element distributions. The radial metallicity gradient appears to have flattened slightly as a function of time, as found by other studies. There is also some indication that the transition from the inner disk metallicity gradient to the approximately constant [Fe/H] distribution of the outer disk occurs at different Galactocentric radii for different age bins. 
However, interpretation of the time evolution of radial abundance distributions is complicated by the unequal R_gc and [Fe/H] ranges spanned by clusters in different age bins.
Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

PubMed

Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

2016-04-01

Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glascoe, L G; Glaser, R E; Chin, H S

2004-06-17

The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.
Detecting dominant motion patterns in crowds of pedestrians

NASA Astrophysics Data System (ADS)

Saqib, Muhammad; Khan, Sultan Daud; Blumenstein, Michael

2017-02-01

As the population of the world increases, urbanization generates crowding situations that pose challenges to public safety and security. Manual analysis of crowded situations is a tedious job and usually prone to errors. In this paper, we propose a novel technique of crowd analysis, the aim of which is to detect different dominant motion patterns in real-time videos. A motion field is generated by computing the dense optical flow. The motion field is then divided into blocks. For each block, we adopt an Intra-clustering algorithm for detecting different flows within the block. Later on, we employ Inter-clustering for clustering the flow vectors among different blocks. We evaluate the performance of our approach on different real-time videos. The experimental results show that our proposed method is capable of detecting distinct motion patterns in crowded videos. Moreover, our algorithm outperforms state-of-the-art methods.
Statistical Analysis of Small-Scale Magnetic Flux Emergence Patterns: A Useful Subsurface Diagnostic?

NASA Astrophysics Data System (ADS)

Lamb, Derek A.

2016-10-01

While sunspots follow a well-defined pattern of emergence in space and time, small-scale flux emergence is assumed to occur randomly at all times in the quiet Sun. HMI's full-disk coverage, high cadence, spatial resolution, and duty cycle allow us to probe that basic assumption. Some case studies of emergence suggest that temporal clustering on spatial scales of 50-150 Mm may occur. If clustering is present, it could serve as a diagnostic of large-scale subsurface magnetic field structures. We present the results of a manual survey of small-scale flux emergence events over a short time period, and a statistical analysis addressing the question of whether these events show spatio-temporal behavior that is anything other than random.

Fast EEG spike detection via eigenvalue analysis and clustering of spatial amplitude distribution

NASA Astrophysics Data System (ADS)

Fukami, Tadanori; Shimada, Takamasa; Ishikawa, Bunnoshin

2018-06-01

Objective. In the current study, we tested a proposed method for fast spike detection in electroencephalography (EEG). Approach. We performed eigenvalue analysis in two-dimensional space spanned by gradients calculated from two neighboring samples to detect high-amplitude negative peaks. We extracted the spike candidates by imposing restrictions on parameters regarding spike shape and eigenvalues reflecting detection characteristics of individual medical doctors. We subsequently performed clustering, classifying detected peaks by considering the amplitude distribution at 19 scalp electrodes. Clusters with a small number of candidates were excluded. We then defined a score for eliminating spike candidates for which the pattern of detected electrodes differed from the overall pattern in a cluster. Spikes were detected by setting the score threshold. Main results. Based on visual inspection by a psychiatrist experienced in EEG, we evaluated the proposed method using two statistical measures of precision and recall with respect to detection performance. We found that precision and recall exhibited a trade-off relationship. The average recall value was 0.708 in eight subjects with the score threshold that maximized the F-measure, with 58.6 ± 36.2 spikes per subject. Under this condition, the average precision was 0.390, corresponding to a false positive rate 2.09 times higher than the true positive rate. Analysis of the required processing time revealed that, using a general-purpose computer, our method could be used to perform spike detection in 12.1% of the recording time. The process of narrowing down spike candidates based on shape occupied most of the processing time. Significance. Although the average recall value was comparable with that of other studies, the proposed method significantly shortened the processing time.
Village-based spatio-temporal cluster analysis of the schistosomiasis risk in the Poyang Lake Region, China.

PubMed

Xia, Congcong; Bergquist, Robert; Lynn, Henry; Hu, Fei; Lin, Dandan; Hao, Yuwan; Li, Shizhu; Hu, Yi; Zhang, Zhijie

2017-03-08

The Poyang Lake Region, one of the major epidemic sites of schistosomiasis in China, remains a severe challenge. To improve our understanding of the current endemic status of schistosomiasis and to better control the transmission of the disease in the Poyang Lake Region, it is important to analyse the clustering pattern of schistosomiasis and detect the hotspots of transmission risk. Based on annual surveillance data at the village level in this region from 2009 to 2014, spatial and temporal cluster analyses were conducted to assess the pattern of schistosomiasis infection risk among humans through purely spatial (Local Moran's I, Kulldorff and Flexible scan statistic) and space-time scan statistics (Kulldorff). A dramatic decline was found in the infection rate during the study period, which was shown to be maintained at a low level. The number of spatial clusters declined over time, and the clusters were concentrated in counties around Poyang Lake, including Yugan, Yongxiu, Nanchang, Xingzi, Xinjian, De'an as well as Pengze, situated along the Yangtze River and the most serious area found in this study. Space-time analysis revealed that the clustering time frame appeared between 2009 and 2011 and the most likely cluster with the widest range was particularly concentrated in Pengze County. This study detected areas at high risk for schistosomiasis both in space and time at the village level from 2009 to 2014 in the Poyang Lake Region. The high-risk areas are now more concentrated and mainly distributed at the river inflows to Poyang Lake and along the Yangtze River in Pengze County. It was assumed that the water projects including reservoirs and a recently breached dyke in this area were partly to blame. This study points out that attempts to reduce the negative effects of water projects in China should focus on the Poyang Lake Region.
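One of the purely spatial statistics named in this abstract, Local Moran's I, can be sketched in a few lines. This is a minimal illustration assuming row-standardised weights and a toy chain of six hypothetical villages; the infection rates and the adjacency structure are invented for the example and are not taken from the study.

```python
import numpy as np

def local_morans_i(values, weights):
    """Local Moran's I_i = (z_i / m2) * sum_j w_ij * z_j with row-standardised weights."""
    z = values - values.mean()
    m2 = (z ** 2).mean()
    w = weights / weights.sum(axis=1, keepdims=True)   # each row sums to 1
    return z / m2 * (w @ z)

# Hypothetical infection rates for six villages lying along a river.
rates = np.array([9.0, 8.0, 9.0, 1.0, 2.0, 1.0])

# Neighbouring villages are the adjacent ones in the chain.
adjacency = np.zeros((6, 6))
for i in range(5):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

local_i = local_morans_i(rates, adjacency)
print(np.round(local_i, 3))
```

Positive values flag villages surrounded by similar rates (high-high or low-low), while negative values flag villages whose neighbours differ, as at the boundary between the two groups here. Significance would normally be assessed by permutation, which is omitted.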
Time series clustering analysis of health-promoting behavior

NASA Astrophysics Data System (ADS)

Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

2013-10-01

Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January, 2012 to March, 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and has been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.
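The combination of an autocorrelation-based representation with fuzzy c-means mentioned in this abstract can be sketched as follows. This is a minimal illustration on synthetic usage series; the lag range, the fuzzifier `m = 2`, the number of clusters and all data are assumptions made for the example and do not reproduce the study's configuration.

```python
import numpy as np

def autocorr_features(series, max_lag):
    """Autocorrelation coefficients at lags 1..max_lag."""
    x = series - series.mean()
    denom = (x ** 2).sum()
    return np.array([(x[:-k] * x[k:]).sum() / denom for k in range(1, max_lag + 1)])

def fuzzy_c_means(data, c, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means updates; returns memberships and cluster centres."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(data), c))
    u /= u.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ data) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)           # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers

rng = np.random.default_rng(5)
weekly = np.tile([1.0, 0, 0, 0, 0, 0, 0], 20)    # hypothetical: one session per week
users = [weekly + rng.normal(0, 0.1, weekly.size) for _ in range(5)]
users += [rng.random(weekly.size) for _ in range(5)]   # no regular rhythm
features = np.array([autocorr_features(s, max_lag=14) for s in users])
memberships, centers = fuzzy_c_means(features, c=2)
print(np.round(memberships, 2))
```

Representing each series by its autocorrelation coefficients makes series of different lengths or phases comparable, which is one plausible reason for choosing such a representation before clustering.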
Noise-free accurate count of microbial colonies by time-lapse shadow image analysis.

PubMed

Ogawa, Hiroyuki; Nasu, Senshi; Takeshige, Motomu; Funabashi, Hisakage; Saito, Mikako; Matsuoka, Hideaki

2012-12-01

Microbial colonies in food matrices could be counted accurately by a novel noise-free method based on time-lapse shadow image analysis. An agar plate containing many clusters of microbial colonies and/or meat fragments was trans-illuminated to project their 2-dimensional (2D) shadow images on a color CCD camera. The 2D shadow images of every cluster distributed within a 3-mm thick agar layer were captured in focus simultaneously by means of a multiple focusing system, and were then converted to 3-dimensional (3D) shadow images. By time-lapse analysis of the 3D shadow images, it was determined whether each cluster comprised single or multiple colonies or a meat fragment. The analytical precision was high enough to be able to distinguish a microbial colony from a meat fragment, to recognize an oval image as two colonies contacting each other, and to detect microbial colonies hidden under a food fragment. The detection of hidden colonies is its outstanding performance in comparison with other systems. The present system attained accuracy for counting fewer than 5 colonies and is therefore of practical importance. Copyright © 2012 Elsevier B.V. All rights reserved.
A new approach for evaluating flexible working hours.

PubMed

Giebel, Ole; Janssen, Daniela; Schomann, Carsten; Nachreiner, Friedhelm

2004-01-01

Recent studies on flexible working hours show that at least some of these working time arrangements seem to be associated with impairing effects on health and well-being. According to available evidence, variability of working hours seems to play an important role. The question, however, is how this variability can be assessed and used to explain or predict impairments. Based on earlier methods used to assess shift-work effects, a time series analysis approach was applied to the matter of flexible working hours. Data on the working hours over 4 weeks from 137 respondents, derived from a survey on flexible work hours involving 15 companies of different production and service sectors in Germany, were converted to time series and analyzed by spectral analysis. A cluster analysis of the resulting power spectra yielded 5 clusters of flexible work hours. Analyzing these clusters for differences in reported impairments showed that workers who showed suppression of circadian and weekly rhythms experienced the most severe impairments, especially in circadian-controlled functions like sleep and digestion. The results thus indicate that analyzing the periodicity of flexible working hours seems to be a promising approach for predicting impairments, which should be investigated further in the future.
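The conversion of working hours to a time series and its spectral analysis can be sketched as below. This is a minimal illustration assuming an hourly 0/1 coding (1 = working) over 4 weeks and a plain FFT periodogram; the two schedules are hypothetical and the coding scheme is an assumption, since the abstract does not specify it.

```python
import numpy as np

hours = np.arange(4 * 7 * 24)                 # 4 weeks, one sample per hour
hour_of_day = hours % 24
day_of_week = (hours // 24) % 7

# Hypothetical regular schedule: 09:00-17:00, Monday to Friday.
regular = ((hour_of_day >= 9) & (hour_of_day < 17) & (day_of_week < 5)).astype(float)

def power_spectrum(series):
    """Periodogram of a mean-removed series sampled once per hour."""
    centred = series - series.mean()
    power = np.abs(np.fft.rfft(centred)) ** 2 / len(series)
    freqs = np.fft.rfftfreq(len(series), d=1.0)   # cycles per hour
    return freqs, power

def power_at_period(series, period_hours):
    """Spectral power at the frequency bin closest to 1 / period_hours."""
    freqs, power = power_spectrum(series)
    idx = np.argmin(np.abs(freqs - 1.0 / period_hours))
    return power[idx]

rng = np.random.default_rng(1)
irregular = rng.permutation(regular)          # same total hours, rhythm destroyed

for name, series in [("regular", regular), ("irregular", irregular)]:
    print(name, power_at_period(series, 24), power_at_period(series, 168))
```

A schedule with intact circadian (24 h) and weekly (168 h) rhythms concentrates its power at those periods, whereas the shuffled schedule does not. The power spectra, or features such as these two values, would then be passed to a cluster analysis.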
20 Years Spatial-Temporal Analysis of Dengue Fever and Hemorrhagic Fever in Mexico.

PubMed

Hernández-Gaytán, Sendy Isarel; Díaz-Vásquez, Francisco Javier; Duran-Arenas, Luis Gerardo; López Cervantes, Malaquías; Rothenberg, Stephen J

2017-10-01

Dengue Fever (DF) is a human vector-borne disease and a major public health problem worldwide. In Mexico, DF and Dengue Hemorrhagic Fever (DHF) cases have increased in recent years. The aim of this study was to identify variations in the spatial distribution of DF and DHF cases over time using space-time statistical analysis and geographic information systems. Official data of DF and DHF cases were obtained in 32 states from 1995-2015. Space-time scan statistics were used to determine the space-time clusters of DF and DHF cases nationwide, and a geographic information system was used to display the location of clusters. A total of 885,748 DF cases was registered of which 13.4% (n = 119,174) correspond to DHF in the 32 states from 1995-2015. The most likely cluster of DF (relative risk = 25.5) contained the states of Jalisco, Colima, and Nayarit, on the Pacific coast in 2009, and the most likely cluster of DHF (relative risk = 8.5) was in the states of Chiapas, Tabasco, Campeche, Oaxaca, Veracruz, Quintana Roo, Yucatán, Puebla, Morelos, and Guerrero principally on the Gulf coast over 2006-2015. The geographic distribution of DF and DHF cases has increased in recent years and cases are significantly clustered in two coastal areas (Pacific and Gulf of Mexico). This provides the basis for further investigation of risk factors as well as interventions in specific areas. Copyright © 2018 IMSS. Published by Elsevier Inc. All rights reserved.
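The space-time scan statistic used in this abstract ranks candidate space-time cylinders by a likelihood ratio. The sketch below assumes the commonly used Poisson form of Kulldorff's statistic, with expected counts scaled so that they sum to the total number of cases; the candidate cylinders and their counts are hypothetical.

```python
import math

def poisson_llr(cases_in, expected_in, cases_total):
    """Log-likelihood ratio of one candidate cylinder (high-rate clusters only)."""
    c, e, total = cases_in, expected_in, cases_total
    if c <= e:
        return 0.0                      # no excess of cases inside the cylinder
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def relative_risk(cases_in, expected_in, cases_total):
    """Risk inside the cylinder divided by risk outside it."""
    c, e, total = cases_in, expected_in, cases_total
    return (c / e) / ((total - c) / (total - e))

# Hypothetical candidates: (label, observed cases inside, expected cases inside).
total_cases = 1000
candidates = [("A", 120, 40.0), ("B", 60, 50.0), ("C", 30, 40.0)]

most_likely = max(candidates, key=lambda cand: poisson_llr(cand[1], cand[2], total_cases))
print(most_likely[0], round(relative_risk(most_likely[1], most_likely[2], total_cases), 2))
```

The candidate with the largest ratio is the most likely cluster. In practice its significance is obtained by Monte Carlo replication under the null hypothesis, which is omitted here.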
Variable number of tandem repeats and pulsed-field gel electrophoresis cluster analysis of enterohemorrhagic Escherichia coli serovar O157 strains.

PubMed

Yokoyama, Eiji; Uchimura, Masako

2007-11-01

Ninety-five enterohemorrhagic Escherichia coli serovar O157 strains, including 30 strains isolated from 13 intrafamily outbreaks and 14 strains isolated from 3 mass outbreaks, were studied by pulsed-field gel electrophoresis (PFGE) and variable number of tandem repeats (VNTR) typing, and the resulting data were subjected to cluster analysis. Cluster analysis of the VNTR typing data revealed that 57 (60.0%) of 95 strains, including all epidemiologically linked strains, formed clusters with at least 95% similarity. Cluster analysis of the PFGE patterns revealed that 67 (70.5%) of 95 strains, including all but 1 of the epidemiologically linked strains, formed clusters with 90% similarity. The number of epidemiologically unlinked strains forming clusters was significantly less by VNTR cluster analysis than by PFGE cluster analysis. The congruence value between PFGE and VNTR cluster analysis was low and did not show an obvious correlation. With two-step cluster analysis, the number of clustered epidemiologically unlinked strains by PFGE cluster analysis that were divided by subsequent VNTR cluster analysis was significantly higher than the number by VNTR cluster analysis that were divided by subsequent PFGE cluster analysis. These results indicate that VNTR cluster analysis is more efficient than PFGE cluster analysis as an epidemiological tool to trace the transmission of enterohemorrhagic E. coli O157.
A geo-computational algorithm for exploring the structure of diffusion progression in time and space.

PubMed

Chin, Wei-Chien-Benny; Wen, Tzai-Hung; Sabel, Clive E; Wang, I-Hsiang

2017-10-03

A diffusion process can be considered as the movement of linked events through space and time. Therefore, space-time locations of events are key to identifying any diffusion process. However, previous clustering analysis methods have focused only on space-time proximity characteristics, neglecting the temporal lag of the movement of events. We argue that the temporal lag between events is key to understanding the process of diffusion movement. Using the temporal lag could help to clarify the types of close relationships. This study aims to develop a data exploration algorithm, namely the TrAcking Progression In Time And Space (TaPiTaS) algorithm, for understanding diffusion processes. Based on the spatial distance and temporal interval between cases, TaPiTaS detects sub-clusters (groups of events that have a high probability of having common sources), identifies progression links (the relationships between sub-clusters), and tracks progression chains (the connected components of sub-clusters). Dengue fever case data were used as an illustrative case study. The location and temporal range of sub-clusters are presented, along with the progression links. The TaPiTaS algorithm contributes a more detailed and in-depth understanding of the development of progression chains, namely the geographic diffusion process.
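The idea of linking events by spatial distance and temporal lag and then tracking connected components can be illustrated with a much simplified sketch. This is not the published TaPiTaS algorithm: it skips the sub-cluster step and links individual events directly, and the thresholds `max_dist`, `min_lag` and `max_lag` as well as the event list are hypothetical.

```python
import math

def progression_chains(events, max_dist, min_lag, max_lag):
    """Group events linked by spatial closeness and a plausible temporal lag."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, (xi, yi, ti) in enumerate(events):
        for j, (xj, yj, tj) in enumerate(events):
            lag = tj - ti                   # positive when event j follows event i
            close = math.hypot(xj - xi, yj - yi) <= max_dist
            if min_lag <= lag <= max_lag and close:
                parent[find(j)] = find(i)   # merge the two components

    chains = {}
    for i in range(len(events)):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())

# Hypothetical cases as (x_km, y_km, day_of_onset).
events = [(0.0, 0.0, 0), (0.5, 0.0, 7), (1.0, 0.0, 14), (10.0, 10.0, 1), (10.2, 10.0, 30)]
chains = progression_chains(events, max_dist=1.0, min_lag=3, max_lag=10)
print(chains)
```

Requiring a minimum lag is what distinguishes a progression link from simple space-time proximity: two nearby cases with almost simultaneous onset are more plausibly infected from a common source than by one another.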
Geospatial Distribution and Clustering of Chlamydia trachomatis in Communities Undergoing Mass Azithromycin Treatment

PubMed Central

Yohannan, Jithin; He, Bing; Wang, Jiangxia; Greene, Gregory; Schein, Yvette; Mkocha, Harran; Munoz, Beatriz; Quinn, Thomas C.; Gaydos, Charlotte; West, Sheila K.

2014-01-01

Purpose. We detected spatial clustering of households with Chlamydia trachomatis infection (CI) and active trachoma (AT) in villages undergoing mass treatment with azithromycin (MDA) over time. Methods. We obtained global positioning system (GPS) coordinates for all households in four villages in Kongwa District, Tanzania. Every 6 months for a period of 42 months, our team examined all children under 10 for AT, and tested for CI with ocular swabbing and Amplicor. Villages underwent four rounds of annual MDA. We classified households as having ≥1 child with CI (or AT) or having 0 children with CI (or AT). We calculated the difference in the K function between households with and without CI or AT to detect clustering at each time point. Results. Between 918 and 991 households were included over the 42 months of this analysis. At baseline, 306 households (32.59%) had ≥1 child with CI, which declined to 73 households (7.50%) at 42 months. We observed borderline clustering of households with CI at 12 months after one round of MDA and statistically significant clustering with growing cluster sizes between 18 and 24 months after two rounds of MDA. Clusters diminished in size at 30 months after 3 rounds of MDA. Active trachoma did not cluster at any time point. Conclusions. This study demonstrates that CI clusters after multiple rounds of MDA. Clusters of infection may increase in size if the annual antibiotic pressure is removed. The absence of growth after the three rounds suggests the start of control of transmission. PMID:24906862
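The difference in the K function between two groups of households can be sketched as follows. This is a minimal illustration with a naive Ripley's K estimator without edge correction, applied to synthetic coordinates on a unit square; the household locations, the radius and the absence of a significance test are simplifying assumptions.

```python
import numpy as np

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one radius, without edge correction."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pairs_within = (dist <= radius).sum() - n      # drop the n self-pairs
    return area * pairs_within / (n * (n - 1))

rng = np.random.default_rng(42)
# Hypothetical village on a unit square.
uninfected = rng.uniform(0, 1, size=(200, 2))                 # spread out
infected = 0.5 + 0.05 * rng.standard_normal(size=(40, 2))     # one tight group

radius = 0.1
k_diff = ripley_k(infected, radius, 1.0) - ripley_k(uninfected, radius, 1.0)
print(round(k_diff, 3))
```

A positive difference indicates that infected households are more clustered than uninfected ones at that distance. Whether the difference is statistically significant is usually judged by randomly relabelling households many times and recomputing it, which is omitted here.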
Inferring HIV-1 Transmission Dynamics in Germany From Recently Transmitted Viruses.

PubMed

Pouran Yousef, Kaveh; Meixenberger, Karolin; Smith, Maureen R; Somogyi, Sybille; Gromöller, Silvana; Schmidt, Daniel; Gunsenheimer-Bartmeyer, Barbara; Hamouda, Osamah; Kücherer, Claudia; von Kleist, Max

2016-11-01

Although HIV continues to spread globally, novel intervention strategies such as treatment as prevention (TasP) may bring the epidemic to a halt. However, their effective implementation requires a profound understanding of the underlying transmission dynamics. We analyzed parameters of the German HIV epidemic based on phylogenetic clustering of viral sequences from recently infected seroconverters with known infection dates. Viral baseline and follow-up pol sequences (n = 1943) from 1159 drug-naïve individuals were selected from a nationwide long-term observational study initiated in 1997. Putative transmission clusters were computed based on a maximum likelihood phylogeny. Using individual follow-up sequences, we optimized our clustering threshold to maximize the likelihood of co-clustering individuals connected by direct transmission. The sizes of putative transmission clusters scaled inversely with their abundance and their distribution exhibited a heavy tail. Clusters based on the optimal clustering threshold were significantly more likely to contain members of the same or bordering German federal states. Interinfection times between co-clustered individuals were significantly shorter (26 weeks; interquartile range: 13-83) than in a null model. Viral intraindividual evolution may be used to select criteria that maximize co-clustering of transmission pairs in the absence of strong adaptive selection pressure. Interinfection times of co-clustered individuals may then be an indicator of the typical time to onward transmission. Our analysis suggests that onward transmission may have occurred early after infection, when individuals are typically unaware of their serological status. The latter argues that TasP should be combined with HIV testing campaigns to reduce the possibility of transmission before TasP initiation.
Photogrammetric Analysis of CPAS Main Parachutes

NASA Technical Reports Server (NTRS)

Ray, Eric; Bretz, David

2011-01-01

The Crew Exploration Vehicle Parachute Assembly System (CPAS) is being designed to land the Orion Crew Module (CM) at a safe rate of descent at splashdown with a cluster of two to three Main parachutes. The instantaneous rate of descent varies based on parachute fly-out angles and geometric inlet area. Parachutes in a cluster oscillate between significant fly-out angles and colliding into each other. The former presents a sub-optimal inlet area and the latter lowers the effective drag area as the parachutes interfere with each other. The fly-out angles are also important in meeting a twist torque requirement. Understanding cluster behavior necessitates measuring the Mains with photogrammetric analysis. Imagery from upward looking cameras is analyzed to determine parachute geometry. Fly-out angles are measured from each parachute vent to an axis determined from geometry. Determining the scale of the objects requires knowledge of camera and lens calibration as well as features of known size. Several points along the skirt are tracked to compute an effective circumference, diameter, and inlet area as a function of time. The effects of this geometry are clearly seen in the system drag coefficient time history. Photogrammetric analysis is key in evaluating the effects of design features such as an Over-Inflation Control Line (OICL), Main Line Length Ratio (MLLR), and geometric porosity, which are varied in an attempt to minimize cluster oscillations. The effects of these designs are evaluated through statistical analysis.
Relating anomaly correlation to lead time: Clustering analysis of CFSv2 forecasts of summer precipitation in China

NASA Astrophysics Data System (ADS)

Zhao, Tongtiegang; Liu, Pan; Zhang, Yongyong; Ruan, Chengqing

2017-09-01

Global climate model (GCM) forecasts are an integral part of long-range hydroclimatic forecasting. We propose to use clustering to explore anomaly correlation, which indicates the performance of raw GCM forecasts, in the three-dimensional space of latitude, longitude, and initialization time. Focusing on a certain period of the year, correlations for forecasts initialized at different preceding periods form a vector. The vectors of anomaly correlation across different GCM grid cells are clustered to reveal how GCM forecasts perform as time progresses. Through the case study of Climate Forecast System Version 2 (CFSv2) forecasts of summer precipitation in China, we observe that the correlation at a certain cell oscillates with lead time and can become negative. The use of clustering reveals two meaningful patterns that characterize the relationship between anomaly correlation and lead time. For some grid cells in Central and Southwest China, CFSv2 forecasts exhibit positive correlations with observations and they tend to improve as time progresses. This result suggests that CFSv2 forecasts tend to capture the summer precipitation induced by the East Asian monsoon and the South Asian monsoon. It also indicates that CFSv2 forecasts can potentially be applied to improving hydrological forecasts in these regions. For some other cells, the correlations are generally close to zero at different lead times. This outcome implies that CFSv2 forecasts still have plenty of room for further improvement. The robustness of the patterns has been tested using both hierarchical clustering and k-means clustering and examined with the Silhouette score.
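The vectors of anomaly correlation that this abstract clusters can be built as in the sketch below. It assumes that anomaly correlation is the Pearson correlation over years between forecast and observed anomalies, computed per grid cell and per initialization time; the array shapes and the synthetic forecasts are hypothetical.

```python
import numpy as np

def ac_vectors(forecasts, observed):
    """Anomaly correlation per grid cell and lead time.

    forecasts: (n_leads, n_years, n_cells); observed: (n_years, n_cells).
    Returns an array of shape (n_cells, n_leads), one vector per grid cell.
    """
    f = forecasts - forecasts.mean(axis=1, keepdims=True)   # forecast anomalies
    o = observed - observed.mean(axis=0, keepdims=True)     # observed anomalies
    num = (f * o[None, :, :]).sum(axis=1)
    den = np.sqrt((f ** 2).sum(axis=1) * (o ** 2).sum(axis=0)[None, :])
    return (num / den).T

rng = np.random.default_rng(7)
n_leads, n_years, n_cells = 4, 30, 40
observed = rng.standard_normal((n_years, n_cells))

# Synthetic forecasts: pure noise everywhere, plus the observed signal in the
# first 20 cells; the noise level differs between the four lead times.
noise_sd = np.array([0.5, 1.0, 1.5, 2.0])[:, None, None]
forecasts = noise_sd * rng.standard_normal((n_leads, n_years, n_cells))
forecasts[:, :, :20] += observed[:, :20]

vectors = ac_vectors(forecasts, observed)
print(np.round(vectors[:20].mean(axis=0), 2), np.round(vectors[20:].mean(axis=0), 2))
```

Each row of `vectors` is the kind of object that would be passed to hierarchical or k-means clustering, so that grid cells with a similar relationship between skill and lead time end up in the same group.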
Determination of Arctic sea ice variability modes on interannual timescales via nonhierarchical clustering

NASA Astrophysics Data System (ADS)

Fučkar, Neven-Stjepan; Guemas, Virginie; Massonnet, François; Doblas-Reyes, Francisco

2015-04-01

Over the modern observational era, the northern hemisphere sea ice concentration, age and thickness have experienced a sharp long-term decline superimposed with strong internal variability. Hence, there is a crucial need to identify robust patterns of Arctic sea ice variability on interannual timescales and disentangle them from the long-term trend in noisy datasets. The principal component analysis (PCA) is a versatile and broadly used method for the study of climate variability. However, the PCA has several limiting aspects because it assumes that all modes of variability have symmetry between positive and negative phases, and suppresses nonlinearities by using a linear covariance matrix. Clustering methods offer an alternative set of dimension reduction tools that are more robust and capable of taking into account possible nonlinear characteristics of a climate field. Cluster analysis aggregates data into groups or clusters based on their distance, to simultaneously minimize the distance between data points in a given cluster and maximize the distance between the centers of the clusters. We extract modes of Arctic interannual sea-ice variability with nonhierarchical K-means cluster analysis and investigate the mechanisms leading to these modes. Our focus is on the sea ice thickness (SIT) as the base variable for clustering because SIT holds most of the climate memory for variability and predictability on interannual timescales. We primarily use global reconstructions of sea ice fields with a state-of-the-art ocean-sea-ice model, but we also verify the robustness of determined clusters in other Arctic sea ice datasets. Applied cluster analysis over the 1958-2013 period shows that the optimal number of detrended SIT clusters is K=3. Determined SIT cluster patterns and their time series of occurrence are rather similar between different seasons and months. 
Two opposite thermodynamic modes are characterized with prevailing negative or positive SIT anomalies over the Arctic basin. The intermediate mode, with negative anomalies centered on the East Siberian shelf and positive anomalies along the North American side of the basin, has predominately dynamic characteristics. The associated sea ice concentration (SIC) clusters vary more between different seasons and months, but the SIC patterns are physically framed by the SIT cluster patterns.
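The nonhierarchical K-means step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the toy two-component "anomaly" vectors, the function names, and the deterministic farthest-first initialisation are all assumptions made for the example (the abstract does not say how the clusters were initialised).

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    """Assumed initialisation: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iteration: assign each point to its nearest center,
    then move each center to the mean of its members."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centers[j])  # keep an empty cluster's center
        if updated == centers:  # converged: no center moved
            break
        centers = updated
    return centers, labels


# Hypothetical detrended anomalies at two grid points for six years:
# two "negative" years, two "positive" years, two "mixed" years.
anomalies = [(-1.0, -1.1), (-0.9, -1.0), (1.0, 0.9),
             (1.1, 1.0), (-1.0, 1.0), (-0.9, 1.1)]
centers, labels = kmeans(anomalies, k=3)
```

With K=3 the toy data separate into a negative, a positive, and a mixed-sign group, loosely mirroring the three modes described above; real fields would have one component per grid cell.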
Analysis of radiation-induced small Cu particle cluster formation in aqueous CuCl2

USGS Publications Warehouse

Jayanetti, Sumedha; Mayanovic, Robert A.; Anderson, Alan J.; Bassett, William A.; Chou, I.-Ming

2001-01-01

Radiation-induced small Cu particle cluster formation in aqueous CuCl2 was analyzed. It was noticed that the nearest neighbor distance increased with the time of irradiation. This showed that the clusters approached the lattice dimension of bulk copper. As the average cluster size approached its bulk dimensions, an increase in the nearest neighbor coordination number was found with the decrease in the surface to volume ratio. Radiolysis of water by the incident X-ray beam led to the reduction of copper ions in the solution to the metallic state.
Lagged segmented Poincaré plot analysis for risk stratification in patients with dilated cardiomyopathy.

PubMed

Voss, Andreas; Fischer, Claudia; Schroeder, Rico; Figulla, Hans R; Goernig, Matthias

2012-07-01

The objectives of this study were to introduce a new type of heart-rate variability analysis improving risk stratification in patients with idiopathic dilated cardiomyopathy (DCM) and to provide additional information about impaired heart beat generation in these patients. Beat-to-beat intervals (BBI) of 30-min ECGs recorded from 91 DCM patients and 21 healthy subjects were analyzed applying the lagged segmented Poincaré plot analysis (LSPPA) method. LSPPA includes the Poincaré plot reconstruction with lags of 1-100, rotating the cloud of points, its normalized segmentation adapted to their standard deviations, and finally, a frequency-dependent clustering. The lags were combined into eight different clusters representing specific frequency bands within 0.012-1.153 Hz. Statistical differences between low- and high-risk DCM could be found within the clusters II-VIII (e.g., cluster IV: 0.033-0.038 Hz; p = 0.0002; sensitivity = 85.7 %; specificity = 71.4 %). The multivariate statistics led to a sensitivity of 92.9 %, specificity of 85.7 % and an area under the curve of 92.1 % discriminating these patient groups. We introduced the LSPPA method to investigate time correlations in BBI time series. We found that LSPPA contributes considerably to risk stratification in DCM and yields the highest discriminant power in the low and very low-frequency bands.
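The first stage of the method above, building a Poincaré plot at a chosen lag and describing the rotated point cloud, can be sketched as below. This is only an illustrative fragment under stated assumptions: it computes the conventional SD1/SD2 descriptors (spread across and along the line of identity after a 45-degree rotation) and does not reproduce the normalized segmentation or the frequency-dependent clustering of LSPPA. The function name and the toy interval series are hypothetical.

```python
import math
import statistics


def lagged_poincare_sd(bbi, lag=1):
    """Return (SD1, SD2) of the Poincaré plot of bbi[n] versus bbi[n + lag].

    SD1 is the spread perpendicular to the line of identity,
    SD2 the spread along it (both after a 45-degree rotation).
    """
    if lag < 1 or lag >= len(bbi):
        raise ValueError("lag must satisfy 1 <= lag < len(bbi)")
    x = bbi[:-lag]
    y = bbi[lag:]
    diffs = [b - a for a, b in zip(x, y)]
    sums = [b + a for a, b in zip(x, y)]
    sd1 = statistics.pstdev(diffs) / math.sqrt(2)
    sd2 = statistics.pstdev(sums) / math.sqrt(2)
    return sd1, sd2


# Hypothetical beat-to-beat intervals in ms, alternating every beat.
intervals = [800, 820] * 4
sd1_lag1, sd2_lag1 = lagged_poincare_sd(intervals, lag=1)
sd1_lag2, sd2_lag2 = lagged_poincare_sd(intervals, lag=2)
```

For a strictly alternating series, all variability at lag 1 lies across the line of identity, whereas at lag 2 it lies along it, which is why scanning lags (1-100 in the study) probes different time scales.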
Impacts of exploratory drilling for oil and gas on the benthic environment of Georges Bank

USGS Publications Warehouse

Neff, J. M.; Bothner, Michael H.; Maciolek, N. J.; Grassle, J. F.

1989-01-01

Cluster analysis revealed a strong relationship between community structure and both sediment type and water depth. Little seasonal variation was detected, but some interannual differences were revealed by cluster analysis and correspondence analysis. The replicates from a station always resembled each other more than they resembled any replicates from other stations. In addition, the combined replicates from a station always clustered with samples from that station taken on other cruises. This excellent replication and uniformity of the benthic infaunal community at a station over time made it possible to detect very subtle changes in community parameters that might be related to discharges of drilling fluid and drill cuttings. Nevertheless, no changes were detected in benthic communities of Georges Bank that could be attributed to drilling activities.
Accurate calibration of a molecular beam time-of-flight mass spectrometer for on-line analysis of high molecular weight species.

PubMed

Apicella, B; Wang, X; Passaro, M; Ciajolo, A; Russo, C

2016-10-15

Time-of-Flight (TOF) Mass Spectrometry is a powerful analytical technique, provided that an accurate calibration by standard molecules in the same m/z range as the analytes is performed. Calibration in a very large m/z range is a difficult task, particularly in studies focusing on the detection of high molecular weight clusters of different molecules or high molecular weight species. External calibration is the most common procedure used for TOF mass spectrometric analysis in the gas phase and, generally, the only available standards are made up of mixtures of noble gases, covering a small mass range for calibration, up to m/z 136 (the highest-mass isotope of xenon). In this work, an accurate calibration of a Molecular Beam Time-of-Flight Mass Spectrometer (MB-TOFMS) is presented, based on the use of water clusters up to m/z 3000. The advantages of calibrating an MB-TOFMS with water clusters for the detection of analytes with masses above those of the traditional calibrants such as noble gases were quantitatively shown by statistical calculations. A comparison of the water cluster and noble gas calibration procedures in attributing the masses to a test mixture extending up to m/z 800 is also reported. In the case of the analysis of combustion products, another important feature of water cluster calibration was shown, that is, the possibility of using the clusters as an "internal standard" directly formed from the combustion water, under suitable experimental conditions. The water cluster calibration of an MB-TOFMS gives rise to a ten-fold reduction in error compared to the traditional calibration with noble gases. The consequent improvement in mass accuracy in the calibration of an MB-TOFMS has important implications in various fields where detection of high molecular mass species is required. In combustion products analysis, it is also possible to obtain a new calibration spectrum before the acquisition of each spectrum, only modifying some operative conditions.
Copyright © 2016 John Wiley & Sons, Ltd.
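Why calibrants spanning the analyte range matter can be illustrated with the textbook TOF relation t = t0 + k * sqrt(m/z). The sketch below fits k and t0 by least squares and then converts flight times to m/z. It is an assumption-laden illustration, not the calibration procedure of the paper: the calibrant masses are taken as n x 18.0106 u (neutral water mass, ignoring any charge carrier), the flight times are synthetic, and the function names are hypothetical.

```python
import math


def fit_tof_calibration(mz_values, flight_times):
    """Least-squares fit of t = t0 + k * sqrt(m/z); returns (k, t0)."""
    roots = [math.sqrt(mz) for mz in mz_values]
    n = len(roots)
    mean_r = sum(roots) / n
    mean_t = sum(flight_times) / n
    s_rr = sum((r - mean_r) ** 2 for r in roots)
    s_rt = sum((r - mean_r) * (t - mean_t) for r, t in zip(roots, flight_times))
    k = s_rt / s_rr
    t0 = mean_t - k * mean_r
    return k, t0


def time_to_mz(flight_time, k, t0):
    """Invert the calibration: m/z = ((t - t0) / k) ** 2."""
    return ((flight_time - t0) / k) ** 2


# Assumed calibrants: water clusters of n molecules, mass ~ n * 18.0106 u.
calibrant_mz = [18.0106 * n for n in (1, 5, 20, 60, 120)]
# Synthetic flight times generated with k = 2.0 and t0 = 0.5 (arbitrary units).
calibrant_t = [0.5 + 2.0 * math.sqrt(mz) for mz in calibrant_mz]

k, t0 = fit_tof_calibration(calibrant_mz, calibrant_t)
mz_at_800 = time_to_mz(0.5 + 2.0 * math.sqrt(800.0), k, t0)
```

Because the fit is in sqrt(m/z), small errors in k or t0 from a narrow calibrant range grow quadratically when extrapolated to high masses, which is consistent with the motivation given above for calibrants reaching m/z 3000.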
Approximate cluster analysis method and three-dimensional diagram of optical characteristics of lunar surface

NASA Astrophysics Data System (ADS)

Yevsyukov, N. N.

1985-09-01

An approximate isolation algorithm for the isolation of multidimensional clusters is developed and applied in the construction of a three-dimensional diagram of the optical characteristics of the lunar surface. The method is somewhat analogous to that of Koontz and Fukunaga (1972) and involves isolating two-dimensional clusters, adding a new characteristic, and linearizing, a cycle which is repeated a limited number of times. The lunar-surface parameters analyzed are the 620-nm albedo, the 620/380-nm color index, and the 950/620-nm index. The results are presented graphically; the reliability of the cluster-isolation process is discussed; and some correspondences between known lunar morphology and the cluster maps are indicated.
Electronic structure and optical properties of the thiolate-protected Au28(SMe)20 cluster.

PubMed

Knoppe, Stefan; Malola, Sami; Lehtovaara, Lauri; Bürgi, Thomas; Häkkinen, Hannu

2013-10-10

The recently reported crystal structure of the Au28(TBBT)20 cluster (TBBT: p-tert-butylbenzenethiolate) is analyzed with (time-dependent) density functional theory (TD-DFT). Bader charge analysis reveals a novel trimeric Au3(SR)4 binding motif. The cluster can be formulated as Au14(Au2(SR)3)4(Au3(SR)4)2. The electronic structure of the Au14(6+) core and the ligand-protected cluster were analyzed, and their stability can be explained by formation of distorted eight-electron superatoms. Optical absorption and circular dichroism (CD) spectra were calculated and compared to the experiment. Assignment of handedness of the intrinsically chiral cluster is possible.
MC 2: Subaru and Hubble Space Telescope Weak-lensing Analysis of the Double Radio Relic Galaxy Cluster PLCK G287.0+32.9

DOE Office of Scientific and Technical Information (OSTI.GOV)

Finner, Kyle; Jee, M. James; Golovich, Nathan

The second most significant detection of the Planck Sunyaev-Zel'dovich survey, PLCK G287.0+32.9 (z = 0.385), boasts two similarly bright radio relics and a radio halo. One radio relic is located $$\sim 400\,\mathrm{kpc}$$ NW of the X-ray peak and the other $$\sim 2.8$$ Mpc to the SE. This large difference suggests that a complex merging scenario is required. A key missing puzzle for the merging scenario reconstruction is the underlying dark matter distribution in high resolution. Here, we present a joint Subaru Telescope and Hubble Space Telescope weak-lensing analysis of the cluster. Our analysis shows that the mass distribution features four significant substructures. Of the substructures, a primary cluster of mass $$M_{200\mathrm{c}} = 1.59_{-0.22}^{+0.25} \times 10^{15}\ h_{70}^{-1}\ M_{\odot}$$ dominates the weak-lensing signal. This cluster is likely to be undergoing a merger with one (or more) subcluster whose mass is approximately a factor of 10 lower. One candidate is the subcluster of mass $$M_{200\mathrm{c}} = 1.16_{-0.13}^{+0.15} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$ located $$\sim 400\,\mathrm{kpc}$$ to the SE. The location of this subcluster suggests that its interaction with the primary cluster could be the source of the NW radio relic. Another subcluster is detected $$\sim 2$$ Mpc to the SE of the X-ray peak with mass $$M_{200\mathrm{c}} = 1.68_{-0.20}^{+0.22} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$. This SE subcluster is in the vicinity of the SE radio relic and may have created the SE radio relic during a past merger with the primary cluster. The fourth subcluster, $$M_{200\mathrm{c}} = 1.87_{-0.22}^{+0.24} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$, is NW of the X-ray peak and beyond the NW radio relic.

MC 2: Subaru and Hubble Space Telescope Weak-lensing Analysis of the Double Radio Relic Galaxy Cluster PLCK G287.0+32.9

DOE PAGES

Finner, Kyle; Jee, M. James; Golovich, Nathan; ...

2017-12-11

The second most significant detection of the Planck Sunyaev-Zel'dovich survey, PLCK G287.0+32.9 (z = 0.385), boasts two similarly bright radio relics and a radio halo. One radio relic is located $$\sim 400\,\mathrm{kpc}$$ NW of the X-ray peak and the other $$\sim 2.8$$ Mpc to the SE. This large difference suggests that a complex merging scenario is required. A key missing puzzle for the merging scenario reconstruction is the underlying dark matter distribution in high resolution. Here, we present a joint Subaru Telescope and Hubble Space Telescope weak-lensing analysis of the cluster. Our analysis shows that the mass distribution features four significant substructures. Of the substructures, a primary cluster of mass $$M_{200\mathrm{c}} = 1.59_{-0.22}^{+0.25} \times 10^{15}\ h_{70}^{-1}\ M_{\odot}$$ dominates the weak-lensing signal. This cluster is likely to be undergoing a merger with one (or more) subcluster whose mass is approximately a factor of 10 lower. One candidate is the subcluster of mass $$M_{200\mathrm{c}} = 1.16_{-0.13}^{+0.15} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$ located $$\sim 400\,\mathrm{kpc}$$ to the SE. The location of this subcluster suggests that its interaction with the primary cluster could be the source of the NW radio relic. Another subcluster is detected $$\sim 2$$ Mpc to the SE of the X-ray peak with mass $$M_{200\mathrm{c}} = 1.68_{-0.20}^{+0.22} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$. This SE subcluster is in the vicinity of the SE radio relic and may have created the SE radio relic during a past merger with the primary cluster. The fourth subcluster, $$M_{200\mathrm{c}} = 1.87_{-0.22}^{+0.24} \times 10^{14}\ h_{70}^{-1}\ M_{\odot}$$, is NW of the X-ray peak and beyond the NW radio relic.
Computational cluster validation for microarray data analysis: experimental assessment of Clest, Consensus Clustering, Figure of Merit, Gap Statistics and Model Explorer.

PubMed

Giancarlo, Raffaele; Scaturro, Davide; Utro, Filippo

2008-10-29

Inferring cluster structure in microarray datasets is a fundamental task for the so-called -omic sciences. It is also a fundamental question in Statistics, Data Analysis and Classification, in particular with regard to the prediction of the number of clusters in a dataset, usually established via internal validation measures. Despite the wealth of internal measures available in the literature, new ones have been recently proposed, some of them specifically for microarray data. We consider five such measures: Clest, Consensus (Consensus Clustering), FOM (Figure of Merit), Gap (Gap Statistics) and ME (Model Explorer), in addition to the classic WCSS (Within Cluster Sum-of-Squares) and KL (Krzanowski and Lai index). We perform extensive experiments on six benchmark microarray datasets, using both Hierarchical and K-means clustering algorithms, and we provide an analysis assessing both the intrinsic ability of a measure to predict the correct number of clusters in a dataset and its merit relative to the other measures. We pay particular attention both to precision and speed. Moreover, we also provide various fast approximation algorithms for the computation of Gap, FOM and WCSS. The main result is a hierarchy of those measures in terms of precision and speed, highlighting some of their merits and limitations not reported before in the literature. Based on our analysis, we draw several conclusions for the use of those internal measures on microarray data. We report the main ones. Consensus is by far the best performer in terms of predictive power and remarkably algorithm-independent. Unfortunately, on large datasets, it may be of no use because of its non-trivial computer time demand (weeks on a state of the art PC). FOM is the second best performer although, quite surprisingly, it may not be competitive in this scenario: it has essentially the same predictive power of WCSS but it is from 6 to 100 times slower in time, depending on the dataset. 
The approximation algorithms for the computation of FOM, Gap and WCSS perform very well, i.e., they are faster while still granting a very close approximation of FOM and WCSS. The approximation algorithm for the computation of Gap deserves to be singled out since it has a predictive power far better than Gap, it is competitive with the other measures, and it is at least two orders of magnitude faster than Gap. Another important novel conclusion that can be drawn from our analysis is that all the measures we have considered show severe limitations on large datasets, either due to computational demand (Consensus, as already mentioned, Clest and Gap) or to lack of precision (all of the other measures, including their approximations). The software and datasets are available under the GNU GPL on the supplementary material web page.
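Of the measures compared above, WCSS (Within Cluster Sum-of-Squares) is the simplest to state: for a given partition, sum the squared distances of every point to its cluster centroid. A minimal sketch follows, assuming points are given as numeric tuples and a clustering as a parallel list of labels; the function name and the toy data are hypothetical and unrelated to the benchmark datasets of the study.

```python
from collections import defaultdict


def wcss(points, labels):
    """Within-cluster sum of squares for a labelled set of points."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        for point in members:
            total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


# Toy one-dimensional data with two obvious groups.
data = [(0.0,), (2.0,), (10.0,), (12.0,)]
wcss_two_clusters = wcss(data, [0, 0, 1, 1])
wcss_one_cluster = wcss(data, [0, 0, 0, 0])
```

WCSS always decreases as the number of clusters grows, so in practice one looks for the value of K beyond which the decrease levels off; here the drop from one cluster to two is large, suggesting K=2 for the toy data.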
Information jet: Handling noisy big data from weakly disconnected network

NASA Astrophysics Data System (ADS)

Aurongzeb, Deeder

Sudden aggregation (information jet) of a large amount of data is ubiquitous around connected social networks, driven by sudden interacting and non-interacting events, network security threat attacks, online sales channels, etc. Clustering of information jets based on time series analysis and graph theory is not new, but little work has been done to connect them with particle jet statistics. We show pre-clustering based on context can element soft network or network of information which is critical to minimize time to calculate results from noisy big data. We show the difference between stochastic gradient boosting and time series-graph clustering. For disconnected higher dimensional information jets, we use the Kallenberg representation theorem (Kallenberg, 2005, arXiv:1401.1137) to identify and eliminate jet similarities from dense or sparse graphs.
Operating room scheduling using hybrid clustering priority rule and genetic algorithm

NASA Astrophysics Data System (ADS)

Santoso, Linda Wahyuni; Sinawan, Aisyah Ashrinawati; Wijaya, Andi Rahadiyan; Sudiarso, Andi; Masruroh, Nur Aini; Herliansyah, Muhammad Kusumawan

2017-11-01

The operating room is a bottleneck resource in most hospitals, so the operating room scheduling system influences the overall performance of the hospital. This research develops a mathematical model of operating room scheduling for elective patients which considers patient priority with a limited number of surgeons, operating rooms, and nurse teams. Clustering analysis was conducted on the surgery duration data using hierarchical and non-hierarchical methods. The priority rule of each resulting cluster was determined using the Shortest Processing Time method. A Genetic Algorithm was used to generate the daily operating room schedule which resulted in the lowest values of patient waiting time and nurse overtime. The computational results show that the proposed model reduced patient waiting time by approximately 32.22% and nurse overtime by approximately 32.74% when compared to the actual schedule.
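The Shortest Processing Time rule mentioned above can be sketched for a single operating room: cases are ordered by expected duration, and the total waiting time is the sum of the start times. This is a simplified illustration only. It ignores surgeon and nurse-team constraints, the clustering step, and the genetic algorithm, and the durations and function names are hypothetical.

```python
def spt_order(durations):
    """Return case indices sorted by shortest processing time first."""
    return sorted(range(len(durations)), key=lambda i: durations[i])


def total_waiting_time(durations, order):
    """Sum of waiting times when cases run back to back in the given order."""
    elapsed = 0
    total = 0
    for case in order:
        total += elapsed           # this case waits for all earlier ones
        elapsed += durations[case]
    return total


# Hypothetical surgery durations in minutes, in order of arrival.
surgery_minutes = [90, 30, 60]

arrival_order = [0, 1, 2]
shortest_first = spt_order(surgery_minutes)

wait_arrival = total_waiting_time(surgery_minutes, arrival_order)
wait_spt = total_waiting_time(surgery_minutes, shortest_first)
```

For a single resource, shortest-first ordering minimises total waiting time, which is why it is a natural priority rule within each duration cluster.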
Application of microarray analysis on computer cluster and cloud platforms.

PubMed

Bernau, C; Boulesteix, A-L; Knaus, J

2013-01-01

Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
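The computational independence of permutation iterations noted above can be illustrated with a two-sample permutation test split into chunks, each with its own random seed. This is a minimal sketch under stated assumptions: a thread pool stands in for cluster nodes or cloud instances purely to keep the example self-contained (in CPython, threads will not speed up this CPU-bound loop), and the function names, chunk sizes, and data are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean_diff(a, b):
    return sum(a) / len(a) - sum(b) / len(b)


def count_extreme(task):
    """Run one independent chunk of permutations; return how many permuted
    mean differences are at least as extreme as the observed one."""
    group_a, group_b, n_perm, seed = task
    rng = random.Random(seed)      # separate seed per chunk
    observed = abs(mean_diff(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a = pooled[:len(group_a)]
        perm_b = pooled[len(group_a):]
        if abs(mean_diff(perm_a, perm_b)) >= observed:
            hits += 1
    return hits


def permutation_p_value(group_a, group_b, n_chunks=4, perms_per_chunk=100):
    """Distribute the chunks over workers and combine the counts."""
    tasks = [(group_a, group_b, perms_per_chunk, seed) for seed in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        hits = sum(pool.map(count_extreme, tasks))
    total = n_chunks * perms_per_chunk
    return (hits + 1) / (total + 1)   # add-one correction avoids p = 0


p_separated = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
```

Because each chunk needs only the data and a seed, the same `count_extreme` function could be dispatched to separate processes or machines without any communication between workers until the final sum.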
Chapter 7. Cloning and analysis of natural product pathways.

PubMed

Gust, Bertolt

2009-01-01

The identification of gene clusters of natural products has led to an enormous wealth of information about their biosynthesis and its regulation, and about self-resistance mechanisms. Well-established routine techniques are now available for the cloning and sequencing of gene clusters. The subsequent functional analysis of the complex biosynthetic machinery requires efficient genetic tools for manipulation. Until recently, techniques for the introduction of defined changes into Streptomyces chromosomes were very time-consuming. In particular, manipulation of large DNA fragments has been challenging due to the absence of suitable restriction sites for restriction- and ligation-based techniques. The homologous recombination approach called recombineering (referred to as Red/ET-mediated recombination in this chapter) has greatly facilitated targeted genetic modifications of complex biosynthetic pathways from actinomycetes by eliminating many of the time-consuming and labor-intensive steps. This chapter describes techniques for the cloning and identification of biosynthetic gene clusters, for the generation of gene replacements within such clusters, for the construction of integrative library clones and their expression in heterologous hosts, and for the assembly of entire biosynthetic gene clusters from the inserts of individual library clones. A systematic approach toward insertional mutation of a complete Streptomyces genome is shown by the use of an in vitro transposon mutagenesis procedure.
Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

PubMed

NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

2017-08-01

Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Limits on turbulent propagation of energy in cool-core clusters of galaxies

NASA Astrophysics Data System (ADS)

Bambic, C. J.; Pinto, C.; Fabian, A. C.; Sanders, J.; Reynolds, C. S.

2018-07-01

We place constraints on the propagation velocity of bulk turbulence within the intracluster medium of three clusters and an elliptical galaxy. Using Reflection Grating Spectrometer measurements of turbulent line broadening, we show that for these clusters, the 90 per cent upper limit on turbulent velocities when accounting for instrumental broadening is too low to propagate energy radially to the cooling radius of the clusters within the required cooling time. In this way, we extend previous Hitomi-based analysis on the Perseus cluster to more clusters, with the intention of applying these results to a future, more extensive catalogue. These results constrain models of turbulent heating in active galactic nucleus feedback by requiring a mechanism which can not only provide sufficient energy to offset radiative cooling but also resupply that energy rapidly enough to balance cooling at each cluster radius.
Limits on turbulent propagation of energy in cool-core clusters of galaxies

NASA Astrophysics Data System (ADS)

Bambic, C. J.; Pinto, C.; Fabian, A. C.; Sanders, J.; Reynolds, C. S.

2018-04-01

We place constraints on the propagation velocity of bulk turbulence within the intracluster medium of three clusters and an elliptical galaxy. Using Reflection Grating Spectrometer measurements of turbulent line broadening, we show that for these clusters, the 90% upper limit on turbulent velocities when accounting for instrumental broadening is too low to propagate energy radially to the cooling radius of the clusters within the required cooling time. In this way, we extend previous Hitomi-based analysis on the Perseus cluster to more clusters, with the intention of applying these results to a future, more extensive catalog. These results constrain models of turbulent heating in AGN feedback by requiring a mechanism which can not only provide sufficient energy to offset radiative cooling, but resupply that energy rapidly enough to balance cooling at each cluster radius.
Long-term memory and volatility clustering in high-frequency price changes

NASA Astrophysics Data System (ADS)

Oh, Gabjin; Kim, Seunghwan; Eom, Cheoljun

2008-02-01

We studied the long-term memory in diverse stock market indices and foreign exchange rates using Detrended Fluctuation Analysis (DFA). For all high-frequency market data studied, no significant long-term memory property was detected in the return series, while a strong long-term memory property was found in the volatility time series. The possible causes of the long-term memory property were investigated using the return data filtered by the AR(1) model, reflecting the short-term memory property, the GARCH(1,1) model, reflecting the volatility clustering property, and the FIGARCH model, reflecting the long-term memory property of the volatility time series. The memory effect in the AR(1) filtered return and volatility time series remained unchanged, while the long-term memory property diminished significantly in the volatility series of the GARCH(1,1) filtered data. Notably, there is no long-term memory property, when we eliminate the long-term memory property of volatility by the FIGARCH model. For all data used, although the Hurst exponents of the volatility time series changed considerably over time, those of the time series with the volatility clustering effect removed diminish significantly. Our results imply that the long-term memory property of the volatility time series can be attributed to the volatility clustering observed in the financial time series.
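Detrended Fluctuation Analysis, as used in this study, can be sketched in a few lines: integrate the mean-removed series, remove a linear trend in non-overlapping windows of size n, and read the scaling exponent from the slope of log F(n) against log n. The version below is a minimal first-order DFA under stated assumptions; the window sizes, the synthetic Gaussian series, and the function names are illustrative and are not taken from the paper.

```python
import math
import random


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, scales):
    """First-order DFA scaling exponent (assumes a non-constant series)."""
    mean = sum(series) / len(series)
    profile, running = [], 0.0
    for value in series:               # cumulative sum of deviations
        running += value - mean
        profile.append(running)

    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        sq_sum = 0.0
        idx = list(range(n))
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(idx, segment)
            sq_sum += sum((y - (slope * i + intercept)) ** 2
                          for i, y in zip(idx, segment))
        fluctuation = math.sqrt(sq_sum / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    alpha, _ = linear_fit(log_n, log_f)
    return alpha


rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
alpha_noise = dfa_exponent(white_noise, scales=[8, 16, 32, 64, 128])
```

An exponent near 0.5 indicates no long-term memory, while values clearly above 0.5 indicate persistence; in the study this comparison is made between return series and volatility series.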
Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data

PubMed Central

Tian, Ting; McLachlan, Geoffrey J.; Dieters, Mark J.; Basford, Kaye E.

2015-01-01

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the overall best performance. PMID:26689369
Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data.

PubMed

Tian, Ting; McLachlan, Geoffrey J; Dieters, Mark J; Basford, Kaye E

2015-01-01

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the overall best performance.
Peculiarities of the irisation in precious opals in view of their mosaic-cluster (frustumation) inner fabric

NASA Astrophysics Data System (ADS)

Povarennykh, M. Yu.; Knot'ko, A. V.; Matvienko, E. N.; Plechov, P. Yu.; Burmistrov, A. A.; Luksha, V. L.

2016-04-01

A direct correlation was shown for the first time between mosaic irisation patterns in synthetic and natural precious opals (from Australia, Ethiopia, Honduras, Slovakia, and Russia) and their frustumational (lump or mosaic-cluster) inner structure by means of photoluminescence, X-ray phase analysis, IR and Raman spectroscopy, and scanning electron microscopy.
The writer independent online handwriting recognition system frog on hand and cluster generative statistical dynamic time warping.

PubMed

Bahlmann, Claus; Burkhardt, Hans

2004-03-01

In this paper, we give a comprehensive description of our writer-independent online handwriting recognition system frog on hand. The focus of this work concerns the presentation of the classification/training approach, which we call cluster generative statistical dynamic time warping (CSDTW). CSDTW is a general, scalable, HMM-based method for variable-sized, sequential data that holistically combines cluster analysis and statistical sequence modeling. It can handle general classification problems that rely on this sequential type of data, e.g., speech recognition, genome processing, robotics, etc. Contrary to previous attempts, clustering and statistical sequence modeling are embedded in a single feature space and use a closely related distance measure. We show character recognition experiments of frog on hand using CSDTW on the UNIPEN online handwriting database. The recognition accuracy is significantly higher than reported results of other handwriting recognition systems. Finally, we describe the real-time implementation of frog on hand on a Linux Compaq iPAQ embedded device.
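CSDTW builds on dynamic time warping, which aligns two sequences of different lengths by allowing elements to be repeated. The sketch below shows only the classic deterministic DTW distance with an absolute-difference local cost. It is not the statistical, HMM-based CSDTW of the paper, and the function name and toy sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same stroke written at two speeds aligns perfectly.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
# Two flat sequences offset by one unit differ at every aligned step.
offset = dtw_distance([0, 0], [1, 1])
```

The warping makes the measure insensitive to writing speed, which is the property that motivates its use for variable-sized sequential data such as pen trajectories.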
Faster sequence homology searches by clustering subsequences.

PubMed

Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka

2015-04-15

Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. In measurements on metagenomic data, GHOSTZ was ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Graduation of fertility schedules: an analysis of fertility patterns in London in the 1980s and an application to fertility forecasts.

PubMed

Congdon, P

1990-08-01

London's average total fertility rate (TFR) stood at 1.75. Using a cluster analysis to compare the 1985-1987 fertility patterns of different boroughs of London, demographers learned that 5 natural groupings occurred. 4 boroughs in a central London cluster have the distinction of having a low TFR (1.38) and late fertility (average age of 29.58 years). The researchers attributed these occurrences to the high levels of employment and career attachment and low rates of marriage among women in this cluster. 2 inner city boroughs constituted the smallest cluster and had the largest TFR (2.37), mainly due to high numbers of births to the ethnic minorities. The largest cluster consisted of 12 boroughs located mainly along the periphery with 2 centrally located boroughs (TFR, 1.79). Some of the upper class outer boroughs characterized another cluster with a TFR of 1.61. Another cluster made up of inner and outer boroughs in east and southeast London had an ample proportion of manual workers (TFR, 2.04). Social class most likely accounted for the contrast in TFRs between the 2 aforementioned clusters. Demographers observed that cyclical fluctuation of fertility occurred as opposed to secular trends. Due to these fluctuations, demographers fitted autoregressive moving average forecast models to time series of the fertility variables in London since 1952. They also applied structural time series models which included regression variables and the influence of cyclical and/or trend behavior. The results showed that large cohorts and the increase in female economic activity caused a delay in the modal age of births and a reduction in the number of births.
UQlust: combining profile hashing with linear-time ranking for efficient clustering and analysis of big macromolecular data.

PubMed

Adamczak, Rafal; Meller, Jarek

2016-12-28

Advances in computing have enabled current protein and RNA structure prediction and molecular simulation methods to dramatically increase their sampling of conformational spaces. The quickly growing number of experimentally resolved structures, and databases such as the Protein Data Bank, also implies large scale structural similarity analyses to retrieve and classify macromolecular data. Consequently, the computational cost of structure comparison and clustering for large sets of macromolecular structures has become a bottleneck that necessitates further algorithmic improvements and development of efficient software solutions. uQlust is a versatile and easy-to-use tool for ultrafast ranking and clustering of macromolecular structures. uQlust makes use of structural profiles of proteins and nucleic acids, while combining a linear-time algorithm for implicit comparison of all pairs of models with profile hashing to enable efficient clustering of large data sets with a low memory footprint. In addition to ranking and clustering of large sets of models of the same protein or RNA molecule, uQlust can also be used in conjunction with fragment-based profiles in order to cluster structures of arbitrary length. For example, hierarchical clustering of the entire PDB using profile hashing can be performed on a typical laptop, thus opening an avenue for structural explorations previously limited to dedicated resources. The uQlust package is freely available under the GNU General Public License at https://github.com/uQlust. uQlust represents a drastic reduction in the computational complexity and memory requirements with respect to existing clustering and model quality assessment methods for macromolecular structure analysis, while yielding results on par with traditional approaches for both proteins and RNAs.
Hebbian self-organizing integrate-and-fire networks for data clustering.

PubMed

Landis, Florian; Ott, Thomas; Stoop, Ruedi

2010-01-01

We propose a Hebbian learning-based data clustering algorithm using spiking neurons. The algorithm is capable of distinguishing between clusters and noisy background data and finds an arbitrary number of clusters of arbitrary shape. These properties render the approach particularly useful for visual scene segmentation into arbitrarily shaped homogeneous regions. We present several application examples, and in order to highlight the advantages and the weaknesses of our method, we systematically compare the results with those from standard methods such as the k-means and Ward's linkage clustering. The analysis demonstrates that the clustering ability of the proposed algorithm is more powerful than that of the two competing methods, and that its time complexity is more modest than that of its generally used strongest competitor.
Whole Blood Gene Expression Profiling Predicts Severe Morbidity and Mortality in Cystic Fibrosis: A 5-Year Follow-Up Study.

PubMed

Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A

2018-05-01

Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. 
The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.
Monitoring Wetland Hydro-dynamics in the Prairie Pothole Region Using Landsat Time Series

NASA Astrophysics Data System (ADS)

Zhou, Q.; Rover, J.; Gallant, A.

2017-12-01

Wetlands provide a variety of ecosystem functions and are spatially and temporally dynamic. We mapped the dynamics of wetlands in the North Dakota Prairie Pothole Region using all available clear observations of Landsat sensor data from 1985 to 2014. We used a cluster analysis to group pixels exhibiting similar long-term spectral trends over seven Landsat bands, then applied the tasseled-cap transformation to evaluate the temporal characteristics of brightness, greenness, and wetness for each cluster. We tested relations between these three indices and hydrologic conditions, as represented by the Palmer Hydrological Drought Index (PHDI), using the cross-correlation analysis for each cluster performed over an eight-year moving window for the 30 years covered by the study. This temporal window size coincided with the timing of a major shift from a prolonged drought that occurred within the first eight years of the study period to wetter conditions that prevailed throughout the remaining years. The 20 clusters we produced represented a gradient from locations that continuously held water throughout the study period to locations that, at most, held water only for short periods in some years. The spatial distribution of the cluster groups reflected patterns of regional geologic and geomorphologic features. Comparisons of the PHDI to tasseled-cap wetness were the most straightforward to interpret among the results from the three indices. Wetness for most cluster groups had high positive correlations with PHDI during drought years, with the correlations reduced as the landscape entered a lengthy, wetter period; however, wetness generally remained highly and positively correlated with PHDI across all years for four cluster groups where the area exhibited two or more multi-year dry-wet cycles. These same four groups also had strong, generally negative correlations with tasseled-cap brightness. 
For other cluster groups, brightness often was strongly negatively correlated with the PHDI during the drought years, with the relation weakening for subsequent years of adequate or high moisture. Relations between tasseled-cap greenness and PHDI were highly variable among and within cluster groups. Results from this analysis support ongoing efforts to develop new products that characterize wetland dynamics.
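The moving-window correlation step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: one value per year for each series, a zero-lag Pearson correlation inside each window, and hypothetical function names (`pearson`, `moving_window_correlation`). The abstract does not specify the lags or the software the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def moving_window_correlation(index, drought, window=8):
    """Correlate a spectral index with a drought index over sliding windows.

    index, drought: annual values for one cluster (assumed layout).
    Returns one correlation per window start year.
    """
    if len(index) != len(drought):
        raise ValueError("series must have equal length")
    return [
        pearson(index[i:i + window], drought[i:i + window])
        for i in range(len(index) - window + 1)
    ]
```

With 30 annual values and an eight-year window, this yields 23 correlations per cluster, one for each window position.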

MOCCA-SURVEY Database I: Is NGC 6535 a dark star cluster harbouring an IMBH?

NASA Astrophysics Data System (ADS)

Askar, Abbas; Bianchini, Paolo; de Vita, Ruggero; Giersz, Mirek; Hypki, Arkadiusz; Kamann, Sebastian

2017-01-01

We describe the dynamical evolution of a unique type of dark star cluster model in which the majority of the cluster mass at Hubble time is dominated by an intermediate-mass black hole (IMBH). We analysed results from about 2000 star cluster models (Survey Database I) simulated using the Monte Carlo code MOnte Carlo Cluster simulAtor and identified these dark star cluster models. Taking one of these models, we apply the method of simulating realistic `mock observations' by utilizing the Cluster simulatiOn Comparison with ObservAtions (COCOA) and Simulating Stellar Cluster Observation (SISCO) codes to obtain the photometric and kinematic observational properties of the dark star cluster model at 12 Gyr. We find that the perplexing Galactic globular cluster NGC 6535 closely matches the observational photometric and kinematic properties of the dark star cluster model presented in this paper. Based on our analysis and currently observed properties of NGC 6535, we suggest that this globular cluster could potentially harbour an IMBH. If it exists, the presence of this IMBH can be detected robustly with proposed kinematic observations of NGC 6535.
Temporality in British young women's magazines: food, cooking and weight loss.

PubMed

Spencer, Rosemary J; Russell, Jean M; Barker, Margo E

2014-10-01

The present study examines seasonal and temporal patterns in food-related content of two UK magazines for young women focusing on food types, cooking and weight loss. Content analysis of magazines from three time blocks between 1999 and 2011. Desk-based study. Ninety-seven magazines yielding 590 advertisements and 148 articles. Cluster analysis of type of food advertising produced three clusters of magazines, which reflected recognised food behaviours of young women: vegetarianism, convenience eating and weight control. The first cluster of magazines was associated with Christmas and Millennium time periods, with advertising of alcohol, coffee, cheese, vegetarian meat substitutes and weight-loss pills. Recipes were prominent in article content and tended to be for cakes/desserts, luxury meals and party food. The second cluster was associated with summer months and 2010 issues. There was little advertising for conventional foods in cluster 2, but strong representation of diet plans and foods for weight loss. Weight-loss messages in articles focused on short-term aesthetic goals, emphasising speedy weight loss without giving up nice foods or exercising. Cluster 3 magazines were associated with post-New Year and 2005 periods. Food advertising was for everyday foods and convenience products, with fewer weight-loss products than other clusters; conversely, article content had a greater prevalence of weight-loss messages. The cyclical nature of magazine content - indulgence and excess encouraged at Christmas, restraint recommended post-New Year and severe dieting advocated in the summer months - endorses yo-yo dieting behaviour and may not be conducive to public health.
A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

PubMed Central

Liu, Jingxian; Wu, Kefeng

2017-01-01

The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance, and data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. 
Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with traditional spectral clustering and fast affinity propagation clustering. Experimental results have illustrated its superior performance in terms of quantitative and qualitative evaluations. PMID:28777353
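The first two steps of the method described above (pairwise DTW distances assembled into a distance matrix) can be sketched as follows. This is a textbook DTW recurrence, not the authors' code; it assumes trajectories are lists of planar (x, y) points compared by Euclidean distance, and the function names are hypothetical. The later PCA and center selection steps are not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances, zero on the diagonal."""
    k = len(trajectories)
    mat = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            mat[i][j] = mat[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return mat
```

Because DTW allows a point to be matched repeatedly, two tracks that follow the same route with different sampling densities get a distance of zero, which is the property that makes it attractive for irregularly sampled AIS reports.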
Minimum number of clusters and comparison of analysis methods for cross sectional stepped wedge cluster randomised trials with binary outcomes: A simulation study.

PubMed

Barker, Daniel; D'Este, Catherine; Campbell, Michael J; McElduff, Patrick

2017-03-09

Stepped wedge cluster randomised trials frequently involve a relatively small number of clusters. The most common frameworks used to analyse data from these types of trials are generalised estimating equations and generalised linear mixed models. A topic of much research into these methods has been their application to cluster randomised trial data and, in particular, the number of clusters required to make reasonable inferences about the intervention effect. However, for stepped wedge trials, which have been claimed by many researchers to have a statistical power advantage over the parallel cluster randomised trial, the minimum number of clusters required has not been investigated. We conducted a simulation study where we considered the most commonly used methods suggested in the literature to analyse cross-sectional stepped wedge cluster randomised trial data. We compared the per cent bias, the type I error rate and power of these methods in a stepped wedge trial setting with a binary outcome, where there are few clusters available and when the appropriate adjustment for a time trend is made, which by design may be confounding the intervention effect. We found that the generalised linear mixed modelling approach is the most consistent when few clusters are available. We also found that none of the common analysis methods for stepped wedge trials were both unbiased and maintained a 5% type I error rate when there were only three clusters. Of the commonly used analysis approaches, we recommend the generalised linear mixed model for small stepped wedge trials with binary outcomes. We also suggest that in a stepped wedge design with three steps, at least two clusters be randomised at each step, to ensure that the intervention effect estimator maintains the nominal 5% significance level and is also reasonably unbiased.
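To make the recommended layout concrete, the sketch below builds the treatment schedule for a stepped wedge design. It assumes the standard form with one baseline period in which no cluster is treated, followed by one period per step. The function name is hypothetical, and the random allocation of clusters to steps is deliberately left out.

```python
def stepped_wedge_schedule(n_steps, clusters_per_step):
    """Return a clusters x periods matrix of 0/1 treatment indicators.

    Clusters assigned to step s (1-based) are untreated before period s
    and treated from period s onward; period 0 is the all-control baseline.
    """
    n_periods = n_steps + 1
    schedule = []
    for step in range(1, n_steps + 1):
        row = [1 if period >= step else 0 for period in range(n_periods)]
        # One copy of the row per cluster crossing over at this step
        schedule.extend(list(row) for _ in range(clusters_per_step))
    return schedule
```

Calling `stepped_wedge_schedule(3, 2)` gives the six-cluster, four-period layout that corresponds to the abstract's suggestion of three steps with two clusters randomised at each step.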
Spatial-temporal clustering of companion animal enteric syndrome: detection and investigation through the use of electronic medical records from participating private practices.

PubMed

Anholt, R M; Berezowski, J; Robertson, C; Stephen, C

2015-09-01

There is interest in the potential of companion animal surveillance to provide data to improve pet health and to provide early warning of environmental hazards to people. We implemented a companion animal surveillance system in Calgary, Alberta and the surrounding communities. Informatics technologies automatically extracted electronic medical records from participating veterinary practices and identified cases of enteric syndrome in the warehoused records. The data were analysed using time-series analyses and a retrospective space-time permutation scan statistic. We identified a seasonal pattern of reports of occurrences of enteric syndromes in companion animals and four statistically significant clusters of enteric syndrome cases. The cases within each cluster were examined and information about the animals involved (species, age, sex), their vaccination history, possible exposure or risk behaviour history, information about disease severity, and the aetiological diagnosis was collected. We then assessed whether the cases within the cluster were unusual and if they represented an animal or public health threat. There was often insufficient information recorded in the medical record to characterize the clusters by aetiology or exposures. Space-time analysis of companion animal enteric syndrome cases found evidence of clustering. Collection of more epidemiologically relevant data would enhance the utility of practice-based companion animal surveillance.
Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis.

PubMed

Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai

2015-08-01

This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction. © 2014 Wiley Publishing Asia Pty Ltd.
Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials.

PubMed

Kasza, J; Hemming, K; Hooper, R; Matthews, Jns; Forbes, A B

2017-01-01

Stepped wedge and cluster randomised crossover trials are examples of cluster randomised designs conducted over multiple time periods that are being used with increasing frequency in health research. Recent systematic reviews of both of these designs indicate that the within-cluster correlation is typically taken account of in the analysis of data using a random intercept mixed model, implying a constant correlation between any two individuals in the same cluster no matter how far apart in time they are measured: within-period and between-period intra-cluster correlations are assumed to be identical. Recently proposed extensions allow the within- and between-period intra-cluster correlations to differ, although these methods require that all between-period intra-cluster correlations are identical, which may not be appropriate in all situations. Motivated by a proposed intensive care cluster randomised trial, we propose an alternative correlation structure for repeated cross-sectional multiple-period cluster randomised trials in which the between-period intra-cluster correlation is allowed to decay depending on the distance between measurements. We present results for the variance of treatment effect estimators for varying amounts of decay, investigating the consequences of the variation in decay on sample size planning for stepped wedge, cluster crossover and multiple-period parallel-arm cluster randomised trials. We also investigate the impact of assuming constant between-period intra-cluster correlations instead of decaying between-period intra-cluster correlations. Our results indicate that in certain design configurations, including the one corresponding to the proposed trial, a correlation decay can have an important impact on variances of treatment effect estimators, and hence on sample size and power. An R Shiny app allows readers to interactively explore the impact of correlation decay.
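The contrast between a constant and a decaying between-period correlation can be illustrated numerically. The abstract says only that the correlation decays with the distance between measurements; the exponential form used below (within-period correlation `icc` multiplied by `decay` raised to the number of periods apart) is an assumption of this sketch, as are the function and parameter names.

```python
def block_correlation(n_periods, icc, decay):
    """Correlation between two subjects of one cluster, by pair of periods.

    Entry [s][t] is icc * decay**|s - t|. Setting decay = 1 recovers the
    constant correlation implied by a random-intercept mixed model.
    """
    return [
        [icc * decay ** abs(s - t) for t in range(n_periods)]
        for s in range(n_periods)
    ]
```

For example, with `icc=0.05` and `decay=0.5`, subjects measured two periods apart have a correlation of 0.0125 rather than the 0.05 a random-intercept model would assume.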
MC 2: Dynamical Analysis of the Merging Galaxy Cluster MACS J1149.5+2223

DOE PAGES

Golovich, Nathan; Dawson, William A.; Wittman, David; ...

2016-10-31

Here, we present an analysis of the merging cluster MACS J1149.5+2223 using archival imaging from Subaru/Suprime-Cam and multi-object spectroscopy from Keck/DEIMOS and Gemini/GMOS. We employ two- and three-dimensional substructure tests and determine that MACS J1149.5+2223 is composed of two separate mergers among three subclusters occurring ~1 Gyr apart. The primary merger gives rise to elongated X-ray morphology and a radio relic in the southeast. The brightest cluster galaxy is a member of the northern subcluster of the primary merger. This subcluster is very massive ($$16.7_{-1.60}^{+1.25}\times 10^{14}\,M_{\odot}$$). The southern subcluster is also very massive ($$10.8_{-3.54}^{+3.37}\times 10^{14}\,M_{\odot}$$), yet it lacks an associated X-ray surface brightness peak, and it has been unidentified previously despite the detailed study of this Frontier Field cluster. A secondary merger is occurring in the north along the line of sight (LOS) with a third, less massive subcluster ($$1.20_{-0.34}^{+0.19}\times 10^{14}\,M_{\odot}$$). We perform a Monte Carlo dynamical analysis on the main merger and estimate a collision speed at pericenter of $$2770_{-310}^{+610}$$ km s$$^{-1}$$. We show the merger to be returning from apocenter with core passage occurring $$1.16_{-0.25}^{+0.50}$$ Gyr before the observed state. We identify the LOS merging subcluster in a strong lensing analysis in the literature and show that it is likely bound to MACS J1149 despite having reached an extreme collision velocity of ~4000 km s$$^{-1}$$.
Closed-cage tungsten oxide clusters in the gas phase.

PubMed

Singh, D M David Jeba; Pradeep, T; Thirumoorthy, Krishnan; Balasubramanian, Krishnan

2010-05-06

During the course of a study on the clustering of W-Se and W-S mixtures in the gas phase using laser desorption ionization (LDI) mass spectrometry, we observed several anionic W-O clusters. Three distinct species, W(6)O(19)(-), W(13)O(29)(-), and W(14)O(32)(-), stand out as intense peaks in the regular mass spectral pattern of tungsten oxide clusters suggesting unusual stabilities for them. Moreover, these clusters do not fragment in the postsource decay analysis. While trying to understand the precursor material, which produced these clusters, we found the presence of nanoscale forms of tungsten oxide. The structure and thermodynamic parameters of tungsten clusters have been explored using relativistic quantum chemical methods. Our computed results of atomization energy are consistent with the observed LDI mass spectra. The computational results suggest that the clusters observed have closed-cage structure. These distinct W(13) and W(14) clusters were observed for the first time in the gas phase.
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Science and Real-Time Decision Support

NASA Astrophysics Data System (ADS)

Wright, D. J.; Raad, M.; Hoel, E.; Park, M.; Mollenkopf, A.; Trujillo, R.

2016-12-01

Introduced is a new approach for processing spatiotemporal big data by leveraging distributed analytics and storage. A suite of temporally-aware analysis tools summarizes data nearby or within variable windows, aggregates points (e.g., for various sensor observations or vessel positions), reconstructs time-enabled points into tracks (e.g., for mapping and visualizing storm tracks), joins features (e.g., to find associations between features based on attributes, spatial relationships, temporal relationships or all three simultaneously), calculates point densities, finds hot spots (e.g., in species distributions), and creates space-time slices and cubes (e.g., in microweather applications with temperature, humidity, and pressure, or within human mobility studies). These "feature geo analytics" tools run in both batch and streaming spatial analysis mode as distributed computations across a cluster of servers on typical "big" data sets, where static data exist in traditional geospatial formats (e.g., shapefile) locally on a disk or file share, attached as static spatiotemporal big data stores, or streamed in near-real-time. In other words, the approach registers large datasets or data stores with ArcGIS Server, then distributes analysis across a cluster of machines for parallel processing. Several brief use cases will be highlighted based on a 16-node server cluster at 14 Gb RAM per node, allowing, for example, the buffering of over 8 million points or thousands of polygons in 1 minute. The approach is "hybrid" in that ArcGIS Server integrates open-source big data frameworks such as Apache Hadoop and Apache Spark on the cluster in order to run the analytics. In addition, the user may devise and connect custom open-source interfaces and tools developed in Python or Python Notebooks; the common denominator being the familiar REST API.
Real-time analysis of self-assembled nucleobases by Venturi easy ambient sonic-spray ionization mass spectrometry.

PubMed

Na, Na; Shi, Ruixia; Long, Zi; Lu, Xin; Jiang, Fubin; Ouyang, Jin

2014-10-01

In this study, the real-time analysis of self-assembled nucleobases was performed by Venturi easy ambient sonic-spray ionization mass spectrometry (V-EASI-MS). With the analysis of three nucleobases including 6-methyluracil (6MU), uracil (U) and thymine (T) as examples, different orders of clusters centered on different metal ions were recorded in both positive and negative modes. Compared with the results obtained by traditional electrospray ionization mass spectrometry (ESI-MS) under the same conditions, more clusters with high orders, such as [6MU7+Na](+), [6MU15+2NH4](2+), [6MU10+Na](+), [T7+Na](+), and [T15+2NH4](2+) were detected by V-EASI-MS, which demonstrated the soft ionization ability of V-EASI for studying the non-covalent interaction in a self-assembly process. Furthermore, with the injection of K(+) into the system by a syringe pump, the real-time monitoring of the formation of nucleobase clusters was achieved by the direct extraction of samples from the system under the Venturi effect. Therefore, the effect of cations on the formation of clusters during self-assembly of nucleobases was demonstrated, which was in accordance with previous reports. Free of high voltage, heating or radiation during the ionization, this technique is very soft and suitable for obtaining real-time information on the self-assembly system, which also makes it quite convenient for extracting samples from the reaction system. This "easy and soft" ionization technique has provided a potential pathway for monitoring and controlling the self-assembly processes. Copyright © 2014 Elsevier B.V. All rights reserved.
Metrics and methods for characterizing dairy farm intensification using farm survey data.

PubMed

Gonzalez-Mejia, Alejandra; Styles, David; Wilson, Paul; Gibbons, James

2018-01-01

Evaluation of agricultural intensification requires comprehensive analysis of trends in farm performance across physical and socio-economic aspects, which may diverge across farm types. Typical reporting of economic indicators at sectorial or the "average farm" level does not represent farm diversity and provides limited insight into the sustainability of specific intensification pathways. Using farm business data from a total of 7281 farm survey observations of English and Welsh dairy farms over a 14-year period we calculate a time series of 16 key performance indicators (KPIs) pertinent to farm structure, environmental and socio-economic aspects of sustainability. We then apply principal component analysis and model-based clustering analysis to identify statistically the number of distinct dairy farm typologies for each year of study, and link these clusters through time using multidimensional scaling. Between 2001 and 2014, dairy farms have largely consolidated and specialized into two distinct clusters: more extensive farms relying predominantly on grass, with lower milk yields but higher labour intensity, and more intensive farms producing more milk per cow with more concentrate and more maize, but lower labour intensity. There is some indication that these clusters are converging as the extensive cluster is intensifying slightly faster than the intensive cluster, in terms of milk yield per cow and use of concentrate feed. In 2014, annual milk yields were 6,835 and 7,500 l/cow for extensive and intensive farm types, respectively, whilst annual concentrate feed use was 1.3 and 1.5 tonnes per cow. For several KPIs such as milk yield the mean trend across all farms differed substantially from the means of the extensive and intensive typologies. The indicators and analysis methodology developed allow identification of distinct farm types and industry trends using readily available survey data. 
The identified groups allow the accurate evaluation of the consequences of the reduction in dairy farm numbers and intensification at national and international scales.
Metrics and methods for characterizing dairy farm intensification using farm survey data

PubMed Central

Gonzalez-Mejia, Alejandra; Styles, David; Wilson, Paul

2018-01-01

Evaluation of agricultural intensification requires comprehensive analysis of trends in farm performance across physical and socio-economic aspects, which may diverge across farm types. Typical reporting of economic indicators at sectorial or the “average farm” level does not represent farm diversity and provides limited insight into the sustainability of specific intensification pathways. Using farm business data from a total of 7281 farm survey observations of English and Welsh dairy farms over a 14-year period, we calculate a time series of 16 key performance indicators (KPIs) pertinent to farm structure, environmental and socio-economic aspects of sustainability. We then apply principal component analysis and model-based clustering analysis to identify statistically the number of distinct dairy farm typologies for each year of study, and link these clusters through time using multidimensional scaling. Between 2001 and 2014, dairy farms have largely consolidated and specialized into two distinct clusters: more extensive farms relying predominantly on grass, with lower milk yields but higher labour intensity, and more intensive farms producing more milk per cow with more concentrate and more maize, but lower labour intensity. There is some indication that these clusters are converging as the extensive cluster is intensifying slightly faster than the intensive cluster, in terms of milk yield per cow and use of concentrate feed. In 2014, annual milk yields were 6,835 and 7,500 l/cow for extensive and intensive farm types, respectively, whilst annual concentrate feed use was 1.3 and 1.5 tonnes per cow. For several KPIs such as milk yield, the mean trend across all farms differed substantially from the means of the extensive and intensive typologies. The indicators and analysis methodology developed allow identification of distinct farm types and industry trends using readily available survey data.
The identified groups allow the accurate evaluation of the consequences of the reduction in dairy farm numbers and intensification at national and international scales. PMID:29742166
Designing and evaluating health systems level hypertension control interventions for African-Americans: lessons from a pooled analysis of three cluster randomized trials.

PubMed

Pavlik, Valory N; Chan, Wenyaw; Hyman, David J; Feldman, Penny; Ogedegbe, Gbenga; Schwartz, Joseph E; McDonald, Margaret; Einhorn, Paula; Tobin, Jonathan N

2015-01-01

African-Americans (AAs) have a high prevalence of hypertension and their blood pressure (BP) control on treatment still lags behind other groups. In 2004, NHLBI funded five projects that aimed to evaluate clinically feasible interventions to effect changes in medical care delivery leading to an increased proportion of AA patients with controlled BP. Three of the groups performed a pooled analysis of trial results to determine: 1) the magnitude of the combined intervention effect; and 2) how the pooled results could inform the methodology for future health-system level BP interventions. Using a cluster randomized design, the trials enrolled AAs with uncontrolled hypertension to test interventions targeting a combination of patient and clinician behaviors. The 12-month Systolic BP (SBP) and Diastolic BP (DBP) effects of intervention or control cluster assignment were assessed using mixed effects longitudinal regression modeling. A total of 2,015 patients representing 352 clusters participated across the three trials. Pooled BP slopes followed a quadratic pattern, with an initial decline, followed by a rise toward baseline, and did not differ significantly between intervention and control clusters (SBP linear coefficient = -2.60 ± 0.21 mmHg per month, p<0.001; quadratic coefficient = 0.167 ± 0.02 mmHg/month, p<0.001; group by time interaction: group x linear time coefficient = 0.145 ± 0.293, p=0.622; group x quadratic time coefficient = -0.017 ± 0.026, p=0.525). Results were similar for DBP. The individual sites did not have significant intervention effects when analyzed separately. Investigators planning behavioral trials to improve BP control in health systems serving AAs should plan for small effect sizes and employ a "run-in" period in which BP can be expected to improve in both experimental and control clusters.
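As a rough illustration of the quadratic pattern reported in this abstract, the sketch below plugs the pooled SBP time coefficients into a bare parabola. It assumes time is measured in months, ignores the intercept, random effects, and group terms of the actual mixed model, and uses a hypothetical helper name (`sbp_change`); it is not the authors' analysis.

```python
# Pooled SBP time coefficients quoted in the abstract.
LINEAR = -2.60     # mmHg per month
QUADRATIC = 0.167  # assumed here to act on months squared

def sbp_change(month: float) -> float:
    """Change in SBP from baseline implied by the two time terms alone."""
    return LINEAR * month + QUADRATIC * month ** 2

# Vertex of the parabola: the month at which the modeled decline bottoms out.
nadir_month = -LINEAR / (2 * QUADRATIC)

if __name__ == "__main__":
    print(f"nadir at month {nadir_month:.1f}: {sbp_change(nadir_month):.1f} mmHg")
    print(f"change at month 12: {sbp_change(12):.1f} mmHg")
```

Under these assumptions the curve reaches its lowest point a little before month 8 and has partly returned toward baseline by month 12, which matches the "initial decline, followed by a rise toward baseline" described in the abstract.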
Clustering of diet, physical activity and sedentary behaviour among Australian children: cross-sectional and longitudinal associations with overweight and obesity.

PubMed

Leech, R M; McNaughton, S A; Timperio, A

2015-07-01

Evidence suggests diet, physical activity (PA) and sedentary behaviour cluster together in children, but research supporting an association with overweight/obesity is equivocal. Furthermore, the stability of clusters over time is unknown. The aim of this study was to examine the clustering of diet, PA and sedentary behaviour in Australian children and cross-sectional and longitudinal associations with overweight/obesity. Stability of obesity-related clusters over 3 years was also examined. Data were drawn from the baseline (T1: 2002/2003) and follow-up waves (T2: 2005/2006) of the Health Eating and Play Study. Parents of Australian children aged 5-6 (n=87) and 10-12 years (n=123) completed questionnaires. Children wore accelerometers and height and weight were measured. Obesity-related clusters were determined using K-medians cluster analysis. Multivariate regression models assessed cross-sectional and longitudinal associations between cluster membership, and body mass index (BMI) Z-score and weight status. Kappa statistics assessed cluster stability over time. Three clusters, labelled 'most healthy', 'energy-dense (ED) consumers who watch TV' and 'high sedentary behaviour/low moderate-to-vigorous PA' were identified at baseline and at follow-up. No cross-sectional associations were found between cluster membership, and BMI Z-score or weight status at baseline. Longitudinally, children in the 'ED consumers who watch TV' cluster had a higher odds of being overweight/obese at follow-up (odds ratio=2.8; 95% confidence interval: 1.1, 6.9; P<0.05). Tracking of cluster membership was fair to moderate in younger (K=0.24; P=0.0001) and older children (K=0.46; P<0.0001). This study identified an unhealthy cluster of TV viewing with ED food/drink consumption, which predicted overweight/obesity in a small longitudinal sample of Australian children. Cluster stability was fair to moderate over 3 years and is a novel finding. 
Prospective research in larger samples is needed to examine how obesity-related clusters track over time and influence the development of overweight and obesity.
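The kappa statistic used above to assess cluster stability can be sketched as follows. This is a minimal, unweighted Cohen's kappa in plain Python; the function name `cohen_kappa` is hypothetical, and the study's own software and any weighting scheme are not stated in the abstract.

```python
from collections import Counter

def cohen_kappa(labels_t1, labels_t2):
    """Unweighted Cohen's kappa for two equally long sequences of cluster labels."""
    if len(labels_t1) != len(labels_t2) or not labels_t1:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_t1)
    # Observed agreement: share of children in the same cluster at both waves.
    observed = sum(a == b for a, b in zip(labels_t1, labels_t2)) / n
    # Chance agreement from the marginal cluster frequencies at each wave.
    freq1, freq2 = Counter(labels_t1), Counter(labels_t2)
    expected = sum(freq1[k] * freq2[k] for k in freq1) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)
```

Called with one label per child at baseline and one at follow-up, the function returns 1.0 for identical memberships and values near 0 when agreement is no better than chance.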
Voronoi distance based prospective space-time scans for point data sets: a dengue fever cluster analysis in a southeast Brazilian town

PubMed Central

2011-01-01

Background The Prospective Space-Time scan statistic (PST) is widely used for the evaluation of space-time clusters of point event data. Usually a window of cylindrical shape is employed, with a circular or elliptical base in the space domain. Recently, the concept of Minimum Spanning Tree (MST) was applied to specify the set of potential clusters, through the Density-Equalizing Euclidean MST (DEEMST) method, for the detection of arbitrarily shaped clusters. The original map is cartogram transformed, such that the control points are spread uniformly. That method is quite effective, but the cartogram construction is computationally expensive and complicated. Results A fast method for the detection and inference of point data set space-time disease clusters is presented, the Voronoi Based Scan (VBScan). A Voronoi diagram is built for points representing population individuals (cases and controls). The number of Voronoi cells boundaries intercepted by the line segment joining two cases points defines the Voronoi distance between those points. That distance is used to approximate the density of the heterogeneous population and build the Voronoi distance MST linking the cases. The successive removal of edges from the Voronoi distance MST generates sub-trees which are the potential space-time clusters. Finally, those clusters are evaluated through the scan statistic. Monte Carlo replications of the original data are used to evaluate the significance of the clusters. An application for dengue fever in a small Brazilian city is presented. Conclusions The ability to promptly detect space-time clusters of disease outbreaks, when the number of individuals is large, was shown to be feasible, due to the reduced computational load of VBScan. Instead of changing the map, VBScan modifies the metric used to define the distance between cases, without requiring the cartogram construction. 
Numerical simulations showed that VBScan has higher power of detection, sensitivity and positive predictive value than the Elliptic PST. Furthermore, as VBScan also incorporates topological information from the point neighborhood structure, in addition to the usual geometric information, it is more robust than purely geometric methods such as the elliptic scan. Those advantages were illustrated in a real setting for dengue fever space-time clusters. PMID:21513556
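The Voronoi distance defined in this abstract (the number of Voronoi cell boundaries crossed by the segment joining two case points) can be approximated without building the diagram explicitly. The sketch below is an assumption-laden stand-in, not the VBScan implementation: it samples the segment and counts how often the nearest population point changes. The names `nearest_site` and `voronoi_distance` and the `steps` parameter are hypothetical.

```python
def nearest_site(point, sites):
    """Index of the site closest to `point` (ties go to the lower index)."""
    px, py = point
    return min(
        range(len(sites)),
        key=lambda i: (sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2,
    )

def voronoi_distance(a, b, sites, steps=1000):
    """Approximate count of Voronoi cell boundaries crossed by segment a-b.

    Each change of nearest site along the segment corresponds to crossing
    one cell boundary; very thin cells can be missed if `steps` is small.
    """
    crossings = 0
    previous = nearest_site(a, sites)
    for k in range(1, steps + 1):
        t = k / steps
        point = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        current = nearest_site(point, sites)
        if current != previous:
            crossings += 1
            previous = current
    return crossings
```

Here `sites` would hold every individual (cases and controls), while `a` and `b` are two case points. In a densely populated area a segment of a given length crosses more cells, which is how this metric adjusts for heterogeneous population density.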
Clustering stock market companies via chaotic map synchronization

NASA Astrophysics Data System (ADS)

Basalto, N.; Bellotti, R.; De Carlo, F.; Facchi, P.; Pascazio, S.

2005-01-01

A pairwise clustering approach is applied to the analysis of the Dow Jones index companies, in order to identify similar temporal behavior of the traded stock prices. To this end, the chaotic map clustering algorithm is used, where a map is associated to each company and the correlation coefficients of the financial time series to the coupling strengths between maps. The simulation of a chaotic map dynamics gives rise to a natural partition of the data, as companies belonging to the same industrial branch are often grouped together. The identification of clusters of companies of a given stock market index can be exploited in the portfolio optimization strategies.
The method of approximate cluster analysis and the three-dimensional diagram of optical characteristics of the lunar surface

NASA Astrophysics Data System (ADS)

Evsyukov, N. N.

1984-12-01

An approximate algorithm for the isolation of multidimensional clusters is developed and applied in the construction of a three-dimensional diagram of the optical characteristics of the lunar surface. The method is somewhat analogous to that of Koontz and Fukunaga (1972) and involves isolating two-dimensional clusters, adding a new characteristic, and linearizing, a cycle which is repeated a limited number of times. The lunar-surface parameters analyzed are the 620-nm albedo, the 620/380-nm color index, and the 950/620-nm index. The results are presented graphically; the reliability of the cluster-isolation process is discussed; and some correspondences between known lunar morphology and the cluster maps are indicated.
Effects of temperature and pressure on the nucleation and growth of silver clusters from supersaturated vapor: A molecular dynamics analysis

NASA Astrophysics Data System (ADS)

Wang, Qin; Xie, Hui; Chen, Yongshi; Liu, Chao

2017-04-01

The nucleation and growth of silver nanoparticles in a supersaturated system are investigated by molecular dynamics simulation at different temperatures and pressures. The variation of the number of atoms in the biggest cluster and of the average cluster size in the system versus time is estimated to reveal the relationship between nucleation and cluster growth. The nucleation rates in different situations are calculated with the threshold method. The effect of temperature and pressure on the nucleation rate is identified as obeying a linear function. Finally, the development of basal elements, such as monomers, dimers and trimers, reveals how the temperature and pressure affect the nucleation and growth of the silver cluster.
Density-based clustering: A 'landscape view' of multi-channel neural data for inference and dynamic complexity analysis.

PubMed

Baglietto, Gabriel; Gigante, Guido; Del Giudice, Paolo

2017-01-01

Two partially interwoven hot topics in the analysis and statistical modeling of neural data are the development of efficient and informative representations of the time series derived from multiple neural recordings, and the extraction of information about the connectivity structure of the underlying neural network from the recorded neural activities. In the present paper we show that state-space clustering can provide an easy and effective option for reducing the dimensionality of multiple neural time series, that it can improve inference of synaptic couplings from neural activities, and that it can also allow the construction of a compact representation of the multi-dimensional dynamics that easily lends itself to complexity measures. We apply a variant of the 'mean-shift' algorithm to perform state-space clustering, and validate it on a Hopfield network in the glassy phase, in which metastable states are largely uncorrelated from memories embedded in the synaptic matrix. In this context, we show that the neural states identified as clusters' centroids offer a parsimonious parametrization of the synaptic matrix, which allows a significant improvement in inferring the synaptic couplings from the neural activities. Moving to the more realistic case of a multi-modular spiking network, with spike-frequency adaptation inducing history-dependent effects, we propose a procedure inspired by Boltzmann learning, but extending its domain of application, to learn inter-module synaptic couplings so that the spiking network reproduces a prescribed pattern of spatial correlations; we then illustrate, in the spiking network, how clustering is effective in extracting relevant features of the network's state-space landscape.
Finally, we show that the knowledge of the cluster structure allows casting the multi-dimensional neural dynamics in the form of a symbolic dynamics of transitions between clusters; as an illustration of the potential of such reduction, we define and analyze a measure of complexity of the neural time series.
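For orientation, a textbook flat-kernel mean shift is sketched below. The abstract says only that "a variant" of mean shift was used, so this is a generic stand-in rather than the authors' algorithm; the function names and the `bandwidth` parameter are assumptions.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns one converged mode per input point."""
    modes = []
    for start in points:
        current = tuple(start)
        for _ in range(max_iter):
            # Points lying within `bandwidth` of the current estimate.
            window = [p for p in points if math.dist(p, current) <= bandwidth]
            shifted = tuple(sum(c) / len(window) for c in zip(*window))
            moved = math.dist(shifted, current)
            current = shifted
            if moved < tol:
                break
        modes.append(current)
    return modes

def label_modes(modes, bandwidth):
    """Merge modes closer than bandwidth / 2 and return (centres, labels)."""
    centres, labels = [], []
    for mode in modes:
        for idx, centre in enumerate(centres):
            if math.dist(mode, centre) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return centres, labels
```

Each input point would be one sampled network state (for example, a vector of firing rates), and the sequence of labels over time gives the kind of symbolic dynamics of transitions between clusters that the abstract describes.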

Diversity in Older Adults' Use of the Internet: Identifying Subgroups Through Latent Class Analysis.

PubMed

van Boekel, Leonieke C; Peek, Sebastiaan TM; Luijkx, Katrien G

2017-05-24

As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults' use of the Internet has merely focused on users versus nonusers and consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may implicate that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can be of influence on benefits the Internet can have for them. The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and determine whether diversity is related to social or health-related variables. We used data of a national representative Internet panel in the Netherlands. Panel members aged 65 years and older and who have access to and use the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported to spend time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Four clusters were distinguished. Cluster 1 was labeled as the "practical users" (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the "minimizers" (32.23%, n=457), reported lowest frequency on most Internet activities, are older (mean age 73 years), and spent the smallest time on the Internet. Cluster 3 was labeled as the "maximizers" (17.77%, n=252); these respondents used the Internet for various activities, spent most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the "social users," mainly used the Internet for social and leisure-related activities such as gaming and social network sites. 
The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance to look beyond use versus nonuse when studying older adults' Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth intervention to specific segments of the older population. ©Leonieke C van Boekel, Sebastiaan TM Peek, Katrien G Luijkx. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 24.05.2017.
Diversity in Older Adults’ Use of the Internet: Identifying Subgroups Through Latent Class Analysis

PubMed Central

van Boekel, Leonieke C; Peek, Sebastiaan TM; Luijkx, Katrien G

2017-01-01

Background As for all individuals, the Internet is important in the everyday life of older adults. Research on older adults’ use of the Internet has merely focused on users versus nonusers and consequences of Internet use and nonuse. Older adults are a heterogeneous group, which may implicate that their use of the Internet is diverse as well. Older adults can use the Internet for different activities, and this usage can be of influence on benefits the Internet can have for them. Objective The aim of this paper was to describe the diversity or heterogeneity in the activities for which older adults use the Internet and determine whether diversity is related to social or health-related variables. Methods We used data of a national representative Internet panel in the Netherlands. Panel members aged 65 years and older and who have access to and use the Internet were selected (N=1418). We conducted a latent class analysis based on the Internet activities that panel members reported to spend time on. Second, we described the identified clusters with descriptive statistics and compared the clusters using analysis of variance (ANOVA) and chi-square tests. Results Four clusters were distinguished. Cluster 1 was labeled as the “practical users” (36.88%, n=523). These respondents mainly used the Internet for practical and financial purposes such as searching for information, comparing products, and banking. Respondents in Cluster 2, the “minimizers” (32.23%, n=457), reported lowest frequency on most Internet activities, are older (mean age 73 years), and spent the smallest time on the Internet. Cluster 3 was labeled as the “maximizers” (17.77%, n=252); these respondents used the Internet for various activities, spent most time on the Internet, and were relatively younger (mean age below 70 years). Respondents in Cluster 4, the “social users,” mainly used the Internet for social and leisure-related activities such as gaming and social network sites. 
The identified clusters significantly differed in age (P<.001, ω²=0.07), time spent on the Internet (P<.001, ω²=0.12), and frequency of downloading apps (P<.001, ω²=0.14), with medium to large effect sizes. Social and health-related variables were significantly different between the clusters, except social and emotional loneliness. However, effect sizes were small. The minimizers scored significantly lower on psychological well-being, instrumental activities of daily living (iADL), and experienced health compared with the practical users and maximizers. Conclusions Older adults are a diverse group in terms of their activities on the Internet. This underlines the importance to look beyond use versus nonuse when studying older adults’ Internet use. The clusters we have identified in this study can help tailor the development and deployment of eHealth intervention to specific segments of the older population. PMID:28539302
Gas and galaxies in filaments between clusters of galaxies. The study of A399-A401

NASA Astrophysics Data System (ADS)

Bonjean, V.; Aghanim, N.; Salomé, P.; Douspis, M.; Beelen, A.

2018-01-01

We have performed a multi-wavelength analysis of two galaxy cluster systems selected with the thermal Sunyaev-Zel'dovich (tSZ) effect and composed of cluster pairs and an inter-cluster filament. We have focused on one pair of particular interest: A399-A401 at redshift z ~ 0.073, separated by 3 Mpc. We have also performed the first analysis of one lower-significance newly associated pair: A21-PSZ2 G114.09-34.34 at z ~ 0.094, separated by 4.2 Mpc. We have characterised the intra-cluster gas using the tSZ signal from Planck and, when possible, the galaxy optical and infrared (IR) properties based on two photometric redshift catalogues: 2MPZ and WISExSCOS. From the tSZ data, we measured the gas pressure in the clusters and in the inter-cluster filaments. In the case of A399-A401, the results are in perfect agreement with previous studies and, using the temperature measured from the X-rays, we further estimate the gas density in the filament and find n0 = (4.3 ± 0.7) × 10⁻⁴ cm⁻³. The optical and IR colour-colour and colour-magnitude analyses of the galaxies selected in the cluster system, together with their star formation rate, show no segregation between galaxy populations, both in the clusters and in the filament of A399-A401. Galaxies are all passive, early type, and red and dead. The gas and galaxy properties of this system suggest that the whole system formed at the same time and corresponds to a pre-merger, with a cosmic filament gas heated by the collapse. For the other cluster system, the tSZ analysis was performed and the pressure in the clusters and in the inter-cluster filament was constrained. However, the limited or nonexistent optical and IR data prevent us from concluding on the presence of an actual cosmic filament or from proposing a scenario.
Multi-scale clustering of functional data with application to hydraulic gradients in wetlands

USGS Publications Warehouse

Greenwood, Mark C.; Sojda, Richard S.; Sharp, Julia L.; Peck, Rory G.; Rosenberry, Donald O.

2011-01-01

A new set of methods is developed to perform cluster analysis of functions, motivated by a data set consisting of hydraulic gradients at several locations distributed across a wetland complex. The methods build on previous work on clustering of functions, such as Tarpey and Kinateder (2003) and Hitchcock et al. (2007), but explore functions generated from an additive model decomposition (Wood, 2006) of the original time series. Our decomposition targets two aspects of the series, using an adaptive smoother for the trend and circular spline for the diurnal variation in the series. Different measures for comparing locations are discussed, including a method for efficiently clustering time series that are of different lengths using a functional data approach. The complicated nature of these wetlands is highlighted by the shifting group memberships depending on which scale of variation and year of the study are considered.
VAX Cluster upgrade: Report of a CPC task force

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hanson, J.; Berry, H.; Kessler, P.

The CSCF VAX cluster provides interactive computing for 100 users during prime time, plus a considerable amount of daytime and overnight batch processing. While this cluster represents less than 10% of the VAX computing power at BNL (6 MIPS out of 70), it has served as an important center for this larger network, supporting special hardware and software too expensive to maintain on every machine. In addition, it is the only unrestricted facility available to VAX/VMS users (other machines are typically dedicated to special projects). This committee's analysis shows that the CPUs on the CSCF cluster are currently badly oversaturated, frequently giving extremely poor interactive response. Short batch jobs (a necessary part of interactive work) typically take 3 to 4 times as long to execute as they would on an idle machine. There is also an immediate need for more scratch disk space and user permanent file space.
Multivariate Analysis of Remains of Molluscan Foods Consumed by Latest Pleistocene and Holocene Humans in Nerja Cave, Málaga, Spain

NASA Astrophysics Data System (ADS)

Serrano, Francisco; Guerra-Merchán, Antonio; Lozano-Francisco, Carmen; Vera-Peláez, José Luis

1997-09-01

Nerja Cave is a karstic cavity used by humans from Late Paleolithic to post-Chalcolithic times. Remains of molluscan foods in the uppermost Pleistocene and Holocene sediments were studied with cluster analysis and principal components analysis, in both Q and R modes. The results from cluster analysis distinguished interval groups mainly in accordance with chronology and distinguished assemblages of species mainly according to habitat. Significant changes in the shellfish diet through time were revealed. In the Late Magdalenian, most molluscs consumed consisted of pulmonate gastropods and species from sandy sea bottoms. The Epipaleolithic diet was more varied and included species from rocky shorelines. From the Neolithic onward most molluscs consumed were from rocky shorelines. From the principal components analysis in Q mode, the first factor reflected mainly changes in the predominant capture environment, probably because of major paleogeographic changes. The second factor may reflect selective capture along rocky coastlines during certain times. The third factor correlated well with the sea-surface temperature curve in the western Mediterranean (Alboran Sea) during the late Quaternary.
Quantification and clustering of phenotypic screening data using time-series analysis for chemotherapy of schistosomiasis.

PubMed

Lee, Hyokyeong; Moody-Davis, Asher; Saha, Utsab; Suzuki, Brian M; Asarnow, Daniel; Chen, Steven; Arkin, Michelle; Caffrey, Conor R; Singh, Rahul

2012-01-01

Neglected tropical diseases, especially those caused by helminths, constitute some of the most common infections of the world's poorest people. Development of techniques for automated, high-throughput drug screening against these diseases, especially in whole-organism settings, constitutes one of the great challenges of modern drug discovery. We present a method for enabling high-throughput phenotypic drug screening against diseases caused by helminths with a focus on schistosomiasis. The proposed method allows for a quantitative analysis of the systemic impact of a drug molecule on the pathogen as exhibited by the complex continuum of its phenotypic responses. This method consists of two key parts: first, biological image analysis is employed to automatically monitor and quantify shape-, appearance-, and motion-based phenotypes of the parasites. Next, we represent these phenotypes as time-series and show how to compare, cluster, and quantitatively reason about them using techniques of time-series analysis. We present results on a number of algorithmic issues pertinent to the time-series representation of phenotypes. These include results on appropriate representation of phenotypic time-series, analysis of different time-series similarity measures for comparing phenotypic responses over time, and techniques for clustering such responses by similarity. Finally, we show how these algorithmic techniques can be used for quantifying the complex continuum of phenotypic responses of parasites. An important corollary is the ability of our method to recognize and rigorously group parasites based on the variability of their phenotypic response to different drugs. The methods and results presented in this paper enable automatic and quantitative scoring of high-throughput phenotypic screens focused on helmintic diseases. Furthermore, these methods allow us to analyze and stratify parasites based on their phenotypic response to drugs. 
Together, these advancements represent a significant breakthrough for the process of drug discovery against schistosomiasis in particular and can be extended to other helmintic diseases which together afflict a large part of humankind.
Quantification and clustering of phenotypic screening data using time-series analysis for chemotherapy of schistosomiasis

PubMed Central

2012-01-01

Background Neglected tropical diseases, especially those caused by helminths, constitute some of the most common infections of the world's poorest people. Development of techniques for automated, high-throughput drug screening against these diseases, especially in whole-organism settings, constitutes one of the great challenges of modern drug discovery. Method We present a method for enabling high-throughput phenotypic drug screening against diseases caused by helminths with a focus on schistosomiasis. The proposed method allows for a quantitative analysis of the systemic impact of a drug molecule on the pathogen as exhibited by the complex continuum of its phenotypic responses. This method consists of two key parts: first, biological image analysis is employed to automatically monitor and quantify shape-, appearance-, and motion-based phenotypes of the parasites. Next, we represent these phenotypes as time-series and show how to compare, cluster, and quantitatively reason about them using techniques of time-series analysis. Results We present results on a number of algorithmic issues pertinent to the time-series representation of phenotypes. These include results on appropriate representation of phenotypic time-series, analysis of different time-series similarity measures for comparing phenotypic responses over time, and techniques for clustering such responses by similarity. Finally, we show how these algorithmic techniques can be used for quantifying the complex continuum of phenotypic responses of parasites. An important corollary is the ability of our method to recognize and rigorously group parasites based on the variability of their phenotypic response to different drugs. Conclusions The methods and results presented in this paper enable automatic and quantitative scoring of high-throughput phenotypic screens focused on helmintic diseases. Furthermore, these methods allow us to analyze and stratify parasites based on their phenotypic response to drugs. 
Together, these advancements represent a significant breakthrough for the process of drug discovery against schistosomiasis in particular and can be extended to other helmintic diseases which together afflict a large part of humankind. PMID:22369037
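The abstract does not name the time-series similarity measures that were compared. As one common example of such a measure, and purely as an assumption for illustration, the sketch below implements classic dynamic time warping (DTW) in plain Python. DTW tolerates phenotypic responses that follow the same course at different speeds.

```python
def dtw_distance(series_a, series_b):
    """Classic dynamic time warping with an absolute-difference step cost."""
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j samples.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat a sample of series_b
                cost[i][j - 1],      # repeat a sample of series_a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]
```

A matrix of pairwise distances computed this way can be passed to any standard hierarchical or partitioning clustering routine to group responses by similarity.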
Clustering of longitudinal data by using an extended baseline: A new method for treatment efficacy clustering in longitudinal data.

PubMed

Schramm, Catherine; Vial, Céline; Bachoud-Lévi, Anne-Catherine; Katsahian, Sandrine

2018-01-01

Heterogeneity in treatment efficacy is a major concern in clinical trials. Clustering may help to identify the treatment responders and the non-responders. In the context of longitudinal cluster analyses, sample size and variability of the times of measurements are the main issues with the current methods. Here, we propose a new two-step method for the Clustering of Longitudinal data by using an Extended Baseline. The first step relies on a piecewise linear mixed model for repeated measurements with a treatment-time interaction. The second step clusters the random predictions and considers several parametric (model-based) and non-parametric (partitioning, ascendant hierarchical clustering) algorithms. A simulation study compares all options of the clustering of longitudinal data by using an extended baseline method with the latent-class mixed model. The clustering of longitudinal data by using an extended baseline method with the two model-based algorithms was the more robust model. The clustering of longitudinal data by using an extended baseline method with all the non-parametric algorithms failed when there were unequal variances of treatment effect between clusters or when the subgroups had unbalanced sample sizes. The latent-class mixed model failed when the between-patients slope variability is high. Two real data sets on neurodegenerative disease and on obesity illustrate the clustering of longitudinal data by using an extended baseline method and show how clustering may help to identify the marker(s) of the treatment response. The application of the clustering of longitudinal data by using an extended baseline method in exploratory analysis as the first stage before setting up stratified designs can provide a better estimation of treatment effect in future clinical trials.
Improved Correction of Atmospheric Pressure Data Obtained by Smartphones through Machine Learning

PubMed Central

Kim, Yong-Hyuk; Ha, Ji-Hun; Kim, Na-Young; Im, Hyo-Hyuc; Sim, Sangjin; Choi, Reno K. Y.

2016-01-01

A correction method using machine learning aims to improve the conventional linear regression (LR) based method for correction of atmospheric pressure data obtained by smartphones. The method proposed in this study conducts clustering and regression analysis with time domain classification. Data obtained in Gyeonggi-do, one of the most populous provinces in South Korea surrounding Seoul with the size of 10,000 km², from July 2014 through December 2014, using smartphones were classified with respect to time of day (daytime or nighttime) as well as day of the week (weekday or weekend) and the user's mobility, prior to the expectation-maximization (EM) clustering. Subsequently, the results were analyzed for comparison by applying machine learning methods such as multilayer perceptron (MLP) and support vector regression (SVR). The results showed a mean absolute error (MAE) 26% lower on average when regression analysis was performed through EM clustering compared to that obtained without EM clustering. For machine learning methods, the MAE for SVR was around 31% lower than for LR and about 19% lower than for MLP. It is concluded that pressure data from smartphones are as good as the ones from the national automatic weather station (AWS) network. PMID:27524999
Particulate matter time-series and Köppen-Geiger climate classes in North America and Europe

NASA Astrophysics Data System (ADS)

Pražnikar, Jure

2017-02-01

Four years of time-series data on the particulate matter (PM) concentrations from 801 monitoring stations located in Europe and 234 stations in North America were analyzed. Using k-means clustering with distance correlation as a measure for similarity, 5 distinct PM clusters in Europe and 9 clusters across the United States of America (USA) were found. This study shows that meteorology has an important role in controlling PM concentrations, as comparison between Köppen-Geiger climate zones and identified PM clusters revealed very good spatial overlapping. Moreover, the Köppen-Geiger boundaries in Europe show a high similarity to the boundaries as defined by PM clusters. The western USA is much more diverse regarding climate zones; this characteristic was confirmed by cluster analysis, as 6 clusters were identified in the west, and only 3 were identified on the eastern side of the USA. The lowest similarity between PM time-series in Europe was observed between the Iberian Peninsula and the north Europe clusters. These two regions also show considerable differences, as the cold semi-arid climate has a long and hot summer period, while the cool continental climate has a short summertime and long and cold winters. Additionally, intra-continental examination of European clusters showed meteorologically driven phenomena in autumn 2011 encompassing a large European region from Bulgaria in the south, Germany in central Europe and Finland in the north with high PM concentrations in November and a decline in December 2011. Inter-continental comparison between Europe and the USA clusters revealed a remarkable difference between the PM time-series located in humid continental zone. It seems that because of higher shortwave downwelling radiation (≈210 W m⁻²) over the USA's continental zone, and consequently more intense production of secondary aerosols, a summer peak in PM concentration was observed. On the other hand, Europe's humid continental climate region experiences lower solar radiation (≈180 W m⁻²); consequently, the elevated summer-time PM concentrations were not detected.
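The abstract above names distance correlation as the similarity measure between station time-series but does not say how it was computed or fed into k-means. The following is a minimal sketch of the standard sample distance correlation for two equal-length series, in plain Python. The function names and the `1 - dCor` dissimilarity are illustrative assumptions, not the author's implementation.

```python
import math


def _centred_distances(values):
    """Pairwise |a - b| matrix with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so row means double as column means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length series (0 to 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    a, b = _centred_distances(x), _centred_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / n**2

    dcov2 = mean_product(a, b)   # squared sample distance covariance
    dvar_x = mean_product(a, a)  # squared sample distance variance of x
    dvar_y = mean_product(b, b)  # squared sample distance variance of y
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # at least one series is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def dissimilarity(x, y):
    """Hypothetical clustering distance: 0 for fully dependent series."""
    return 1.0 - distance_correlation(x, y)
```

Unlike Pearson correlation, this measure is non-zero for non-linear dependence (for example a series and its square), which may be one reason to prefer it for comparing PM time-series; that motivation is a guess and is not stated in the abstract.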
[Visual field progression in glaucoma: cluster analysis].

PubMed

Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

2012-11-01

Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening of the analysis of pointwise linear regression. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91 years) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and sLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87 dB); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis. However, for best results, it is preferable to compare the analyses of several tests in combination with a morphologic exam. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
The Gaia-ESO Survey: the present-day radial metallicity distribution of the Galactic disc probed by pre-main-sequence clusters

NASA Astrophysics Data System (ADS)

Spina, L.; Randich, S.; Magrini, L.; Jeffries, R. D.; Friel, E. D.; Sacco, G. G.; Pancino, E.; Bonito, R.; Bravi, L.; Franciosini, E.; Klutsch, A.; Montes, D.; Gilmore, G.; Vallenari, A.; Bensby, T.; Bragaglia, A.; Flaccomio, E.; Koposov, S. E.; Korn, A. J.; Lanzafame, A. C.; Smiljanic, R.; Bayo, A.; Carraro, G.; Casey, A. R.; Costado, M. T.; Damiani, F.; Donati, P.; Frasca, A.; Hourihane, A.; Jofré, P.; Lewis, J.; Lind, K.; Monaco, L.; Morbidelli, L.; Prisinzano, L.; Sousa, S. G.; Worley, C. C.; Zaggia, S.

2017-05-01

Context. The radial metallicity distribution in the Galactic thin disc represents a crucial constraint for modelling disc formation and evolution. Open star clusters allow us to derive both the radial metallicity distribution and its evolution over time. Aims: In this paper we perform the first investigation of the present-day radial metallicity distribution based on [Fe/H] determinations in late type members of pre-main-sequence clusters. Because of their youth, these clusters are therefore essential for tracing the current interstellar medium metallicity. Methods: We used the products of the Gaia-ESO Survey analysis of 12 young regions (age < 100 Myr), covering Galactocentric distances from 6.67 to 8.70 kpc. For the first time, we derived the metal content of star forming regions farther than 500 pc from the Sun. Median metallicities were determined through samples of reliable cluster members. For ten clusters the membership analysis is discussed in the present paper, while for the other two clusters (i.e. Chamaeleon I and Gamma Velorum) we adopted the members identified in our previous works. Results: All the pre-main-sequence clusters considered in this paper have close-to-solar or slightly sub-solar metallicities. The radial metallicity distribution traced by these clusters is almost flat, with the innermost star forming regions having [Fe/H] values that are 0.10-0.15 dex lower than the majority of the older clusters located at similar Galactocentric radii. Conclusions: This homogeneous study of the present-day radial metallicity distribution in the Galactic thin disc favours models that predict a flattening of the radial gradient over time. On the other hand, the decrease of the average [Fe/H] at young ages is not easily explained by the models. Our results reveal a complex interplay of several processes (e.g. star formation activity, initial mass function, supernova yields, gas flows) that controlled the recent evolution of the Milky Way.
Based on observations made with the ESO/VLT, at Paranal Observatory, under program 188.B-3002 (The Gaia-ESO Public Spectroscopic Survey). Full Table 1 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/601/A70
Complex time series analysis of PM10 and PM2.5 for a coastal site using artificial neural network modelling and k-means clustering

NASA Astrophysics Data System (ADS)

Elangasinghe, M. A.; Singhal, N.; Dirks, K. N.; Salmond, J. A.; Samarasinghe, S.

2014-09-01

This paper uses artificial neural networks (ANN), combined with k-means clustering, to understand the complex time series of PM10 and PM2.5 concentrations at a coastal location of New Zealand based on data from a single site. Out of available meteorological parameters from the network (wind speed, wind direction, solar radiation, temperature, relative humidity), key factors governing the pattern of the time series concentrations were identified through input sensitivity analysis performed on the trained neural network model. The transport pathways of particulate matter under these key meteorological parameters were further analysed through bivariate concentration polar plots and k-means clustering techniques. The analysis shows that the external sources such as marine aerosols and local sources such as traffic and biomass burning contribute equally to the particulate matter concentrations at the study site. These results are in agreement with the results of receptor modelling by the Auckland Council based on Positive Matrix Factorization (PMF). Our findings also show that contrasting concentration-wind speed relationships exist between marine aerosols and local traffic sources resulting in very noisy and seemingly large random PM10 concentrations. The inclusion of cluster rankings as an input parameter to the ANN model showed a statistically significant (p < 0.005) improvement in the performance of the ANN time series model and also showed better performance in picking up high concentrations. For the presented case study, the correlation coefficient between observed and predicted concentrations improved from 0.77 to 0.79 for PM2.5 and from 0.63 to 0.69 for PM10 and reduced the root mean squared error (RMSE) from 5.00 to 4.74 for PM2.5 and from 6.77 to 6.34 for PM10. 
The techniques presented here enable the user to obtain an understanding of potential sources and their transport characteristics prior to the implementation of costly chemical analysis techniques or advanced air dispersion models.
From virtual clustering analysis to self-consistent clustering analysis: a mathematical study

NASA Astrophysics Data System (ADS)

Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam

2018-03-01

In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), as well as provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of the Saint-Venant's principle. In contrast, they were not obtained in the original SCA paper, and we discover these terms may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error, and reduces the numerical cost by avoiding an outer loop iteration for attaining the material property consistency in SCA, its efficiency is expected to be even higher than that of the recently proposed SCA algorithm.
Tweets clustering using latent semantic analysis

NASA Astrophysics Data System (ADS)

Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

2017-04-01

Social media are becoming overloaded with information due to the increasing number of information feeds. On Twitter, unlike other social media, users broadcast short messages called "tweets". In this study, we extract tweets related to MH370 over a certain period of time. We present an overview of our approach to tweet clustering, used to analyze users' responses to the MH370 tragedy. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, we find two types of tweets responding to the MH370 tragedy: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.
Spatial and temporal changes in household structure locations using high-resolution satellite imagery for population assessment: an analysis in southern Zambia, 2006-2011.

PubMed

Shields, Timothy; Pinchoff, Jessie; Lubinda, Jailos; Hamapumbu, Harry; Searle, Kelly; Kobayashi, Tamaki; Thuma, Philip E; Moss, William J; Curriero, Frank C

2016-05-31

Satellite imagery is increasingly available at high spatial resolution and can be used for various purposes in public health research and programme implementation. Comparing a census generated from two satellite images of the same region in rural southern Zambia obtained four and a half years apart identified patterns of household locations and change over time. The length of time that a satellite image-based census is accurate determines its utility. Households were enumerated manually from satellite images obtained in 2006 and 2011 of the same area. Spatial statistics were used to describe clustering, cluster detection, and spatial variation in the location of households. A total of 3821 household locations were enumerated in 2006 and 4256 in 2011, a net change of 435 houses (11.4% increase). Comparison of the images indicated that 971 (25.4%) structures were added and 536 (14.0%) removed. Further analysis suggested similar household clustering in the two images and no substantial difference in concentration of households across the study area. Cluster detection analysis identified a small area where significantly more household structures were removed than expected; however, the amount of change was of limited practical significance. These findings suggest that random sampling of households for study participation would not induce geographic bias if based on a 4.5-year-old image in this region. Application of spatial statistical methods provides insights into the population distribution changes between two time periods and can be helpful in assessing the accuracy of satellite imagery.
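The counts reported in the abstract above can be cross-checked with a few lines of arithmetic. This sketch assumes all percentages are taken relative to the 2006 household count, which is consistent with the figures quoted; the function name and return keys are illustrative only.

```python
def census_change(count_start, count_end, added, removed):
    """Summarise change between two image-based household counts.

    Percentages are expressed relative to the starting count (an assumption
    that matches the figures quoted in the abstract).
    """
    net = count_end - count_start
    if added - removed != net:
        raise ValueError("added minus removed must equal the net change")

    def pct(value):
        return round(100.0 * value / count_start, 1)

    return {
        "net": net,
        "net_pct": pct(net),
        "added_pct": pct(added),
        "removed_pct": pct(removed),
    }


# Counts taken from the abstract: 2006 and 2011 enumerations.
summary = census_change(3821, 4256, added=971, removed=536)
print(summary)
```

The check that structures added minus structures removed equals the net change (971 - 536 = 435) is what ties the three reported percentages together.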
Phylogenomic and MALDI-TOF MS Analysis of Streptococcus sinensis HKU4T Reveals a Distinct Phylogenetic Clade in the Genus Streptococcus

PubMed Central

Tse, Herman; Chen, Jonathan H.K.; Tang, Ying; Lau, Susanna K.P.; Woo, Patrick C.Y.

2014-01-01

Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remains inconclusive when single genes were used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the “sanguinis group.” As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the “mitis group.” On the basis of the findings, we propose a novel group, named “sinensis group,” to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. PMID:25331233
Phylogenomic and MALDI-TOF MS analysis of Streptococcus sinensis HKU4T reveals a distinct phylogenetic clade in the genus Streptococcus.

PubMed

Teng, Jade L L; Huang, Yi; Tse, Herman; Chen, Jonathan H K; Tang, Ying; Lau, Susanna K P; Woo, Patrick C Y

2014-10-20

Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remains inconclusive when single genes were used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the "sanguinis group." As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the "mitis group." On the basis of the findings, we propose a novel group, named "sinensis group," to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Intracluster age gradients in numerous young stellar clusters

NASA Astrophysics Data System (ADS)

Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.

2018-05-01

The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc-1. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - has a strong foundation with other observational studies and with the astrophysical models like the global hierarchical collapse model of Vázquez-Semadeni et al.

Recent increased identification and transmission of HIV-1 unique recombinant forms in Sweden.

PubMed

Neogi, Ujjwal; Siddik, Abu Bakar; Kalaghatgi, Prabhav; Gisslén, Magnus; Bratt, Göran; Marrone, Gaetano; Sönnerborg, Anders

2017-07-25

We previously described a temporal increase in non-B subtypes in Sweden, and we hypothesized that this increased viral heterogeneity may become a hotspot for the development of more complex and unique recombinant forms (URFs) if the epidemics converge. In the present study, we performed subtyping using four automated tools and phylogenetic analysis by RAxML of pol gene sequences (n = 5246) and HIV-1 near full-length genome (HIV-NFLG) sequences (n = 104). A CD4+ T-cell decline trajectory algorithm was used to estimate time of HIV infection. Transmission clusters were identified using the family-joining method. The analysis of HIV-NFLG and pol gene described 10.6% (11/104) and 2.6% (137/5246) of the strains as URFs, respectively. An increasing trend of URFs was observed in recent years by both approaches (p = 0.0082; p < 0.0001). Transmission cluster analysis using the pol gene of all URFs identified 14 clusters with two to eight sequences. Larger transmission clusters of URFs (BF1 and 01B) were observed among MSM, most of whom were sero-diagnosed recently. Understanding the increased appearance and transmission of URFs in recent years could have importance for public health interventions and the use of HIV-NFLG would provide better statistical support for such assessments.
Open-Source Sequence Clustering Methods Improve the State Of the Art.

PubMed

Kopylova, Evguenia; Navas-Molina, Jose A; Mercier, Céline; Xu, Zhenjiang Zech; Mahé, Frédéric; He, Yan; Zhou, Hong-Wei; Rognes, Torbjørn; Caporaso, J Gregory; Knight, Rob

2016-01-01

Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH's most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release. IMPORTANCE Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014, http://dx.doi.org/10.1186/s12915-014-0069-1).
The cosmological analysis of X-ray cluster surveys - I. A new method for interpreting number counts

NASA Astrophysics Data System (ADS)

Clerc, N.; Pierre, M.; Pacaud, F.; Sadibekova, T.

2012-07-01

We present a new method aimed at simplifying the cosmological analysis of X-ray cluster surveys. It is based on purely instrumental observable quantities considered in a two-dimensional X-ray colour-magnitude diagram (hardness ratio versus count rate). The basic principle is that even in rather shallow surveys, substantial information on cluster redshift and temperature is present in the raw X-ray data and can be statistically extracted; in parallel, such diagrams can be readily predicted from an ab initio cosmological modelling. We illustrate the methodology for the case of a 100-deg² XMM survey having a sensitivity of ~10⁻¹⁴ erg s⁻¹ cm⁻² and fit at the same time, the survey selection function, the cluster evolutionary scaling relations and the cosmology; our sole assumption - driven by the limited size of the sample considered in the case study - is that the local cluster scaling relations are known. We devote special attention to the realistic modelling of the count-rate measurement uncertainties and evaluate the potential of the method via a Fisher analysis. In the absence of individual cluster redshifts, the count rate and hardness ratio (CR-HR) method appears to be much more efficient than the traditional approach based on cluster counts (i.e. dn/dz, requiring redshifts). In the case where redshifts are available, our method performs similar to the traditional mass function (dn/dM/dz) for the purely cosmological parameters, but constrains better parameters defining the cluster scaling relations and their evolution. A further practical advantage of the CR-HR method is its simplicity: this fully top-down approach totally bypasses the tedious steps consisting in deriving cluster masses from X-ray temperature measurements.
The rates and time-delay distribution of multiply imaged supernovae behind lensing clusters

NASA Astrophysics Data System (ADS)

Li, Xue; Hjorth, Jens; Richard, Johan

2012-11-01

Time delays of gravitationally lensed sources can be used to constrain the mass model of a deflector and determine cosmological parameters. We here present an analysis of the time-delay distribution of multiply imaged sources behind 17 strong lensing galaxy clusters with well-calibrated mass models. We find that for time delays less than 1000 days, at z = 3.0, their logarithmic probability distribution functions are well represented by $P(\log \Delta t) = 5.3 \times 10^{-4}\, \Delta t^{\tilde{\beta}} / M_{250}^{2\tilde{\beta}}$, with $\tilde{\beta} = 0.77$, where $M_{250}$ is the projected cluster mass inside 250 kpc (in $10^{14}\,M_\odot$), and $\tilde{\beta}$ is the power-law slope of the distribution. The resultant probability distribution function enables us to estimate the time-delay distribution in a lensing cluster of known mass. For a cluster with $M_{250} = 2 \times 10^{14}\,M_\odot$, the fraction of time delays less than 1000 days is approximately 3%. Taking Abell 1689 as an example, its dark halo and brightest galaxies, with central velocity dispersions $\sigma \geq 500$ km s$^{-1}$, mainly produce large time delays, while galaxy-scale mass clumps are responsible for generating smaller time delays. We estimate the probability of observing multiple images of a supernova in the known images of Abell 1689. A two-component model of estimating the supernova rate is applied in this work. For a magnitude threshold of $m_{AB} = 26.5$, the yearly rate of Type Ia (core-collapse) supernovae with time delays less than 1000 days is 0.004 ± 0.002 (0.029 ± 0.001). If the magnitude threshold is lowered to $m_{AB} \sim 27.0$, the rate of core-collapse supernovae suitable for time delay observation is 0.044 ± 0.015 per year.
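The fitted distribution quoted in the abstract above can be evaluated directly. The sketch below reads the expression as P = 5.3e-4 × Δt^β / M250^(2β) with β = 0.77, and assumes Δt is in days and M250 is in units of 10^14 solar masses; the abstract does not spell out the units of Δt or how the distribution is integrated, so the function and its parameter names are illustrative only.

```python
def time_delay_probability(delta_t_days, m250, beta=0.77, norm=5.3e-4):
    """Evaluate the quoted power-law fit for the time-delay distribution.

    delta_t_days: time delay, assumed to be in days (valid below 1000 days).
    m250: projected mass inside 250 kpc, in units of 1e14 solar masses.
    """
    if delta_t_days <= 0 or m250 <= 0:
        raise ValueError("time delay and mass must be positive")
    return norm * delta_t_days**beta / m250 ** (2 * beta)


# Cluster with M250 = 2 x 10^14 solar masses, evaluated at 1000 days.
p = time_delay_probability(1000.0, 2.0)
print(f"P = {p:.3f}")
```

Under these assumptions the value at 1000 days comes out at a few per cent, the same order as the approximately 3% quoted for a cluster of that mass. The M250^(-2β) dependence also means that doubling the cluster mass lowers the value by a factor of about 2.9.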
Portfolio Decisions and Brain Reactions via the CEAD method.

PubMed

Majer, Piotr; Mohr, Peter N C; Heekeren, Hauke R; Härdle, Wolfgang K

2016-09-01

Decision making can be a complex process requiring the integration of several attributes of choice options. Understanding the neural processes underlying (uncertain) investment decisions is an important topic in neuroeconomics. We analyzed functional magnetic resonance imaging (fMRI) data from an investment decision study for stimulus-related effects. We propose a new technique for identifying activated brain regions: cluster, estimation, activation, and decision method. Our analysis is focused on clusters of voxels rather than voxel units. Thus, we achieve a higher signal-to-noise ratio within the unit tested and a smaller number of hypothesis tests compared with the often used General Linear Model (GLM). We propose to first conduct the brain parcellation by applying spatially constrained spectral clustering. The information within each cluster can then be extracted by the flexible dynamic semiparametric factor model (DSFM) dimension reduction technique and finally be tested for differences in activation between conditions. This sequence of Cluster, Estimation, Activation, and Decision admits a model-free analysis of the local fMRI signal. Applying a GLM on the DSFM-based time series resulted in a significant correlation between the risk of choice options and changes in fMRI signal in the anterior insula and dorsomedial prefrontal cortex. Additionally, individual differences in decision-related reactions within the DSFM time series predicted individual differences in risk attitudes as modeled with the framework of the mean-variance model.
Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

PubMed Central

Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

2017-01-01

We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Trajectories of acute low back pain: a latent class growth analysis.

PubMed

Downie, Aron S; Hancock, Mark J; Rzewuska, Magdalena; Williams, Christopher M; Lin, Chung-Wei Christine; Maher, Christopher G

2016-01-01

Characterising the clinical course of back pain by mean pain scores over time may not adequately reflect the complexity of the clinical course of acute low back pain. We analysed pain scores over 12 weeks for 1585 patients with acute low back pain presenting to primary care to identify distinct pain trajectory groups and baseline patient characteristics associated with membership of each cluster. This was a secondary analysis of the PACE trial that evaluated paracetamol for acute low back pain. Latent class growth analysis determined a 5 cluster model, which comprised 567 (35.8%) patients who recovered by week 2 (cluster 1, rapid pain recovery); 543 (34.3%) patients who recovered by week 12 (cluster 2, pain recovery by week 12); 222 (14.0%) patients whose pain reduced but did not recover (cluster 3, incomplete pain recovery); 167 (10.5%) patients whose pain initially decreased but then increased by week 12 (cluster 4, fluctuating pain); and 86 (5.4%) patients who experienced high-level pain for the whole 12 weeks (cluster 5, persistent high pain). Patients with longer pain duration were more likely to experience delayed recovery or nonrecovery. Belief in greater risk of persistence was associated with nonrecovery, but not delayed recovery. Higher pain intensity, longer duration, and workers' compensation were associated with persistent high pain, whereas older age and increased number of episodes were associated with fluctuating pain. Identification of discrete pain trajectory groups offers the potential to better manage acute low back pain.
Outbreaks of syphilis among men who have sex with men attending STI clinics between 2007 and 2015 in the Netherlands: a space-time clustering study.

PubMed

van Aar, F; den Daas, C; van der Sande, M A B; Soetens, L C; de Vries, H J C; van Benthem, B H B

2017-09-01

Infectious syphilis (syphilis) is diagnosed predominantly among men who have sex with men (MSM) in the Netherlands and is a strong indicator for sexual risk behaviour. Therefore, an increase in syphilis can be an early indicator of resurgence of other STIs, including HIV. National and worldwide outbreaks of syphilis, as well as potential changes in sexual networks, were reasons to explore syphilis trends and clusters in more depth. National STI/HIV surveillance data were used, containing epidemiological, behavioural and clinical data from STI clinics. We examined syphilis positivity rates stratified by HIV status and year. Additionally, we performed space-time cluster analysis on municipality level between 2007 and 2015, using SaTScan to evaluate whether or not there was a higher than expected syphilis incidence in a certain area and time period, using the maximum likelihood ratio test statistic. Among HIV-positive MSM, the syphilis positivity rate decreased between 2007 (12.3%) and 2011 (4.5%), followed by an increasing trend (2015: 8.0%). Among HIV-negative MSM, the positivity rate also decreased between 2007 (2.8%) and 2011 (1.4%) and started to increase from 2013 onwards (2015: 1.8%). In addition, we identified three geospatial clusters. The first cluster consisted of MSM sex workers in the South of the Netherlands (July 2009-September 2010, n=10, p<0.001). The second cluster consisted mostly of HIV-positive MSM (58.5%) (Amsterdam; July 2011-December 2015; n=1123, p<0.001), although the proportion of HIV-negative MSM increased over time. The third cluster was large in space (predominantly the city of Rotterdam; April-September 2015, n=72, p=0.014) and consisted mostly of HIV-negative MSM (62.5%). Using SaTScan analysis, we observed several not yet recognised outbreaks and a rapid resurgence of syphilis among known HIV-positive MSM first, but more recently, also among HIV-negative MSM.
The three identified clusters revealed locations, periods and specific characteristics of the involved MSM that could be used when developing targeted interventions. Published by the BMJ Publishing Group Limited.
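As a rough illustration of the likelihood ratio test statistic mentioned in the abstract above, the sketch below computes the log-likelihood ratio commonly used for a Poisson scan statistic, for a single candidate space-time window. The function name and the example counts are hypothetical. This is not SaTScan's own code, and it omits the search over windows and the Monte Carlo step used to obtain p-values.

```python
import math

def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate cluster under a Poisson model.

    cases_in     observed cases inside the candidate space-time window
    expected_in  expected cases inside the window under the null hypothesis
    total_cases  observed cases in the whole study region and period
    Expected counts are assumed to be scaled so that they sum to total_cases.
    """
    c, e, n = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0  # only windows with an excess of cases are of interest
    llr = c * math.log(c / e)
    if n > c:  # skip the outside term when every case falls in the window
        llr += (n - c) * math.log((n - c) / (n - e))
    return llr

# Hypothetical counts: 30 cases observed where 12 were expected, 400 overall.
print(round(poisson_scan_llr(30, 12.0, 400), 3))
```

In a full scan, the window with the largest value would be the most likely cluster, and its significance would be judged against values obtained from randomly permuted data.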
The X-ray cluster survey with eRosita: forecasts for cosmology, cluster physics and primordial non-Gaussianity

NASA Astrophysics Data System (ADS)

Pillepich, Annalisa; Porciani, Cristiano; Reiprich, Thomas H.

2012-05-01

Starting in late 2013, the eRosita telescope will survey the X-ray sky with unprecedented sensitivity. Assuming a detection limit of 50 photons in the (0.5-2.0) keV energy band with a typical exposure time of 1.6 ks, we predict that eRosita will detect ~9.3 × 10^4 clusters of galaxies more massive than 5 × 10^13 h^-1 M⊙, with the currently planned all-sky survey. Their median redshift will be z ≃ 0.35. We perform a Fisher-matrix analysis to forecast the constraining power of eRosita on the Λ cold dark matter (ΛCDM) cosmology and, simultaneously, on the X-ray scaling relations for galaxy clusters. Special attention is devoted to the possibility of detecting primordial non-Gaussianity. We consider two experimental probes: the number counts and the angular clustering of a photon-count limited sample of clusters. We discuss how the cluster sample should be split to optimize the analysis and we show that redshift information of the individual clusters is vital to break the strong degeneracies among the model parameters. For example, performing a 'tomographic' analysis based on photometric-redshift estimates and combining one- and two-point statistics will give marginal 1σ errors of Δσ8 ≃ 0.036 and ΔΩm ≃ 0.012 without priors, and improve the current estimates on the slope of the luminosity-mass relation by a factor of 3. Regarding primordial non-Gaussianity, eRosita clusters alone will give ΔfNL ≃ 9, 36 and 144 for the local, orthogonal and equilateral model, respectively. Measuring redshifts with spectroscopic accuracy would further tighten the constraints by nearly 40 per cent (barring fNL which displays smaller improvements). Finally, combining eRosita data with the analysis of temperature anisotropies in the cosmic microwave background by the Planck satellite should give sensational constraints on both the cosmology and the properties of the intracluster medium.
Choosing appropriate analysis methods for cluster randomised cross-over trials with a binary outcome.

PubMed

Morgan, Katy E; Forbes, Andrew B; Keogh, Ruth H; Jairath, Vipul; Kahan, Brennan C

2017-01-30

In cluster randomised cross-over (CRXO) trials, clusters receive multiple treatments in a randomised sequence over time. In such trials, there is usually correlation between patients in the same cluster. In addition, within a cluster, patients in the same period may be more similar to each other than to patients in other periods. We demonstrate that it is necessary to account for these correlations in the analysis to obtain correct Type I error rates. We then use simulation to compare different methods of analysing a binary outcome from a two-period CRXO design. Our simulations demonstrated that hierarchical models without random effects for period-within-cluster, which do not account for any extra within-period correlation, performed poorly with greatly inflated Type I errors in many scenarios. In scenarios where extra within-period correlation was present, a hierarchical model with random effects for cluster and period-within-cluster only had correct Type I errors when there were large numbers of clusters; with small numbers of clusters, the error rate was inflated. We also found that generalised estimating equations did not give correct error rates in any scenarios considered. An unweighted cluster-level summary regression performed best overall, maintaining an error rate close to 5% for all scenarios, although it lost power when extra within-period correlation was present, especially for small numbers of clusters. Results from our simulation study show that it is important to model both levels of clustering in CRXO trials, and that any extra within-period correlation should be accounted for. Copyright © 2016 John Wiley & Sons, Ltd.
Hierarchical Star Formation in Turbulent Media: Evidence from Young Star Clusters

NASA Astrophysics Data System (ADS)

Grasha, K.; Elmegreen, B. G.; Calzetti, D.; Adamo, A.; Aloisi, A.; Bright, S. N.; Cook, D. O.; Dale, D. A.; Fumagalli, M.; Gallagher, J. S., III; Gouliermis, D. A.; Grebel, E. K.; Kahre, L.; Kim, H.; Krumholz, M. R.; Lee, J. C.; Messa, M.; Ryon, J. E.; Ubeda, L.

2017-06-01

We present an analysis of the positions and ages of young star clusters in eight local galaxies to investigate the connection between the age difference and separation of cluster pairs. We find that star clusters do not form uniformly but instead are distributed so that the age difference increases with the cluster pair separation to the 0.25-0.6 power, and that the maximum size over which star formation is physically correlated ranges from ~200 pc to ~1 kpc. The observed trends between age difference and separation suggest that cluster formation is hierarchical both in space and time: clusters that are close to each other are more similar in age than clusters born further apart. The temporal correlations between stellar aggregates have slopes that are consistent with predictions of turbulence acting as the primary driver of star formation. The velocity associated with the maximum size is proportional to the galaxy’s shear, suggesting that the galactic environment influences the maximum size of the star-forming structures.
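A power-law relation such as the one reported above (age difference growing as separation to some power) is often estimated as the slope of a straight line in log-log space. The sketch below shows that calculation on synthetic data. The function name, the data and the exponent 0.4 are illustrative assumptions, not values or procedures from the study.

```python
import math

def loglog_slope(separations, age_differences):
    """Least-squares slope of log(age difference) against log(separation)."""
    xs = [math.log(s) for s in separations]
    ys = [math.log(a) for a in age_differences]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic pairs that follow age_difference = 2 * separation**0.4 exactly.
seps = [10.0, 30.0, 100.0, 300.0, 1000.0]
ages = [2.0 * s ** 0.4 for s in seps]
slope = loglog_slope(seps, ages)
print(slope)
```

With real, noisy measurements the recovered slope would carry an uncertainty, and the fit would normally be restricted to separations below the scale at which the correlation flattens.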
Information extraction from dynamic PS-InSAR time series using machine learning

NASA Astrophysics Data System (ADS)

van de Kerkhof, B.; Pankratius, V.; Chang, L.; van Swol, R.; Hanssen, R. F.

2017-12-01

Due to the increasing number of SAR satellites, with shorter repeat intervals and higher resolutions, SAR data volumes are exploding. Time series analyses of SAR data, i.e. Persistent Scatterer (PS) InSAR, enable the deformation monitoring of the built environment at an unprecedented scale, with hundreds of scatterers per km², updated weekly. Potential hazards, e.g. due to failure of aging infrastructure, can be detected at an early stage. Yet, this requires the operational data processing of billions of measurement points, over hundreds of epochs, updating this data set dynamically as new data come in, and testing whether points (start to) behave in an anomalous way. Moreover, the quality of PS-InSAR measurements is ambiguous and heterogeneous, which will yield false positives and false negatives. Such analyses are numerically challenging. Here we extract relevant information from PS-InSAR time series using machine learning algorithms. We cluster (group together) time series with similar behaviour, even though they may not be spatially close, such that the results can be used for further analysis. First we reduce the dimensionality of the dataset in order to be able to cluster the data, since applying clustering techniques to high-dimensional datasets often yields unsatisfactory results. Our approach is to apply t-distributed Stochastic Neighbor Embedding (t-SNE), a machine learning algorithm for dimensionality reduction of high-dimensional data to a 2D or 3D map, and cluster this result using Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The results show that we are able to detect and cluster time series with similar behaviour, which is the starting point for more extensive analysis into the underlying driving mechanisms. The results of the methods are compared to conventional hypothesis testing as well as a Self-Organising Map (SOM) approach. 
Hypothesis testing is robust and takes the stochastic nature of the observations into account, but is time consuming. Therefore, we successively apply our machine learning approach and the hypothesis testing approach in order to benefit from both the reduced computation time of the machine learning approach and the robust quality metrics of hypothesis testing. We acknowledge support from NASA AISTNNX15AG84G (PI V. Pankratius)
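The clustering stage of the pipeline described above can be sketched with a minimal, brute-force DBSCAN. The sketch assumes the time series have already been reduced to 2D coordinates (for example by t-SNE, which is not reproduced here). The points, `eps` and `min_samples` values are made up for illustration, and the code is not the authors' implementation.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point, NOISE (-1) for outliers."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Brute-force range query; the result includes the point itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point, not expanded further
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster_id += 1
    return labels

# Hypothetical 2D embedding: two tight groups and one isolated point.
embedded = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]
labels = dbscan(embedded, eps=1.5, min_samples=3)
print(labels)
```

The brute-force neighbour search is quadratic in the number of points, so data volumes like those described in the abstract would need a spatial index or a library implementation.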
Rising prevalence of non-B HIV-1 subtypes in North Carolina and evidence for local onward transmission.

PubMed

Dennis, Ann M; Hué, Stephane; Learner, Emily; Sebastian, Joseph; Miller, William C; Eron, Joseph J

2017-01-01

HIV-1 diversity is increasing in North American and European cohorts which may have public health implications. However, little is known about non-B subtype diversity in the southern United States, despite the region being the epicenter of the nation's epidemic. We characterized HIV-1 diversity and transmission clusters to identify the extent to which non-B strains are transmitted locally. We conducted cross-sectional analyses of HIV-1 partial pol sequences collected from 1997 to 2014 from adults accessing routine clinical care in North Carolina (NC). Subtypes were evaluated using COMET and phylogenetic analysis. Putative transmission clusters were identified using maximum-likelihood trees. Clusters involving non-B strains were confirmed and their dates of origin were estimated using Bayesian phylogenetics. Data were combined with demographic information collected at the time of sample collection and country of origin for a subset of patients. Among 24,972 sequences from 15,246 persons, the non-B subtype prevalence increased from 0% to 3.46% over the study period. Of 325 persons with non-B subtypes, diversity was high with over 15 pure subtypes and recombinants; subtype C (28.9%) and CRF02_AG (24.0%) were most common. While identification of transmission clusters was lower for persons with non-B versus B subtypes, several local transmission clusters (≥3 persons) involving non-B subtypes were identified and all were presumably due to heterosexual transmission. Prevalence of non-B subtype diversity remains low in NC but a statistically significant rise was identified over time which likely reflects multiple importation. However, the combined phylogenetic clustering analysis reveals evidence for local onward transmission. Detection of these non-B clusters suggests heterosexual transmission and may guide diagnostic and prevention interventions.
Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study.

PubMed

Meghani, Salimah H; Knafl, George J

2017-02-10

To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of the cluster membership. This was a 3-mo prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out of pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. The analyses found 4 unique clusters: Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For a subset, the main underlying concern was type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%) and type of analgesic side effects (cluster 4, 21%), respectively. About one in four made trade-offs based on multiple concerns simultaneously including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis, to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only the belief, i.e., pain medications can mask changes in health or keep you from knowing what is going on in your body was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)]. 
Most patients appear to be driven by a single salient concern in using analgesia for cancer pain. Addressing these concerns, perhaps through real time clinical assessments, may improve patients' analgesic adherence patterns and cancer pain outcomes.
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

PubMed

Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

2015-10-01

Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Exploring relationships between Dairy Herd Improvement monitors of performance and the Transition Cow Index in Wisconsin dairy herds.

PubMed

Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B

2016-09-01

Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

PubMed

Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

2018-05-01

The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
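Single-linkage clustering at a SNP threshold, as described above, places two isolates in the same cluster whenever a chain of pairwise distances at or below the threshold connects them. The sketch below illustrates the idea with a small union-find structure. The isolate names and distances are invented, and the function is an assumption for illustration rather than the pipeline used in the study.

```python
def single_linkage_clusters(snp_distance, threshold):
    """Group isolates linked by chains of pairwise distances <= threshold.

    snp_distance  dict mapping (isolate_a, isolate_b) -> pairwise SNP count
    """
    isolates = sorted({name for pair in snp_distance for name in pair})
    parent = {name: name for name in isolates}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), dist in snp_distance.items():
        if dist <= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())

# Hypothetical pairwise SNP distances between four isolates.
distances = {
    ("A", "B"): 3, ("B", "C"): 4, ("A", "C"): 7,
    ("A", "D"): 40, ("B", "D"): 38, ("C", "D"): 41,
}
print(single_linkage_clusters(distances, threshold=5))
print(single_linkage_clusters(distances, threshold=0))
```

In this toy data A and C are 7 SNPs apart but still share a cluster at the 5-SNP threshold, because B links them. That chaining behaviour is characteristic of single linkage.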
Extracting Aggregation Free Energies of Mixed Clusters from Simulations of Small Systems: Application to Ionic Surfactant Micelles.

PubMed

Zhang, X; Patel, L A; Beckwith, O; Schneider, R; Weeden, C J; Kindt, J T

2017-11-14

Micelle cluster distributions from molecular dynamics simulations of a solvent-free coarse-grained model of sodium octyl sulfate (SOS) were analyzed using an improved method to extract equilibrium association constants from small-system simulations containing one or two micelle clusters at equilibrium with free surfactants and counterions. The statistical-thermodynamic and mathematical foundations of this partition-enabled analysis of cluster histograms (PEACH) approach are presented. A dramatic reduction in computational time for analysis was achieved through a strategy similar to the selector variable method to circumvent the need for exhaustive enumeration of the possible partitions of surfactants and counterions into clusters. Using statistics from a set of small-system (up to 60 SOS molecules) simulations as input, equilibrium association constants for micelle clusters were obtained as a function of both number of surfactants and number of associated counterions through a global fitting procedure. The resulting free energies were able to accurately predict micelle size and charge distributions in a large (560 molecule) system. The evolution of micelle size and charge with SOS concentration as predicted by the PEACH-derived free energies and by a phenomenological four-parameter model fit, along with the sensitivity of these predictions to variations in cluster definitions, are analyzed and discussed.
Clustering analysis of water distribution systems: identifying critical components and community impacts.

PubMed

Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D

2014-01-01

Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. Thereby, it is usually difficult to identify the key features of the properties of the system, and subsequently all the critical components within the system for a given purpose of design or control. One way is, however, to more explicitly visualize the network structure and interactions between components by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of the most critical components, and provides similar criticality prioritizations of them in a more computationally efficient time.
the-wizz: clustering redshift estimation for everyone

NASA Astrophysics Data System (ADS)

Morrison, C. B.; Hildebrandt, H.; Schmidt, S. J.; Baldry, I. K.; Bilicki, M.; Choi, A.; Erben, T.; Schneider, P.

2017-05-01

We present the-wizz, an open source and user-friendly software for estimating the redshift distributions of photometric galaxies with unknown redshifts by spatially cross-correlating them against a reference sample with known redshifts. The main benefit of the-wizz is in separating the angular pair finding and correlation estimation from the computation of the output clustering redshifts allowing anyone to create a clustering redshift for their sample without the intervention of an 'expert'. It allows the end user of a given survey to select any subsample of photometric galaxies with unknown redshifts, match this sample's catalogue indices into a value-added data file and produce a clustering redshift estimation for this sample in a fraction of the time it would take to run all the angular correlations needed to produce a clustering redshift. We show results with this software using photometric data from the Kilo-Degree Survey (KiDS) and spectroscopic redshifts from the Galaxy and Mass Assembly survey and the Sloan Digital Sky Survey. The results we present for KiDS are consistent with the redshift distributions used in a recent cosmic shear analysis from the survey. We also present results using a hybrid machine learning-clustering redshift analysis that enables the estimation of clustering redshifts for individual galaxies. the-wizz can be downloaded at http://github.com/morriscb/The-wiZZ/.

Bruker Biotyper Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry System for Identification of Nocardia, Rhodococcus, Kocuria, Gordonia, Tsukamurella, and Listeria Species

PubMed Central

Lee, Tai-Fen; Du, Shin-Hei; Teng, Shih-Hua; Liao, Chun-Hsing; Sheng, Wang-Hui; Teng, Lee-Jene

2014-01-01

We evaluated whether the Bruker Biotyper matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) system provides accurate species-level identifications of 147 isolates of aerobically growing Gram-positive rods (GPRs). The bacterial isolates included Nocardia (n = 74), Listeria (n = 39), Kocuria (n = 15), Rhodococcus (n = 10), Gordonia (n = 7), and Tsukamurella (n = 2) species, which had all been identified by conventional methods, molecular methods, or both. In total, 89.7% of Listeria monocytogenes, 80% of Rhodococcus species, 26.7% of Kocuria species, and 14.9% of Nocardia species (n = 11, all N. nova and N. otitidiscaviarum) were correctly identified to the species level (score values, ≥2.0). A clustering analysis of spectra generated by the Bruker Biotyper identified six clusters of Nocardia species, i.e., cluster 1 (N. cyriacigeorgica), cluster 2 (N. brasiliensis), cluster 3 (N. farcinica), cluster 4 (N. puris), cluster 5 (N. asiatica), and cluster 6 (N. beijingensis), based on the six peaks generated by ClinProTools with the genetic algorithm, i.e., m/z 2,774.477 (cluster 1), m/z 5,389.792 (cluster 2), m/z 6,505.720 (cluster 3), m/z 5,428.795 (cluster 4), m/z 6,525.326 (cluster 5), and m/z 16,085.216 (cluster 6). Two clusters of L. monocytogenes spectra were also found according to the five peaks, i.e., m/z 5,594.85, m/z 6,184.39, and m/z 11,187.31, for cluster 1 (serotype 1/2a) and m/z 5,601.21 and m/z 11,199.33 for cluster 2 (serotypes 1/2b and 4b). The Bruker Biotyper system was unable to accurately identify Nocardia (except for N. nova and N. otitidiscaviarum), Tsukamurella, or Gordonia species. Continuous expansion of the MALDI-TOF MS databases to include more GPRs is necessary. PMID:24759706
Comparative 1H NMR Metabolomic Urinalysis of People Diagnosed with Balkan Endemic Nephropathy, and Healthy Subjects, in Romania and Bulgaria: A Pilot Study

PubMed Central

Mantle, Peter; Modalca, Mirela; Nicholls, Andrew; Tatu, Calin; Tatu, Diana; Toncheva, Draga

2011-01-01

1H NMR spectroscopy of urine has been applied to exploring metabolomic differences between people diagnosed with Balkan endemic nephropathy (BEN), and treated by haemodialysis, and those without overt renal disease in Romania and Bulgaria. Convenience sampling was made from patients receiving haemodialysis in hospital and healthy controls in their village. Principal component analysis clustered healthy controls from both countries together. Bulgarian BEN patients clustered separately from controls, though in the same space. However, Romanian BEN patients not only also clustered away from controls but also clustered separately from the BEN patients in Bulgaria. Notably, the urinary metabolomic data of two people sampled as Romanian controls clustered within the Romanian BEN group. One of these had been suspected of incipient symptoms of BEN at the time of selection as a ‘healthy’ control. This implies, at first sight, that metabolomic analysis can be predictive of impending morbidity before conventional criteria can diagnose BEN. Separate clustering of BEN patients from Romania and Bulgaria could indicate difference in aetiology of this particular silent renal atrophy in different geographic foci across the Balkans. PMID:22069742
Discrete Wavelet Transform-Based Whole-Spectral and Subspectral Analysis for Improved Brain Tumor Clustering Using Single Voxel MR Spectroscopy.

PubMed

Yang, Guang; Nawaz, Tahir; Barrick, Thomas R; Howe, Franklyn A; Slabaugh, Greg

2015-12-01

Many approaches have been considered for automatic grading of brain tumors by means of pattern recognition with magnetic resonance spectroscopy (MRS). Providing an improved technique which can assist clinicians in accurately identifying brain tumor grades is our main objective. The proposed technique, which is based on the discrete wavelet transform (DWT) of whole-spectral or subspectral information of key metabolites, combined with unsupervised learning, inspects the separability of the extracted wavelet features from the MRS signal to aid the clustering. In total, we included 134 short echo time single voxel MRS spectra (SV MRS) in our study that cover normal controls, low grade and high grade tumors. The combination of DWT-based whole-spectral or subspectral analysis and unsupervised clustering achieved an overall clustering accuracy of 94.8% and a balanced error rate of 7.8%. To the best of our knowledge, it is the first study using DWT combined with unsupervised learning to cluster brain SV MRS. Instead of dimensionality reduction on SV MRS or feature selection using model fitting, our study provides an alternative method of extracting features to obtain promising clustering results.
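The abstract above reports both an overall accuracy and a balanced error rate (BER). One common definition of BER is the mean of the per-class error rates, which keeps a large class from dominating the score. The sketch below uses that definition, which is an assumption since the abstract does not spell out its formula. It also assumes cluster labels have already been mapped to class names, and the labels are invented.

```python
def balanced_error_rate(true_labels, predicted_labels):
    """Mean of the per-class error rates (1 - recall for each true class)."""
    classes = sorted(set(true_labels))
    per_class_error = []
    for cls in classes:
        idx = [i for i, t in enumerate(true_labels) if t == cls]
        wrong = sum(1 for i in idx if predicted_labels[i] != cls)
        per_class_error.append(wrong / len(idx))
    return sum(per_class_error) / len(classes)

def accuracy(true_labels, predicted_labels):
    """Fraction of items whose predicted label matches the true label."""
    hits = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return hits / len(true_labels)

# Hypothetical labels: 4 normal, 4 low-grade and 2 high-grade spectra.
truth = ["normal"] * 4 + ["low"] * 4 + ["high"] * 2
predicted = ["normal"] * 4 + ["low", "low", "low", "high"] + ["high", "low"]
ber = balanced_error_rate(truth, predicted)
acc = accuracy(truth, predicted)
print(ber, acc)
```

In this toy case the per-class error rates are 0.5, 0.25 and 0, so the BER is their mean even though 8 of the 10 items are labelled correctly.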
Using Clustering to Establish Climate Regimes from PCM Output

NASA Technical Reports Server (NTRS)

Oglesby, Robert; Arnold, James E. (Technical Monitor); Hoffman, Forrest; Hargrove, W. W.; Erickson, D.

2002-01-01

A multivariate statistical clustering technique based on the k-means algorithm of Hartigan has been used to extract patterns of climatological significance from 200 years of general circulation model (GCM) output. Originally developed and implemented on a Beowulf-style parallel computer constructed by Hoffman and Hargrove from surplus commodity desktop PCs, the high performance parallel clustering algorithm was previously applied to the derivation of ecoregions from map stacks of 9 and 25 geophysical conditions or variables for the conterminous U.S. at a resolution of 1 sq km. Now applied both across space and through time, the clustering technique yields temporally-varying climate regimes predicted by transient runs of the Parallel Climate Model (PCM). Using a business-as-usual (BAU) scenario and clustering four fields of significance to the global water cycle (surface temperature, precipitation, soil moisture, and snow depth) from 1871 through 2098, the authors' analysis shows an increase in spatial area occupied by the cluster or climate regime which typifies desert regions (i.e., an increase in desertification) and a decrease in the spatial area occupied by the climate regime typifying winter-time high latitude permafrost regions. The patterns of cluster changes have been analyzed to understand the predicted variability in the water cycle on global and continental scales. In addition, representative climate regimes were determined by taking three 10-year averages of the fields 100 years apart for northern hemisphere winter (December, January, and February) and summer (June, July, and August). The result is global maps of typical seasonal climate regimes for 100 years in the past, for the present, and for 100 years into the future.
Using three-dimensional data or phase space representations of these climate regimes (i.e., the cluster centroids), the authors demonstrate the portion of this phase space occupied by the land surface at all points in space and time. Any single spot on the globe will exist in one of these climate regimes at any single point in time. By incrementing time, that same spot will trace out a trajectory or orbit between and among these climate regimes (or atmospheric states) in phase (or state) space. When a geographic region enters a state it never previously visited, a climatic change is said to have occurred. Tracing out the entire trajectory of a single spot on the globe yields a 'manifold' in state space representing the shape of its predicted climate occupancy. This sort of analysis enables a researcher to more easily grasp the multivariate behavior of the climate system.
Toward An Understanding of Cluster Evolution: A Deep X-Ray Selected Cluster Catalog from ROSAT

NASA Technical Reports Server (NTRS)

Jones, Christine; Oliversen, Ronald (Technical Monitor)

2002-01-01

In the past year, we have focussed on studying individual clusters found in this sample with Chandra, as well as using Chandra to measure the luminosity-temperature relation for a sample of distant clusters identified through the ROSAT study, and finally we are continuing our study of fossil groups. For the luminosity-temperature study, we compared a sample of nearby clusters with a sample of distant clusters and, for the first time, measured a significant change in the relation as a function of redshift (Vikhlinin et al. in final preparation for submission to ApJ). We also used our ROSAT analysis to select and propose for Chandra observations of individual clusters. We are now analyzing the Chandra observations of the distant cluster A520, which appears to have undergone a recent merger. Finally, we have completed the analysis of the fossil groups identified in ROSAT observations. In the past few months, we have derived X-ray fluxes and luminosities as well as X-ray extents for an initial sample of 89 objects. Based on the X-ray extents and the lack of bright galaxies, we have identified 16 fossil groups. We are comparing their X-ray and optical properties with those of optically rich groups. A paper is being readied for submission (Jones, Forman, and Vikhlinin in preparation).
Initial Analysis of and Predictive Model Development for Weather Reroute Advisory Use

NASA Technical Reports Server (NTRS)

Arneson, Heather M.

2016-01-01

In response to severe weather conditions, traffic management coordinators specify reroutes to route air traffic around affected regions of airspace. Providing analysis and recommendations of available reroute options would assist the traffic management coordinators in making more efficient rerouting decisions. These recommendations can be developed by examining historical data to determine which previous reroute options were used in similar weather and traffic conditions; essentially, previous information is used to inform future decisions. This paper describes the initial steps and methodology used towards this goal. A method to extract relevant features from the large volume of weather data to quantify the convective weather scenario during a particular time range is presented. Similar routes are clustered. An algorithm to identify which cluster of reroute advisories was actually followed by pilots is described. Models built for fifteen of the top twenty most frequently used reroute clusters correctly predict the use of the cluster for over 60% of the test examples. Results are preliminary but indicate that the methodology is worth pursuing with modifications based on insight gained from this analysis.
Network-constrained spatio-temporal clustering analysis of traffic collisions in Jianghan District of Wuhan, China

PubMed Central

Fan, Yaxin; Zhu, Xinyan; Guo, Wei; Guo, Tao

2018-01-01

The analysis of traffic collisions is essential for urban safety and the sustainable development of the urban environment. Reducing the road traffic injuries and the financial losses caused by collisions is the most important goal of traffic management. In addition, traffic collisions are a major cause of traffic congestion, which is a serious issue that affects everyone in society. Therefore, traffic collision analysis is essential for all parties, including drivers, pedestrians, and traffic officers, to understand the road risks at a finer spatio-temporal scale. However, traffic collisions in the urban context are dynamic and complex. Thus, it is important to detect how the collision hotspots evolve over time through spatio-temporal clustering analysis. In addition, traffic collisions are not isolated events in space. The characteristics of the traffic collisions and their surrounding locations also influence the clusters. This work explores the spatio-temporal clustering patterns of traffic collisions by combining a set of network-constrained methods. These methods were tested using the traffic collision data in Jianghan District of Wuhan, China. The results demonstrated that these methods offer different perspectives of the spatio-temporal clustering patterns. The weighted network kernel density estimation provides an intuitive way to incorporate attribute information. The network cross K-function shows that there are varying clustering tendencies between traffic collisions and different types of POIs. The proposed network differential Local Moran’s I and network local indicators of mobility association provide straightforward and quantitative measures of the hotspot changes. This case study shows that these methods could help researchers, practitioners, and policy-makers to better understand the spatio-temporal clustering patterns of traffic collisions. PMID:29672551
Self-similarity of temperature profiles in distant galaxy clusters: the quest for a universal law

NASA Astrophysics Data System (ADS)

Baldi, A.; Ettori, S.; Molendi, S.; Gastaldello, F.

2012-09-01

Context: We present the XMM-Newton temperature profiles of 12 bright (L$_X$ > 4 × 10$^{44}$ erg s$^{-1}$) clusters of galaxies at 0.4 < z < 0.9, having an average temperature in the range 5 ≲ kT ≲ 11 keV. Aims: The main goal of this paper is to study for the first time the temperature profiles of a sample of high-redshift clusters, to investigate their properties, and to define a universal law to describe the temperature radial profiles in galaxy clusters as a function of both cosmic time and their state of relaxation. Methods: We performed a spatially resolved spectral analysis, using Cash statistics, to measure the temperature in the intracluster medium at different radii. Results: We extracted temperature profiles for the clusters in our sample, finding that all profiles are declining toward larger radii. The normalized temperature profiles (normalized by the mean temperature T$_{500}$) are found to be generally self-similar. The sample was subdivided into five cool-core (CC) and seven non cool-core (NCC) clusters by introducing a pseudo-entropy ratio σ = (T$_{IN}$/T$_{OUT}$) × (EM$_{IN}$/EM$_{OUT}$)$^{-1/3}$ and defining the objects with σ < 0.6 as CC clusters and those with σ ≥ 0.6 as NCC clusters. The profiles of CC and NCC clusters differ mainly in the central regions, with the latter exhibiting a slightly flatter central profile. A significant dependence of the temperature profiles on the pseudo-entropy ratio σ is detected by fitting a function of r and σ, showing an indication that the outer part of the profiles becomes steeper for higher values of σ (i.e. transitioning toward the NCC clusters). No significant evidence of redshift evolution could be found within the redshift range sampled by our clusters (0.4 < z < 0.9). A comparison of our high-z sample with intermediate clusters at 0.1 < z < 0.3 showed how the CC and NCC cluster temperature profiles have experienced some sort of evolution.
This can happen because higher z clusters are at a less advanced stage of their formation and did not have enough time to create a relaxed structure, which is characterized by a central temperature dip in CC clusters and by flatter profiles in NCC clusters. Conclusions: This is the first time that a systematic study of the temperature profiles of galaxy clusters at z > 0.4 has been attempted. We were able to define the closest possible relation to a universal law for the temperature profiles of galaxy clusters at 0.1 < z < 0.9, showing a dependence on both the relaxation state of the clusters and the redshift. Appendix A is only available in electronic form at http://www.aanda.org
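The cool-core classification described in this abstract reduces to a one-line formula and a threshold. The sketch below is only an illustration of that rule: the function names and the input values are hypothetical, and the abstract does not say how the inner and outer temperatures (T_IN, T_OUT) and emission measures (EM_IN, EM_OUT) are extracted, so they are taken here as given numbers.

```python
def pseudo_entropy_ratio(t_in, t_out, em_in, em_out):
    """Pseudo-entropy ratio sigma = (T_IN / T_OUT) * (EM_IN / EM_OUT) ** (-1/3)."""
    return (t_in / t_out) * (em_in / em_out) ** (-1.0 / 3.0)


def classify_cluster(sigma, threshold=0.6):
    """Apply the abstract's rule: sigma < 0.6 is cool-core (CC), otherwise NCC."""
    return "CC" if sigma < threshold else "NCC"


# Hypothetical inputs, chosen only to exercise both branches of the rule.
cool = pseudo_entropy_ratio(t_in=4.0, t_out=8.0, em_in=8.0, em_out=1.0)
flat = pseudo_entropy_ratio(t_in=7.0, t_out=7.0, em_in=1.0, em_out=1.0)
print(classify_cluster(cool), classify_cluster(flat))
```

A cooler, denser core lowers both factors of the ratio, which is why small values of σ mark cool-core systems.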
On the Surface Mapping using Individual Cluster Impacts

PubMed Central

Fernandez-Lima, F.A.; Eller, M.J.; DeBord, J.D.; Verkhoturov, S.V.; Della-Negra, S.; Schweikert, E.A.

2011-01-01

This paper describes the advantages of using single impacts of large cluster projectiles (e.g. C60 and Au400) for surface mapping and characterization. The analysis of co-emitted time-resolved photon spectra, electron distributions and characteristic secondary ions shows that they can be used as surface fingerprints for target composition, morphology and structure. Photon, electron and secondary ion emission increases with the projectile cluster size and energy. The observed, highly abundant secondary ion emission makes cluster projectiles good candidates for surface mapping of atomic and fragment ions (e.g., yield >1 per nominal mass) and molecular ions (e.g., few tens of percent in the 500 < m/z < 1500 range). PMID:22393269
A Survey of Variable Extragalactic Sources with XTE's All Sky Monitor (ASM)

NASA Technical Reports Server (NTRS)

Jernigan, Garrett

1998-01-01

The original goal of the project was the near real-time detection of AGN utilizing the SSC 3 of the ASM on XTE which does a deep integration on one 100 square degree region of the sky. While the SSC never performed sufficiently well to allow the success of this goal, the work on the project has led to the development of a new analysis method for coded aperture systems which has now been applied to ASM data for mapping regions near clusters of galaxies such as the Perseus Cluster and the Coma Cluster. Publications are in preparation that describe both the new method and the results from mapping clusters of galaxies.
Modified multidimensional scaling approach to analyze financial markets.

PubMed

Yin, Yi; Shang, Pengjian

2014-06-01

Detrended cross-correlation coefficient (σDCCA) and dynamic time warping (DTW) are introduced as the dissimilarity measures, respectively, while multidimensional scaling (MDS) is employed to translate the dissimilarities between daily price returns of 24 stock markets. We first propose MDS based on σDCCA dissimilarity and MDS based on DTW dissimilarity creatively, while MDS based on Euclidean dissimilarity is also employed to provide a reference for comparisons. We apply these methods in order to further visualize the clustering between stock markets. Moreover, we decide to confront MDS with an alternative visualization method, "Unweighed Average" clustering method, for comparison. The MDS analysis and "Unweighed Average" clustering method are employed based on the same dissimilarity. Through the results, we find that MDS gives us a more intuitive mapping for observing stable or emerging clusters of stock markets with similar behavior, while the MDS analysis based on σDCCA dissimilarity can provide more clear, detailed, and accurate information on the classification of the stock markets than the MDS analysis based on Euclidean dissimilarity. The MDS analysis based on DTW dissimilarity indicates more knowledge about the correlations between stock markets particularly and interestingly. Meanwhile, it reflects more abundant results on the clustering of stock markets and is much more intensive than the MDS analysis based on Euclidean dissimilarity. In addition, the graphs, originated from applying MDS methods based on σDCCA dissimilarity and DTW dissimilarity, may also guide the construction of multivariate econometric models.
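To make the DTW-based dissimilarity concrete, here is a minimal sketch of textbook dynamic time warping between return series, producing the pairwise matrix that an MDS routine would take as input. It is not the authors' implementation: the absolute-difference local cost, the absence of a warping window, and the function names are all assumptions, and the MDS step itself is left out.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def dissimilarity_matrix(series):
    """Symmetric matrix of pairwise DTW costs, one row per series."""
    k = len(series)
    return [[dtw_distance(series[i], series[j]) for j in range(k)]
            for i in range(k)]


# Hypothetical toy "return" series for three markets.
markets = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.2, 0.3], [0.3, 0.1, 0.0]]
for row in dissimilarity_matrix(markets):
    print(row)
```

Because warping lets one series repeat a value, the first two toy series are at zero cost from each other even though their lengths differ, which a Euclidean dissimilarity could not express.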
When the wind goes out of the sail - declining recovery expectations in the first weeks of back pain.

PubMed

Carstens, J K P; Shaw, W S; Boersma, K; Reme, S E; Pransky, G; Linton, S J

2014-02-01

Expectations for recovery are a known predictor for returning to work. Most studies seem to conclude that the higher the expectancy the better the outcome. However, the development of expectations over time is rarely researched and experimental studies show that realistic expectations rather than high expectancies are the most adaptive. This study aims to explore patterns of stability and change in expectations for recovery during the first weeks of a back-pain episode and how these patterns relate to other psychological variables and outcome. The study included 496 volunteer patients seeking treatment for work-related, acute back pain. The participants were measured with self-report scales of depression, fear of pain, life impact of pain, catastrophizing and expectations for recovery at two time points. A follow-up focusing on recovery and return to work was conducted 3 months later. A cluster analysis was conducted, categorizing the data on the trajectories of recovery expectations. Cluster analysis revealed four clusters regarding the development of expectations for recovery during a 2-week period after pain onset. Three out of four clusters showed stability in their expectations as well as corresponding levels of proximal psychological factors. The fourth cluster showed increases in distress and a decrease in expectations for recovery. This cluster also has poor odds ratios for returning to work and recovery. Decreases in expectancies for recovery seem as important as baseline values in terms of outcome, which has clinical and theoretical implications. © 2013 European Pain Federation - EFIC®
A pattern-mixture model approach for handling missing continuous outcome data in longitudinal cluster randomized trials.

PubMed

Fiero, Mallorie H; Hsu, Chiu-Hsieh; Bell, Melanie L

2017-11-20

We extend the pattern-mixture approach to handle missing continuous outcome data in longitudinal cluster randomized trials, which randomize groups of individuals to treatment arms, rather than the individuals themselves. Individuals who drop out at the same time point are grouped into the same dropout pattern. We approach extrapolation of the pattern-mixture model by applying multilevel multiple imputation, which imputes missing values while appropriately accounting for the hierarchical data structure found in cluster randomized trials. To assess parameters of interest under various missing data assumptions, imputed values are multiplied by a sensitivity parameter, k, which increases or decreases imputed values. Using simulated data, we show that estimates of parameters of interest can vary widely under differing missing data assumptions. We conduct a sensitivity analysis using real data from a cluster randomized trial by increasing k until the treatment effect inference changes. By performing a sensitivity analysis for missing data, researchers can assess whether certain missing data assumptions are reasonable for their cluster randomized trial. Copyright © 2017 John Wiley & Sons, Ltd.
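The role of the sensitivity parameter k can be illustrated with a deliberately simplified sketch. It assumes a single set of already-imputed values per arm, applies k to every imputed value in both arms, and uses a plain difference in arm means as the treatment effect; the multilevel multiple imputation and the model-based inference described in the abstract are not reproduced, and all names and numbers are hypothetical.

```python
def arm_mean(observed, imputed, k):
    """Mean outcome of one arm; missing entries (None) are replaced by k * imputed."""
    filled = [y if y is not None else k * imp
              for y, imp in zip(observed, imputed)]
    return sum(filled) / len(filled)


def treatment_effect(treat, control, k):
    """Difference in arm means after scaling imputed values by k."""
    return arm_mean(treat[0], treat[1], k) - arm_mean(control[0], control[1], k)


# Hypothetical data: (observed outcomes, imputed values used where observed is None).
TREAT = ([10.0, 12.0, None, None], [0.0, 0.0, 11.0, 11.0])
CONTROL = ([9.0, 11.0, 10.0, None], [0.0, 0.0, 0.0, 10.0])

# Scan k downward to see where the sign of the estimated effect changes.
for k in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    print(k, round(treatment_effect(TREAT, CONTROL, k), 3))
```

With more dropout in the treatment arm, shrinking the imputed values penalizes that arm more, so the estimated effect can change sign as k moves away from 1; that tipping point is what such a sensitivity analysis looks for.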
Symptom Clusters Change over Time in Women Receiving Adjuvant Chemotherapy for Breast Cancer

PubMed Central

Albusoul, Randa M.; Berger, Ann M.; Gay, Caryl L.; Janson, Susan L.; Lee, Kathryn A.

2017-01-01

Context: Patients with breast cancer receiving chemotherapy (CTX) experience multiple concurrent symptoms, but little is known about how symptoms change during and after treatment. Knowledge of the identity and trajectory of symptom clusters (SCs) would enhance measurement and management. Objectives: We aimed to identify SCs and their change over time from baseline to completion of breast cancer CTX. Methods: SCs were identified and assessed for change in 219 women from Nebraska at four times: baseline, during cycles #3 and #4 of CTX, and one month after finishing CTX. Ten symptoms were measured: two using the Hospital Anxiety and Depression Scale and eight using the Symptom Experience Scale. Exploratory factor analysis was conducted at each time point, then changes in SCs were evaluated at different times. Results: Two SCs were identified before and after initiating CTX: Gastrointestinal (GI) and Treatment-related (Tr). The number and type of symptoms in each cluster differed over time. Clusters were dynamic during CTX with changes in the number and type of symptoms. Only one Tr SC, which consisted of fatigue, pain, and sleep disturbance, was identified after CTX completion. Conclusion: SCs during CTX appear to be dynamic, changing over time from before until after CTX completion. Repeated assessments of SCs reveal symptoms that are present and when patients are most burdened and in need of additional support. PMID:28062343
CHIMERA: Top-down model for hierarchical, overlapping and directed cluster structures in directed and weighted complex networks

NASA Astrophysics Data System (ADS)

Franke, R.

2016-11-01

In many networks discovered in biology, medicine, neuroscience and other disciplines special properties like a certain degree distribution and hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy and interactions on a mesoscale. It is not trivial choosing an appropriate detection algorithm because there are multiple network, cluster and algorithmic properties to be considered. Edges can be weighted and/or directed, clusters overlap or build a hierarchy in several ways. Algorithms differ not only in runtime, memory requirements but also in allowed network and cluster properties. They are based on a specific definition of what a cluster is, too. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. But what is left is a precise sandbox model to build hierarchical, overlapping and directed clusters for undirected or directed, binary or weighted complex random networks on basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis) which will be introduced and described here for the first time.
Transcription factor clusters regulate genes in eukaryotic cells

PubMed Central

Hedlund, Erik G; Friemann, Rosmarie; Hohmann, Stefan

2017-01-01

Transcription is regulated through binding factors to gene promoters to activate or repress expression; however, the mechanisms by which factors find targets remain unclear. Using single-molecule fluorescence microscopy, we determined in vivo stoichiometry and spatiotemporal dynamics of a GFP tagged repressor, Mig1, from a paradigm signaling pathway of Saccharomyces cerevisiae. We find the repressor operates in clusters, which upon extracellular signal detection, translocate from the cytoplasm, bind to nuclear targets and turnover. Simulations of Mig1 configuration within a 3D yeast genome model combined with a promoter-specific, fluorescent translation reporter confirmed clusters are the functional unit of gene regulation. In vitro and structural analysis on reconstituted Mig1 suggests that clusters are stabilized by depletion forces between intrinsically disordered sequences. We observed similar clusters of a co-regulatory activator from a different pathway, supporting a generalized cluster model for transcription factors that reduces promoter search times through intersegment transfer while stabilizing gene expression. PMID:28841133
Equilibrium geometries, electronic and magnetic properties of small Au$_n$Ni$^-$ (n = 1-9) clusters

NASA Astrophysics Data System (ADS)

Tang, Cui-Ming; Chen, Xiao-Xu; Yang, Xiang-Dong

2014-05-01

Geometrical, electronic and magnetic properties of small Au$_n$Ni$^-$ (n = 1-9) clusters have been investigated based on density functional theory (DFT) at PW91P86 level. An extensive structural search shows that the relative stable structures of Au$_n$Ni$^-$ (n = 1-9) clusters adopt 2D structure for n = 1-5, 7 and 3D structure for n = 6, 8-9. The substitution of a Ni atom for an Au atom in the Au$_{n+1}^-$ cluster obviously changes the structure of the host cluster. Moreover, an odd-even alternation phenomenon has been found for HOMO-LUMO energy gaps, indicating that the relative stable structures of the Au$_n$Ni$^-$ clusters with odd-numbered gold atoms have a higher relative stability. Finally, the natural population analysis (NPA) and the vertical detachment energies (VDE) are studied, respectively. To the best of our knowledge, the theoretical values of VDE are reported here for the first time.
Construction and Utilization of a Beowulf Computing Cluster: A User's Perspective

NASA Technical Reports Server (NTRS)

Woods, Judy L.; West, Jeff S.; Sulyma, Peter R.

2000-01-01

Lockheed Martin Space Operations - Stennis Programs (LMSO) at the John C Stennis Space Center (NASA/SSC) has designed and built a Beowulf computer cluster which is owned by NASA/SSC and operated by LMSO. The design and construction of the cluster are detailed in this paper. The cluster is currently used for Computational Fluid Dynamics (CFD) simulations. The CFD codes in use and their applications are discussed. Examples of some of the work are also presented. Performance benchmark studies have been conducted for the CFD codes being run on the cluster. The results of two of the studies are presented and discussed. The cluster is not currently being utilized to its full potential; therefore, plans are underway to add more capabilities. These include the addition of structural, thermal, fluid, and acoustic Finite Element Analysis codes as well as real-time data acquisition and processing during test operations at NASA/SSC. These plans are discussed as well.
The geometry of chaotic dynamics — a complex network perspective

NASA Astrophysics Data System (ADS)

Donner, R. V.; Heitzig, J.; Donges, J. F.; Zou, Y.; Marwan, N.; Kurths, J.

2011-12-01

Recently, several complex network approaches to time series analysis have been developed and applied to study a wide range of model systems as well as real-world data, e.g., geophysical or financial time series. Among these techniques, recurrence-based concepts and prominently ɛ-recurrence networks, most faithfully represent the geometrical fine structure of the attractors underlying chaotic (and less interestingly non-chaotic) time series. In this paper we demonstrate that the well known graph theoretical properties local clustering coefficient and global (network) transitivity can meaningfully be exploited to define two new local and two new global measures of dimension in phase space: local upper and lower clustering dimension as well as global upper and lower transitivity dimension. Rigorous analytical as well as numerical results for self-similar sets and simple chaotic model systems suggest that these measures are well-behaved in most non-pathological situations and that they can be estimated reasonably well using ɛ-recurrence networks constructed from relatively short time series. Moreover, we study the relationship between clustering and transitivity dimensions on the one hand, and traditional measures like pointwise dimension or local Lyapunov dimension on the other hand. We also provide further evidence that the local clustering coefficients, or equivalently the local clustering dimensions, are useful for identifying unstable periodic orbits and other dynamically invariant objects from time series. Our results demonstrate that ɛ-recurrence networks exhibit an important link between dynamical systems and graph theory.
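The two graph quantities this abstract builds on, the local clustering coefficient and the global transitivity of an ɛ-recurrence network, can be computed as in the sketch below. It assumes states are tuples of coordinates and that two distinct states are linked when their maximum-norm distance is below ɛ; the choice of norm, the logistic-map demo, and the function names are assumptions, and the step from these values to the paper's dimension estimates is not reproduced here.

```python
from itertools import combinations


def recurrence_adjacency(points, eps):
    """Adjacency sets of the eps-recurrence network (max-norm, no self-loops)."""
    adj = [set() for _ in points]
    for i, j in combinations(range(len(points)), 2):
        if max(abs(p - q) for p, q in zip(points[i], points[j])) < eps:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def linked_neighbor_pairs(adj, v):
    """Number of pairs of neighbors of v that are themselves linked."""
    return sum(1 for a, b in combinations(sorted(adj[v]), 2) if b in adj[a])


def local_clustering(adj, v):
    """Fraction of neighbor pairs of v that are linked (0.0 if degree < 2)."""
    k = len(adj[v])
    return 2.0 * linked_neighbor_pairs(adj, v) / (k * (k - 1)) if k > 1 else 0.0


def transitivity(adj):
    """Closed connected triples divided by all connected triples."""
    closed = sum(linked_neighbor_pairs(adj, v) for v in range(len(adj)))
    triples = sum(len(n) * (len(n) - 1) // 2 for n in adj)
    return closed / triples if triples else 0.0


def logistic_states(n, x0=0.3):
    """Demo data: 2D delay embedding (x_t, x_{t+1}) of the logistic map x -> 4x(1-x)."""
    xs = [x0]
    for _ in range(n):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return list(zip(xs[:-1], xs[1:]))


net = recurrence_adjacency(logistic_states(200), eps=0.1)
print(transitivity(net), local_clustering(net, 0))
```

The pairwise distance loop is quadratic in the number of states, which is acceptable for the relatively short time series the abstract has in mind.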
Using time series structural characteristics to analyze grain prices in food insecure countries

USGS Publications Warehouse

Davenport, Frank; Funk, Chris

2015-01-01

Two components of food security monitoring are accurate forecasts of local grain prices and the ability to identify unusual price behavior. We evaluated a method that can both facilitate forecasts of cross-country grain price data and identify dissimilarities in price behavior across multiple markets. This method, characteristic based clustering (CBC), identifies similarities in multiple time series based on structural characteristics in the data. Here, we conducted a simulation experiment to determine if CBC can be used to improve the accuracy of maize price forecasts. We then compared forecast accuracies among clustered and non-clustered price series over a rolling time horizon. We found that the accuracy of forecasts on clusters of time series were equal to or worse than forecasts based on individual time series. However, in the following experiment we found that CBC was still useful for price analysis. We used the clusters to explore the similarity of price behavior among Kenyan maize markets. We found that price behavior in the isolated markets of Mandera and Marsabit has become increasingly dissimilar from markets in other Kenyan cities, and that these dissimilarities could not be explained solely by geographic distance. The structural isolation of Mandera and Marsabit that we find in this paper is supported by field studies on food security and market integration in Kenya. Our results suggest that a market with a unique price series (as measured by structural characteristics that differ from neighboring markets) may lack market integration and food security.

Mass spectrometric identification of intermediates in the O2-driven [4Fe-4S] to [2Fe-2S] cluster conversion in FNR

PubMed Central

Crack, Jason C.; Thomson, Andrew J.

2017-01-01

The iron-sulfur cluster containing protein Fumarate and Nitrate Reduction (FNR) is the master regulator for the switch between anaerobic and aerobic respiration in Escherichia coli and many other bacteria. The [4Fe-4S] cluster functions as the sensory module, undergoing reaction with O2 that leads to conversion to a [2Fe-2S] form with loss of high-affinity DNA binding. Here, we report studies of the FNR cluster conversion reaction using time-resolved electrospray ionization mass spectrometry. The data provide insight into the reaction, permitting the detection of cluster conversion intermediates and products, including a [3Fe-3S] cluster and persulfide-coordinated [2Fe-2S] clusters [[2Fe-2S](S)n, where n = 1 or 2]. Analysis of kinetic data revealed a branched mechanism in which cluster sulfide oxidation occurs in parallel with cluster conversion and not as a subsequent, secondary reaction to generate [2Fe-2S](S)n species. This methodology shows great potential for broad application to studies of protein cofactor–small molecule interactions. PMID:28373574
Interrupted time-series analysis yielded an effect estimate concordant with the cluster-randomized controlled trial result.

PubMed

Fretheim, Atle; Soumerai, Stephen B; Zhang, Fang; Oxman, Andrew D; Ross-Degnan, Dennis

2013-08-01

We reanalyzed the data from a cluster-randomized controlled trial (C-RCT) of a quality improvement intervention for prescribing antihypertensive medication. Our objective was to estimate the effectiveness of the intervention using both interrupted time-series (ITS) and RCT methods, and to compare the findings. We first conducted an ITS analysis using data only from the intervention arm of the trial because our main objective was to compare the findings from an ITS analysis with the findings from the C-RCT. We used segmented regression methods to estimate changes in level or slope coincident with the intervention, controlling for baseline trend. We analyzed the C-RCT data using generalized estimating equations. Last, we estimated the intervention effect by including data from both study groups and by conducting a controlled ITS analysis of the difference between the slope and level changes in the intervention and control groups. The estimates of absolute change resulting from the intervention were ITS analysis, 11.5% (95% confidence interval [CI]: 9.5, 13.5); C-RCT, 9.0% (95% CI: 4.9, 13.1); and the controlled ITS analysis, 14.0% (95% CI: 8.6, 19.4). ITS analysis can provide an effect estimate that is concordant with the results of a cluster-randomized trial. A broader range of comparisons from other RCTs would help to determine whether these are generalizable results. Copyright © 2013 Elsevier Inc. All rights reserved.
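The segmented regression idea (a change in level and a change in slope at the intervention, controlling for the baseline trend) can be sketched with ordinary least squares on four terms: intercept, time, a post-intervention indicator, and time since the intervention. This is an illustration under simplifying assumptions, not the authors' analysis: it uses plain OLS with no adjustment for autocorrelation, and the variable names and synthetic series are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def segmented_fit(y, t0):
    """OLS fit of y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t.

    Returns [b0, b1, b2, b3]: baseline level, baseline slope,
    level change at t0, and slope change after t0.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= t0 else 0.0
        rows.append([1.0, float(t), post, post * (t - t0)])
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)  # normal equations (X'X) b = X'y


# Synthetic monthly series: baseline 50 + 0.5 t, then +10 level and +0.2 slope from month 12.
series = [50 + 0.5 * t + (10 + 0.2 * (t - 12) if t >= 12 else 0.0) for t in range(24)]
print(segmented_fit(series, t0=12))
```

The third coefficient is the level change and the fourth the slope change coincident with the intervention; a controlled ITS analysis would fit the same terms to the difference between intervention and control series.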
Spatio-Temporal Trends and Risk Factors for Shigella from 2001 to 2011 in Jiangsu Province, People's Republic of China

PubMed Central

Bao, Changjun; Hu, Jianli; Liu, Wendong; Liang, Qi; Wu, Ying; Norris, Jessie; Peng, Zhihang; Yu, Rongbin; Shen, Hongbing; Chen, Feng

2014-01-01

Objective: This study aimed to describe the spatial and temporal trends of Shigella incidence rates in Jiangsu Province, People's Republic of China. It also intended to explore complex risk modes facilitating Shigella transmission. Methods: County-level incidence rates were obtained for analysis using geographic information system (GIS) tools. Trend surface and incidence maps were established to describe geographic distributions. Spatio-temporal cluster analysis and autocorrelation analysis were used for detecting clusters. Based on the number of monthly Shigella cases, a time series model was established using an autoregressive integrated moving average (ARIMA) approach. A spatial correlation analysis and a case-control study were conducted to identify risk factors contributing to Shigella transmissions. Results: The far southwestern and northwestern areas of Jiangsu were the most infected. A cluster was detected in southwestern Jiangsu (LLR = 11674.74, P<0.001). The time series model was established as ARIMA (1, 12, 0), which predicted well for cases from August to December, 2011. Highways and water sources potentially caused spatial variation in Shigella development in Jiangsu. The case-control study confirmed not washing hands before dinner (OR = 3.64) and not having access to a safe water source (OR = 2.04) as the main causes of Shigella in Jiangsu Province. Conclusion: Improvement of sanitation and hygiene should be strengthened in economically developed counties, while access to a safe water supply in impoverished areas should be increased at the same time. PMID:24416167
First passage times in homogeneous nucleation: Dependence on the total number of particles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yvinec, Romain; Bernard, Samuel; Pujo-Menjouet, Laurent

2016-01-21

Motivated by nucleation and molecular aggregation in physical, chemical, and biological settings, we present an extension to a thorough analysis of the stochastic self-assembly of a fixed number of identical particles in a finite volume. We study the statistics of times required for maximal clusters to be completed, starting from a pure-monomeric particle configuration. For finite volumes, we extend previous analytical approaches to the case of arbitrary size-dependent aggregation and fragmentation kinetic rates. For larger volumes, we develop a scaling framework to study the first assembly time behavior as a function of the total quantity of particles. We find that the mean time to first completion of a maximum-sized cluster may have a surprisingly weak dependence on the total number of particles. We highlight how higher statistics (variance, distribution) of the first passage time may nevertheless help to infer key parameters, such as the size of the maximum cluster. Finally, we present a framework to quantify formation of macroscopic sized clusters, which are (asymptotically) very unlikely and occur as a large deviation phenomenon from the mean-field limit. We argue that this framework is suitable to describe phase transition phenomena, as inherent infrequent stochastic processes, in contrast to classical nucleation theory.
First passage times in homogeneous nucleation: Dependence on the total number of particles

NASA Astrophysics Data System (ADS)

Yvinec, Romain; Bernard, Samuel; Hingant, Erwan; Pujo-Menjouet, Laurent

2016-01-01

Motivated by nucleation and molecular aggregation in physical, chemical, and biological settings, we present an extension to a thorough analysis of the stochastic self-assembly of a fixed number of identical particles in a finite volume. We study the statistics of times required for maximal clusters to be completed, starting from a pure-monomeric particle configuration. For finite volumes, we extend previous analytical approaches to the case of arbitrary size-dependent aggregation and fragmentation kinetic rates. For larger volumes, we develop a scaling framework to study the first assembly time behavior as a function of the total quantity of particles. We find that the mean time to first completion of a maximum-sized cluster may have a surprisingly weak dependence on the total number of particles. We highlight how higher statistics (variance, distribution) of the first passage time may nevertheless help to infer key parameters, such as the size of the maximum cluster. Finally, we present a framework to quantify formation of macroscopic sized clusters, which are (asymptotically) very unlikely and occur as a large deviation phenomenon from the mean-field limit. We argue that this framework is suitable to describe phase transition phenomena, as inherent infrequent stochastic processes, in contrast to classical nucleation theory.
High-Resolution Melting-Curve Analysis of Ligation-Mediated Real-Time PCR for Rapid Evaluation of an Epidemiological Outbreak of Extended-Spectrum-Beta-Lactamase-Producing Escherichia coli

PubMed Central

Woksepp, Hanna; Jernberg, Cecilia; Tärnberg, Maria; Ryberg, Anna; Brolund, Alma; Nordvall, Michaela; Olsson-Liljequist, Barbro; Wisell, Karin Tegmark; Monstein, Hans-Jürg; Nilsson, Lennart E.; Schön, Thomas

2011-01-01

Methods for the confirmation of nosocomial outbreaks of bacterial pathogens are complex, expensive, and time-consuming. Recently, a method based on ligation-mediated PCR (LM/PCR) using a low denaturation temperature which produces specific melting-profile patterns of DNA products has been described. Our objective was to further develop this method for real-time PCR and high-resolution melting analysis (HRM) in a single-tube system optimized in order to achieve results within 1 day. Following the optimization of LM/PCR for real-time PCR and HRM (LM/HRM), the method was applied for a nosocomial outbreak of extended-spectrum-beta-lactamase (ESBL)-producing and ST131-associated Escherichia coli isolates (n = 15) and control isolates (n = 29), including four previous clusters. The results from LM/HRM were compared to results from pulsed-field gel electrophoresis (PFGE), which served as the gold standard. All isolates from the nosocomial outbreak clustered by LM/HRM, which was confirmed by gel electrophoresis of the LM/PCR products and PFGE. Control isolates that clustered by LM/PCR (n = 4) but not by PFGE were resolved by confirmatory gel electrophoresis. We conclude that LM/HRM is a rapid method for the detection of nosocomial outbreaks of bacterial infections caused by ESBL-producing E. coli strains. It allows the analysis of isolates in a single-tube system within a day, and the discriminatory power is comparable to that of PFGE. PMID:21956981
High-resolution melting-curve analysis of ligation-mediated real-time PCR for rapid evaluation of an epidemiological outbreak of extended-spectrum-beta-lactamase-producing Escherichia coli.

PubMed

Woksepp, Hanna; Jernberg, Cecilia; Tärnberg, Maria; Ryberg, Anna; Brolund, Alma; Nordvall, Michaela; Olsson-Liljequist, Barbro; Wisell, Karin Tegmark; Monstein, Hans-Jürg; Nilsson, Lennart E; Schön, Thomas

2011-12-01

Methods for the confirmation of nosocomial outbreaks of bacterial pathogens are complex, expensive, and time-consuming. Recently, a method based on ligation-mediated PCR (LM/PCR) using a low denaturation temperature which produces specific melting-profile patterns of DNA products has been described. Our objective was to further develop this method for real-time PCR and high-resolution melting analysis (HRM) in a single-tube system optimized in order to achieve results within 1 day. Following the optimization of LM/PCR for real-time PCR and HRM (LM/HRM), the method was applied for a nosocomial outbreak of extended-spectrum-beta-lactamase (ESBL)-producing and ST131-associated Escherichia coli isolates (n = 15) and control isolates (n = 29), including four previous clusters. The results from LM/HRM were compared to results from pulsed-field gel electrophoresis (PFGE), which served as the gold standard. All isolates from the nosocomial outbreak clustered by LM/HRM, which was confirmed by gel electrophoresis of the LM/PCR products and PFGE. Control isolates that clustered by LM/PCR (n = 4) but not by PFGE were resolved by confirmatory gel electrophoresis. We conclude that LM/HRM is a rapid method for the detection of nosocomial outbreaks of bacterial infections caused by ESBL-producing E. coli strains. It allows the analysis of isolates in a single-tube system within a day, and the discriminatory power is comparable to that of PFGE.
The relationship of dynamical heterogeneity to the Adam-Gibbs and random first-order transition theories of glass formation.

PubMed

Starr, Francis W; Douglas, Jack F; Sastry, Srikanth

2013-03-28

We carefully examine common measures of dynamical heterogeneity for a model polymer melt and test how these scales compare with those hypothesized by the Adam and Gibbs (AG) and random first-order transition (RFOT) theories of relaxation in glass-forming liquids. To this end, we first analyze clusters of highly mobile particles, the string-like collective motion of these mobile particles, and clusters of relative low mobility. We show that the time scale of the high-mobility clusters and strings is associated with a diffusive time scale, while the low-mobility particles' time scale relates to a structural relaxation time. The difference of the characteristic times for the high- and low-mobility particles naturally explains the well-known decoupling of diffusion and structural relaxation time scales. Despite the inherent difference of dynamics between high- and low-mobility particles, we find a high degree of similarity in the geometrical structure of these particle clusters. In particular, we show that the fractal dimensions of these clusters are consistent with those of swollen branched polymers or branched polymers with screened excluded-volume interactions, corresponding to lattice animals and percolation clusters, respectively. In contrast, the fractal dimension of the strings crosses over from that of self-avoiding walks for small strings, to simple random walks for longer, more strongly interacting, strings, corresponding to flexible polymers with screened excluded-volume interactions. We examine the appropriateness of identifying the size scales of either mobile particle clusters or strings with the size of cooperatively rearranging regions (CRR) in the AG and RFOT theories. We find that the string size appears to be the most consistent measure of CRR for both the AG and RFOT models. Identifying strings or clusters with the "mosaic" length of the RFOT model relaxes the conventional assumption that the "entropic droplets" are compact. 
We also confirm the validity of the entropy formulation of the AG theory, constraining the exponent values of the RFOT theory. This constraint, together with the analysis of size scales, enables us to estimate the characteristic exponents of RFOT.
A parsimonious characterization of change in global age-specific and total fertility rates

PubMed Central

2018-01-01

This study aims to understand trends in global fertility from 1950 to 2010 through the analysis of age-specific fertility rates. This approach incorporates both the overall level, as when the total fertility rate is modeled, and different patterns of age-specific fertility to examine the relationship between changes in age-specific fertility and fertility decline. Singular value decomposition is used to capture the variation in age-specific fertility curves while reducing the number of dimensions, allowing curves to be described nearly fully with three parameters. Regional patterns and trends over time are evident in parameter values, suggesting this method provides a useful tool for considering fertility decline globally. The second and third parameters were analyzed using model-based clustering to examine patterns of age-specific fertility over time and place; four clusters were obtained. A country’s demographic transition can be traced through time by membership in the different clusters, and regional patterns in the trajectories through time and with fertility decline are identified. PMID:29377899
Joint model-based clustering of nonlinear longitudinal trajectories and associated time-to-event data analysis, linked by latent class membership: with application to AIDS clinical studies.

PubMed

Huang, Yangxin; Lu, Xiaosun; Chen, Jiaqing; Liang, Juan; Zangmeister, Miriam

2017-10-27

Longitudinal and time-to-event data are often observed together. Finite mixture models are currently used to analyze nonlinear heterogeneous longitudinal data, which, by releasing the homogeneity restriction of nonlinear mixed-effects (NLME) models, can cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, and be associated with clinically important time-to-event data. This article develops a joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework. The proposed joint models and method are applied to a real AIDS clinical trial data set, followed by simulation studies to assess the performance of the proposed joint model and a naive two-step model, in which finite mixture model and Cox model are fitted separately.
[Optimization of cluster analysis based on drug resistance profiles of MRSA isolates].

PubMed

Tani, Hiroya; Kishi, Takahiko; Gotoh, Minehiro; Yamagishi, Yuka; Mikamo, Hiroshige

2015-12-01

We examined 402 methicillin-resistant Staphylococcus aureus (MRSA) strains isolated from clinical specimens in our hospital between November 19, 2010 and December 27, 2011 to evaluate the similarity between cluster analysis of drug susceptibility tests and pulsed-field gel electrophoresis (PFGE). The results showed that the 402 strains tested were classified into 27 PFGE patterns (151 subtypes of patterns). Cluster analyses of drug susceptibility tests, with the cut-off distance chosen to yield a similar classification capability, showed favorable results. With the MIC method, in which minimum inhibitory concentration (MIC) values were used directly, the level of agreement with PFGE was 74.2% when 15 drugs were tested. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) was effective when the cut-off distance was 16. Using the SIR method, in which susceptible (S), intermediate (I), and resistant (R) were coded as 0, 2, and 3, respectively, according to the Clinical and Laboratory Standards Institute (CLSI) criteria, the level of agreement with PFGE was 75.9% when the number of drugs tested was 17, the method used for clustering was UPGMA, and the cut-off distance was 3.6. In addition, to assess the reproducibility of the results, 10 strains were randomly sampled from the overall test and subjected to cluster analysis. This was repeated 100 times under the same conditions. The results indicated good reproducibility, with the level of agreement with PFGE showing a mean of 82.0%, standard deviation of 12.1%, and mode of 90.0% for the MIC method and a mean of 80.0%, standard deviation of 13.4%, and mode of 90.0% for the SIR method. In summary, cluster analysis of drug susceptibility tests is useful for the epidemiological analysis of MRSA.
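To make the SIR-coded UPGMA procedure concrete, here is a small sketch under stated assumptions. The abstract does not say which distance metric was used, so Manhattan distance is an assumption here; the four resistance profiles are hypothetical, and the function name `upgma_clusters` is invented. Only the 0/2/3 coding and the cut-off of 3.6 come from the abstract.

```python
def upgma_clusters(profiles, cutoff):
    """Average-linkage (UPGMA) agglomerative clustering.

    Repeatedly merges the two clusters with the smallest average pairwise
    distance and stops once that distance exceeds `cutoff`.
    Returns a list of clusters, each a sorted list of profile indices.
    """
    def dist(a, b):
        # Manhattan distance between two coded profiles (an assumption).
        return sum(abs(x - y) for x, y in zip(a, b))

    def average_link(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        total = sum(dist(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > 1:
        d, a, b = min(
            (average_link(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical isolates, four drugs each, coded S=0, I=2, R=3.
isolates = [
    [0, 0, 3, 3],
    [0, 0, 3, 2],
    [3, 3, 0, 0],
    [3, 2, 0, 0],
]
groups = upgma_clusters(isolates, cutoff=3.6)
```

With these made-up profiles the first two and last two isolates differ by one susceptibility category each, so they pair up well below the cut-off, while the two pairs stay separate.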
Cluster analysis of quantitative MRI T2 and T1ρ relaxation times of cartilage identifies differences between healthy and ACL-injured individuals at 3T.

PubMed

Monu, U D; Jordan, C D; Samuelson, B L; Hargreaves, B A; Gold, G E; McWalter, E J

2017-04-01

To identify focal lesions of elevated MRI T2 and T1ρ relaxation times in articular cartilage of an ACL-injured group using a novel cluster analysis technique. Eighteen ACL-injured patients underwent 3T MRI T2 and T1ρ relaxometry at baseline, 6 months and 1 year and six healthy volunteers at baseline, 1 day and 1 year. Clusters of contiguous pixels above or below T2 and T1ρ intensity and area thresholds were identified on a projection map of the 3D femoral cartilage surface. The total area of femoral cartilage plate covered by clusters (%CA) was split into areas above (%CA+) and below (%CA-) the thresholds and the differences in %CA(+ or -) over time in the ACL-injured group were determined using the Wilcoxon signed rank test. %CA+ was greater in the ACL-injured patients than the healthy volunteers at 6 months and 1 year with average %CA+ of 5.2 ± 4.0% (p = 0.0054) and 6.6 ± 3.7% (p = 0.0041) for T2 and 6.2 ± 7.1% (p = 0.063) and 8.2 ± 6.9% (p = 0.042) for T1ρ, respectively. %CA- at 6 months and 1 year was 3.0 ± 1.8% (p > 0.1) and 5.9 ± 5.0% (p > 0.1) for T2 and 4.4 ± 4.9% (p > 0.1) and 4.5 ± 4.6% (p > 0.1) for T1ρ, respectively. With the proposed cluster analysis technique, we have quantified cartilage lesion coverage and demonstrated that the ACL-injured group had greater areas of elevated T2 and T1ρ relaxation times as compared to healthy volunteers. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
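The idea of counting contiguous above-threshold pixels and reporting the covered area as a percentage can be sketched as below. This is only an illustration of the general technique: the 4-connectivity rule, the use of `None` for pixels outside the cartilage map, the threshold values, and the function name `cluster_coverage` are all assumptions, and the small map is hypothetical.

```python
from collections import deque


def cluster_coverage(pixels, intensity_threshold, min_area):
    """Percent of valid pixels lying in qualifying clusters (a %CA+ analogue).

    A cluster is a 4-connected group of pixels whose value exceeds
    `intensity_threshold`; it qualifies if it has at least `min_area` pixels.
    `pixels` is a 2D list in which None marks locations outside the map.
    """
    rows, cols = len(pixels), len(pixels[0])
    seen = [[False] * cols for _ in range(rows)]
    valid = sum(1 for row in pixels for v in row if v is not None)
    covered = 0
    for r in range(rows):
        for c in range(cols):
            v = pixels[r][c]
            if seen[r][c] or v is None or v <= intensity_threshold:
                continue
            # Breadth-first search over one connected cluster.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]:
                        nv = pixels[ny][nx]
                        if nv is not None and nv > intensity_threshold:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if area >= min_area:
                covered += area
    return 100.0 * covered / valid if valid else 0.0


# Hypothetical 4x4 map: one 3-pixel cluster and one isolated hot pixel.
demo_map = [
    [1, 1, 9, 9],
    [1, 1, 9, 1],
    [9, 1, 1, 1],
    [None, 1, 1, 1],
]
percent_above = cluster_coverage(demo_map, intensity_threshold=5, min_area=2)
```

The isolated pixel in the third row is excluded by the area threshold, which is the point of combining an intensity threshold with an area threshold. The same function applied to negated values would give a below-threshold (%CA-) analogue.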
The engine design engine. A clustered computer platform for the aerodynamic inverse design and analysis of a full engine

NASA Technical Reports Server (NTRS)

Sanz, J.; Pischel, K.; Hubler, D.

1992-01-01

An application for parallel computation on a combined cluster of powerful workstations and supercomputers was developed. A Parallel Virtual Machine (PVM) is used as the message-passing language in a macro-tasking parallelization of the Aerodynamic Inverse Design and Analysis for a Full Engine computer code. The heterogeneous nature of the cluster is perfectly handled by the controlling host machine. Communication is established via Ethernet with the TCP/IP protocol over an open network. A reasonable overhead is imposed for internode communication, rendering an efficient utilization of the engaged processors. Perhaps one of the most interesting features of the system is its versatile nature, which permits the use of whichever available computational resources are experiencing less use at a given point in time.
IoT Big-Data Centred Knowledge Granule Analytic and Cluster Framework for BI Applications: A Case Base Analysis.

PubMed

Chang, Hsien-Tsung; Mishra, Nilamadhab; Lin, Chung-Chih

2015-01-01

The current rapid growth of Internet of Things (IoT) in various commercial and non-commercial sectors has led to the deposition of large-scale IoT data, of which the time-critical analytic and clustering of knowledge granules represent highly thought-provoking application possibilities. The objective of the present work is to inspect the structural analysis and clustering of complex knowledge granules in an IoT big-data environment. In this work, we propose a knowledge granule analytic and clustering (KGAC) framework that explores and assembles knowledge granules from IoT big-data arrays for a business intelligence (BI) application. Our work implements neuro-fuzzy analytic architecture rather than a standard fuzzified approach to discover the complex knowledge granules. Furthermore, we implement an enhanced knowledge granule clustering (e-KGC) mechanism that is more elastic than previous techniques when assembling the tactical and explicit complex knowledge granules from IoT big-data arrays. The analysis and discussion presented here show that the proposed framework and mechanism can be implemented to extract knowledge granules from an IoT big-data array in such a way as to present knowledge of strategic value to executives and enable knowledge users to perform further BI actions.
IoT Big-Data Centred Knowledge Granule Analytic and Cluster Framework for BI Applications: A Case Base Analysis

PubMed Central

Chang, Hsien-Tsung; Mishra, Nilamadhab; Lin, Chung-Chih

2015-01-01

The current rapid growth of Internet of Things (IoT) in various commercial and non-commercial sectors has led to the deposition of large-scale IoT data, of which the time-critical analytic and clustering of knowledge granules represent highly thought-provoking application possibilities. The objective of the present work is to inspect the structural analysis and clustering of complex knowledge granules in an IoT big-data environment. In this work, we propose a knowledge granule analytic and clustering (KGAC) framework that explores and assembles knowledge granules from IoT big-data arrays for a business intelligence (BI) application. Our work implements neuro-fuzzy analytic architecture rather than a standard fuzzified approach to discover the complex knowledge granules. Furthermore, we implement an enhanced knowledge granule clustering (e-KGC) mechanism that is more elastic than previous techniques when assembling the tactical and explicit complex knowledge granules from IoT big-data arrays. The analysis and discussion presented here show that the proposed framework and mechanism can be implemented to extract knowledge granules from an IoT big-data array in such a way as to present knowledge of strategic value to executives and enable knowledge users to perform further BI actions. PMID:26600156
Exploring the effects of climatic variables on monthly precipitation variation using a continuous wavelet-based multiscale entropy approach.

PubMed

Roushangar, Kiyoumars; Alizadeh, Farhad; Adamowski, Jan

2018-08-01

Understanding precipitation on a regional basis is an important component of water resources planning and management. The present study outlines a methodology based on continuous wavelet transform (CWT) and multiscale entropy (CWME), combined with self-organizing map (SOM) and k-means clustering techniques, to measure and analyze the complexity of precipitation. Historical monthly precipitation data from 1960 to 2010 at 31 rain gauges across Iran were preprocessed by CWT. The multi-resolution CWT approach segregated the major features of the original precipitation series by unfolding the structure of the time series which was often ambiguous. The entropy concept was then applied to components obtained from CWT to measure dispersion, uncertainty, disorder, and diversification of subcomponents. Based on different validity indices, k-means clustering captured homogenous areas more accurately, and additional analysis was performed based on the outcome of this approach. The 31 rain gauges in this study were clustered into 6 groups, each one having a unique CWME pattern across different time scales. The results of clustering showed that hydrologic similarity (multiscale variation of precipitation) was not based on geographic contiguity. According to the pattern of entropy across the scales, each cluster was assigned an entropy signature that provided an estimation of the entropy pattern of precipitation data in each cluster. Based on the pattern of mean CWME for each cluster, a characteristic signature was assigned, which provided an estimation of the CWME of a cluster across scales of 1-2, 3-8, and 9-13 months relative to other stations. The validity of the homogeneous clusters demonstrated the usefulness of the proposed approach to regionalize precipitation. 
Further analysis based on wavelet coherence (WTC) was performed by selecting central rain gauges in each cluster and analyzing them against temperature, wind, the Multivariate ENSO Index (MEI), and the East Atlantic (EA) and North Atlantic Oscillation (NAO) indices. The results revealed that all climatic features except NAO influenced precipitation in Iran during the 1960-2010 period. Copyright © 2018 Elsevier Inc. All rights reserved.
Functional Interference Clusters in Cancer Patients With Bone Metastases: A Secondary Analysis of RTOG 9714

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chow, Edward; James, Jennifer; Barsevick, Andrea

Purpose: To explore the relationships (clusters) among the functional interference items in the Brief Pain Inventory (BPI) in patients with bone metastases. Methods: Patients enrolled in the Radiation Therapy Oncology Group (RTOG) 9714 bone metastases study were eligible. Patients were assessed at baseline and 4, 8, and 12 weeks after randomization for the palliative radiotherapy with the BPI, which consists of seven functional items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. Principal component analysis with varimax rotation was used to determine the clusters between the functional items at baseline and the follow-up. Cronbach's alpha was used to determine the consistency and reliability of each cluster at baseline and follow-up. Results: There were 448 male and 461 female patients, with a median age of 67 years. There were two functional interference clusters at baseline, which accounted for 71% of the total variance. The first cluster (physical interference) included normal work and walking ability, which accounted for 58% of the total variance. The second cluster (psychosocial interference) included relations with others and sleep, which accounted for 13% of the total variance. The Cronbach's alpha statistics were 0.83 and 0.80, respectively. The functional clusters changed at week 12 in responders but persisted through week 12 in nonresponders. Conclusion: Palliative radiotherapy is effective in reducing bone pain. Functional interference component clusters exist in patients treated for bone metastases. These clusters changed over time in this study, possibly attributable to treatment. Further research is needed to examine these effects.
Performance comparison analysis library communication cluster system using merge sort

NASA Astrophysics Data System (ADS)

Wulandari, D. A. R.; Ramadhan, M. E.

2018-04-01

Computing began with single processors; multiprocessor systems were introduced to reduce computing time. This second paradigm is known as parallel computing, of which a cluster is one example. A cluster needs a communication protocol for processing, one of which is the Message Passing Interface (MPI). MPI has many implementations, two of which are OpenMPI and MPICH2. The performance of a cluster machine depends on the match between the performance characteristics of the communication library and the characteristics of the problem, so this study aims to compare the performance of these libraries in handling parallel computing processes. The case study covers MPICH2 and OpenMPI and uses a sorting problem, solved with the merge sort method, to measure the performance of the cluster system. The research method is to implement OpenMPI and MPICH2 on a Linux-based cluster of five virtual computers and then analyze the performance of the system under different test scenarios using three parameters: execution time, speedup, and efficiency. The results showed that, as data size increases, the average speedup and efficiency of OpenMPI and MPICH2 tend to increase but then decrease at large data sizes. Increasing the data size does not necessarily increase speedup and efficiency, only execution time, for example at a data size of 100000. OpenMPI has a greater execution time than MPICH2; for example, at a data size of 1000 the average execution time is 0.009721 with MPICH2 and 0.003895 with OpenMPI. OpenMPI can customize communication needs.
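The three performance parameters named above follow the usual definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of processors. The sketch below pairs those formulas with a merge sort whose work is split into chunks and merged, run serially to mimic the scatter, sort, gather pattern. It does not use MPI, and the function names, timings, and data are hypothetical.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_sort(values):
    """Classic recursive merge sort; returns a new sorted list."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


def chunked_merge_sort(values, n_workers):
    """Sort up to n_workers slices independently, then merge the results.

    Runs serially; in a real cluster each slice would go to a separate process.
    """
    size = max(1, -(-len(values) // n_workers))  # ceiling division, at least 1
    chunks = [merge_sort(values[k:k + size]) for k in range(0, len(values), size)]
    result = []
    for chunk in chunks:
        result = merge(result, chunk)
    return result


def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / n_procs


data = [5, 3, 8, 1, 9, 2, 7]
sorted_data = chunked_merge_sort(data, n_workers=3)

# Hypothetical timings: 8.0 s on one node, 2.0 s on five nodes.
s = speedup(8.0, 2.0)
e = efficiency(8.0, 2.0, 5)
```

An efficiency below 1, as in this made-up case, reflects communication and merge overhead, which is consistent with the abstract's observation that speedup and efficiency do not grow indefinitely with data size.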
On the determination of age and mass functions of stars in young open star clusters from the analysis of their luminosity functions

NASA Astrophysics Data System (ADS)

Piskunov, A. E.; Belikov, A. N.; Kharchenko, N. V.; Sagar, R.; Subramaniam, A.

2004-04-01

We construct the observed luminosity functions of the remote young open clusters NGC 2383, 2384, 4103, 4755, 7510 and Hogg 15 from CCD observations of them. The observed LFs are corrected for field star contamination determined with the help of a Galactic star count model. In the case of Hogg 15 and NGC 2383 we also consider the additional contamination from neighbouring clusters NGC 4609 and 2384, respectively. These corrections provide a realistic pattern of cluster LF in the vicinity of the main-sequence (MS) turn-on point and at fainter magnitudes reveal the so-called H-feature arising as a result of the transition of the pre-MS phase to the MS, which is dependent on the cluster age. The theoretical LFs are constructed representing a cluster population model with continuous star formation for a short time-scale and a power-law initial mass function (IMF), and these are fitted to the observed LF. As a result, we are able to determine for each cluster a set of parameters describing the cluster population (the age, duration of star formation, IMF slope and percentage of field star contamination). It is found that in spite of the non-monotonic behaviour of observed LFs, cluster IMFs can be described as power-law functions with slopes similar to Salpeter's value. The present main-sequence turn-on cluster ages are several times lower than those derived from the fitting of theoretical isochrones to the turn-off region of the upper main sequences.
Identification of new participants in the rainbow trout (Oncorhynchus mykiss) oocyte maturation and ovulation processes using cDNA microarrays

PubMed Central

Bobe, Julien; Montfort, Jerôme; Nguyen, Thaovi; Fostier, Alexis

2006-01-01

Background The hormonal control of oocyte maturation and ovulation as well as the molecular mechanisms of nuclear maturation have been thoroughly studied in fish. In contrast, the other molecular events occurring in the ovary during post-vitellogenesis have received far less attention. Methods Nylon microarrays displaying 9152 rainbow trout cDNAs were hybridized using RNA samples originating from ovarian tissue collected during late vitellogenesis, post-vitellogenesis and oocyte maturation. Differentially expressed genes were identified using a statistical analysis. A supervised clustering analysis was performed using only differentially expressed genes in order to identify gene clusters exhibiting similar expression profiles. In addition, specific genes were selected and their preovulatory ovarian expression was analyzed using real-time PCR. Results From the statistical analysis, 310 differentially expressed genes were identified. Among those genes, 90 were up-regulated at the time of oocyte maturation while 220 exhibited an opposite pattern. After clustering analysis, 90 clones belonging to 3 gene clusters exhibiting the most remarkable expression patterns were kept for further analysis. Using real-time PCR analysis, we observed a strong up-regulation of ion and water transport genes such as aquaporin 4 (aqp4) and pendrin (slc26). In addition, a dramatic up-regulation of vasotocin (avt) gene was observed. Furthermore, angiotensin-converting-enzyme 2 (ace2), coagulation factor V (cf5), adam 22, and the chemokine cxcl14 genes exhibited a sharp up-regulation at the time of oocyte maturation. Finally, ovarian aromatase (cyp19a1) exhibited a dramatic down-regulation over the post-vitellogenic period while a down-regulation of Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (cmah) was observed at the time of oocyte maturation. 
Conclusion We showed the over- or under-expression of more than 300 genes, most of them being previously unstudied or unknown in the fish preovulatory ovary. Our data confirmed the down-regulation of estrogen synthesis genes during the preovulatory period. In addition, the strong up-regulation of aqp4 and slc26 genes prior to ovulation suggests their participation in the oocyte hydration process occurring at that time. Furthermore, among the most up-regulated clones, several genes such as cxcl14, ace2, adam22, cf5 have pro-inflammatory, vasodilatory, proteolytic and coagulatory functions. The identity and expression patterns of those genes support the theory comparing ovulation to an inflammatory-like reaction. PMID:16872517

Using Cluster Analysis to Examine Husband-Wife Decision Making

ERIC Educational Resources Information Center

Bonds-Raacke, Jennifer M.

2006-01-01

Cluster analysis has a rich history in many disciplines and although cluster analysis has been used in clinical psychology to identify types of disorders, its use in other areas of psychology has been less popular. The purpose of the current experiments was to use cluster analysis to investigate husband-wife decision making. Cluster analysis was…
HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

NASA Astrophysics Data System (ADS)

Schellenberger, G.; Reiprich, T. H.

2017-08-01

The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects, that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias to our hydrostatic masses; possible biases in this mass-mass comparison are discussed including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).
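A luminosity-mass scaling relation of the form L ∝ M^b is linear in log space, so its slope can be estimated by a straight-line fit to (log M, log L). The sketch below uses plain least squares on hypothetical, noise-free values; it ignores measurement errors, intrinsic scatter, and the selection-bias correction the abstract refers to, and the function name `fit_power_law_slope` is invented.

```python
import math


def fit_power_law_slope(masses, luminosities):
    """Least-squares fit of log10(L) = intercept + slope * log10(M).

    Returns (slope, intercept) in log10 space.
    """
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(lum) for lum in luminosities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical objects generated from L = 2 * (M / 1e14)**1.34 (arbitrary units).
masses = [1e13, 3e13, 1e14, 3e14, 1e15]
luminosities = [2.0 * (m / 1e14) ** 1.34 for m in masses]
slope, intercept = fit_power_law_slope(masses, luminosities)
```

Fitting subsets of the sample separately, as the abstract does for groups and clusters, would amount to calling the same function on each subset and comparing the slopes.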
Space-Time Analysis of Testicular Cancer Clusters Using Residential Histories: A Case-Control Study in Denmark

PubMed Central

Sloan, Chantel D.; Nordsborg, Rikke B.; Jacquez, Geoffrey M.; Raaschou-Nielsen, Ole; Meliker, Jaymie R.

2015-01-01

Though the etiology is largely unknown, testicular cancer incidence has seen recent significant increases in northern Europe and throughout many Western regions. The most common cancer in males under age 40, age period cohort models have posited exposures in the in utero environment or in early childhood as possible causes of increased risk of testicular cancer. Some of these factors may be tied to geography through being associated with behavioral, cultural, sociodemographic or built environment characteristics. If so, this could result in detectable geographic clusters of cases that could lead to hypotheses regarding environmental targets for intervention. Given a latency period between exposure to an environmental carcinogen and testicular cancer diagnosis, mobility histories are beneficial for spatial cluster analyses. Nearest-neighbor based Q-statistics allow for the incorporation of changes in residency in spatial disease cluster detection. Using these methods, a space-time cluster analysis was conducted on a population-wide case-control population selected from the Danish Cancer Registry with mobility histories since 1971 extracted from the Danish Civil Registration System. Cases (N=3297) were diagnosed between 1991 and 2003, and two sets of controls (N=3297 for each set) matched on sex and date of birth were included in the study. We also examined spatial patterns in maternal residential history for those cases and controls born in 1971 or later (N= 589 case-control pairs). Several small clusters were detected when aligning individuals by year prior to diagnosis, age at diagnosis and calendar year of diagnosis. However, the largest of these clusters contained only 2 statistically significant individuals at their center, and were not replicated in SaTScan spatial-only analyses which are less susceptible to multiple testing bias. We found little evidence of local clusters in residential histories of testicular cancer cases in this Danish population. 
PMID:25756204
Space-time analysis of testicular cancer clusters using residential histories: a case-control study in Denmark.

PubMed

Sloan, Chantel D; Nordsborg, Rikke B; Jacquez, Geoffrey M; Raaschou-Nielsen, Ole; Meliker, Jaymie R

2015-01-01

Though the etiology is largely unknown, testicular cancer incidence has seen recent significant increases in northern Europe and throughout many Western regions. The most common cancer in males under age 40, age period cohort models have posited exposures in the in utero environment or in early childhood as possible causes of increased risk of testicular cancer. Some of these factors may be tied to geography through being associated with behavioral, cultural, sociodemographic or built environment characteristics. If so, this could result in detectable geographic clusters of cases that could lead to hypotheses regarding environmental targets for intervention. Given a latency period between exposure to an environmental carcinogen and testicular cancer diagnosis, mobility histories are beneficial for spatial cluster analyses. Nearest-neighbor based Q-statistics allow for the incorporation of changes in residency in spatial disease cluster detection. Using these methods, a space-time cluster analysis was conducted on a population-wide case-control population selected from the Danish Cancer Registry with mobility histories since 1971 extracted from the Danish Civil Registration System. Cases (N=3297) were diagnosed between 1991 and 2003, and two sets of controls (N=3297 for each set) matched on sex and date of birth were included in the study. We also examined spatial patterns in maternal residential history for those cases and controls born in 1971 or later (N= 589 case-control pairs). Several small clusters were detected when aligning individuals by year prior to diagnosis, age at diagnosis and calendar year of diagnosis. However, the largest of these clusters contained only 2 statistically significant individuals at their center, and were not replicated in SaTScan spatial-only analyses which are less susceptible to multiple testing bias. We found little evidence of local clusters in residential histories of testicular cancer cases in this Danish population.
Socioscape: Real-Time Analysis of Dynamic Heterogeneous Networks In Complex Socio-Cultural Systems

DTIC Science & Technology

2015-10-22

Cluster Mixed-Membership Blockmodel for Time-Evolving Networks, Proceedings of the 14th International Conference on Artificial Intelligence and...Learning With Simultaneous Orthogonal Matching Pursuit, Proceedings of the 13th International Conference on Artificial Intelligence and Statistics
Relation between financial market structure and the real economy: comparison between clustering methods.

PubMed

Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

2015-01-01

We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].
Interactive visual exploration and analysis of origin-destination data

NASA Astrophysics Data System (ADS)

Ding, Linfang; Meng, Liqiu; Yang, Jian; Krisp, Jukka M.

2018-05-01

In this paper, we propose a visual analytics approach for the exploration of spatiotemporal interaction patterns of massive origin-destination data. Firstly, we visually query the movement database for data at certain time windows. Secondly, we conduct interactive clustering to allow the users to select input variables/features (e.g., origins, destinations, distance, and duration) and to adjust clustering parameters (e.g. distance threshold). The agglomerative hierarchical clustering method is applied for the multivariate clustering of the origin-destination data. Thirdly, we design a parallel coordinates plot for visualizing the precomputed clusters and for further exploration of interesting clusters. Finally, we propose a gradient line rendering technique to show the spatial and directional distribution of origin-destination clusters on a map view. We implement the visual analytics approach in a web-based interactive environment and apply it to real-world floating car data from Shanghai. The experiment results show the origin/destination hotspots and their spatial interaction patterns. They also demonstrate the effectiveness of our proposed approach.
Assessment of the climatic potential for tourism in Iran through biometeorology clustering.

PubMed

Roshan, Gholamreza; Yousefi, Robabe; Błażejczyk, Krzysztof

2018-04-01

This study presents a spatiotemporal analysis of bioclimatic comfort conditions for Iran using mean daily meteorological data from 1995 to 2014, analyzed through the Physiological Equivalent Temperature (PET) and Universal Thermal Climate Index (UTCI) indices and bioclimatic clustering. The results demonstrate that, owing to the climate variability across Iran during the year, there is at any point in time a location with climatic conditions suitable for tourism. Mean values demonstrate maxima in bioclimatic comfort indices for the country in late winter and spring and minima in summer. Seven statistically significant clusters in bioclimatic indices were identified. Comparing these with the clustering performed on PET and UTCI shows the maximum overlap between the two indices. The results further indicate that the most appropriate bioclimatic clustering for Iran comprises seven clusters. Clustering locations according to their climatic suitability for tourism provides a valuable contribution to tourism management in the country, particularly through marketing destinations to maximize tourist flow.
K-Means Algorithm Performance Analysis With Determining The Value Of Starting Centroid With Random And KD-Tree Method

NASA Astrophysics Data System (ADS)

Sirait, Kamson; Tulus; Budhiarti Nababan, Erna

2017-12-01

Clustering methods that have high accuracy and time efficiency are necessary for the filtering process. One well-known and widely applied clustering method is K-Means. In practice, the choice of the initial cluster centers greatly affects the results of the K-Means algorithm. This research compares the results of K-Means clustering when the starting centroids are determined randomly and by a KD-Tree method. On a data set of 1000 student academic records, clustered to identify students at risk of dropping out, random initial centroids gave a sum of squared errors (SSE) of 952972 for the quality variable and 232.48 for the GPA variable, whereas KD-Tree initial centroids gave an SSE of 504302 for the quality variable and 214.37 for the GPA variable. The smaller SSE values indicate that K-Means clustering with KD-Tree initial centroid selection is more accurate than K-Means clustering with random initial centroid selection.
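The comparison described in this abstract can be illustrated with a minimal sketch. It runs plain Lloyd-style K-Means from two different starting points and reports the SSE of each run. The data are synthetic, and `median_split_init` is a hypothetical KD-tree-style stand-in (repeated median splits along the widest axis); it is not the authors' KD-Tree procedure, which the abstract does not specify.

```python
import random

def sse(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(pt, centroids[lab]))
        for pt, lab in zip(points, labels)
    )

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centroids[j])))
        for pt in points
    ]

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations from the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new = []
        for j, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            new.append(tuple(sum(col) / len(members) for col in zip(*members)) if members else old)
        if new == centroids:
            break
        centroids = new
    labels = assign(points, centroids)
    return centroids, labels, sse(points, centroids, labels)

def median_split_init(points, k):
    """Hypothetical KD-tree-style start (requires k <= len(points)):
    repeatedly split the largest cell at the median of its widest axis,
    then use the mean of each cell as a starting centroid."""
    cells = [list(points)]
    while len(cells) < k:
        cells.sort(key=len)
        cell = cells.pop()  # largest cell
        dims = range(len(cell[0]))
        axis = max(dims, key=lambda d: max(p[d] for p in cell) - min(p[d] for p in cell))
        cell.sort(key=lambda p: p[axis])
        mid = len(cell) // 2
        cells += [cell[:mid], cell[mid:]]
    return [tuple(sum(col) / len(c) for col in zip(*c)) for c in cells]

# Synthetic two-dimensional data (made up for illustration only).
rng = random.Random(0)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(50)]

random_start = rng.sample(points, 3)
split_start = median_split_init(points, 3)
print("random start SSE:      ", kmeans(points, random_start)[2])
print("median-split start SSE:", kmeans(points, split_start)[2])
```

Because Lloyd iterations never increase the SSE, any difference between the two final values comes from the starting centroids alone, which is the effect the study measures.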
Relation between Financial Market Structure and the Real Economy: Comparison between Clustering Methods

PubMed Central

Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T.

2015-01-01

We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging. PMID:25786703
Pattern recognition approach to the subsequent event of damaging earthquakes in Italy

NASA Astrophysics Data System (ADS)

Gentili, S.; Di Giovambattista, R.

2017-05-01

In this study, we investigate the occurrence of large aftershocks following the most significant earthquakes that occurred in Italy after 1980. In accordance with previous studies (Vorobieva and Panza, 1993; Vorobieva, 1999), we group clusters associated with mainshocks into two categories: "type A" if, given a main shock of magnitude M, the subsequent strongest earthquake in the cluster has magnitude ≥ M - 1, or "type B" otherwise. In this paper, we apply a pattern recognition approach using statistical features to foresee the class of the analysed clusters. The classification of the two categories is based on some features of the time, space, and magnitude distribution of the aftershocks. Specifically, we analyse the temporal evolution of the radiated energy at different elapsed times after the mainshock, the spatio-temporal evolution of the aftershocks occurring within a few days, and the probability of a strong earthquake. An attempt is made to classify the studied region into smaller seismic zones with a prevalence of type A and B clusters. We demonstrate that the two types of clusters have distinct preferred geographic locations inside the Italian territory that likely reflected key properties of the deforming regions, different crustal domains and faulting style. We use decision trees as classifiers of single features to characterize the features depending on the cluster type. The performance of the classification is tested by the Leave-One-Out method. The analysis is performed on different time-spans after the mainshock to simulate the dependence of the accuracy on the information available as data increased over a longer period with increasing time after the mainshock.
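The type A / type B rule and the leave-one-out test lend themselves to a short sketch. The labelling function follows the rule stated in the abstract. The single-feature threshold classifier (`fit_stump`) is only a one-split stand-in for the decision trees mentioned there, and the feature values are invented for illustration.

```python
def cluster_type(mainshock_mag, later_mags):
    """Type A if the strongest subsequent event has magnitude >= M - 1, else type B."""
    strongest = max(later_mags, default=float("-inf"))
    return "A" if strongest >= mainshock_mag - 1 else "B"

def fit_stump(xs, ys):
    """One-split classifier on a single feature: choose the threshold and
    orientation that misclassify the fewest training items."""
    vals = sorted(set(xs))
    # Candidate thresholds are midpoints between neighbouring feature values.
    thresholds = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or vals
    best = None
    for t in thresholds:
        for above, below in (("A", "B"), ("B", "A")):
            errors = sum((above if x >= t else below) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above, below)
    _, t, above, below = best
    return lambda x: above if x >= t else below

def leave_one_out_accuracy(xs, ys):
    """Train on all items but one, test on the held-out item, and average the hits."""
    hits = 0
    for i in range(len(xs)):
        model = fit_stump(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += model(xs[i]) == ys[i]
    return hits / len(xs)

# Invented values of one hypothetical feature and the matching cluster labels.
feature = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
labels = ["B", "B", "B", "A", "A", "A"]
print(cluster_type(6.0, [5.0, 4.2]), leave_one_out_accuracy(feature, labels))
```

Leave-one-out suits small catalogues such as this one because every cluster is used for testing exactly once while the classifier is still trained on nearly all of the data.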
Spatial Analysis of Hemorrhagic Fever with Renal Syndrome in Zibo City, China, 2009–2012

PubMed Central

Wang, Ling; Yang, Shuxia; Zhang, Ling; Cao, Haixia; Zhang, Yan; Hu, Haodong; Zhai, Shenyong

2013-01-01

Background Hemorrhagic fever with renal syndrome (HFRS) is highly endemic in mainland China, where human cases account for 90% of the total global cases. Zibo City is one of the most seriously affected areas in Shandong Province, China, with the HFRS incidence increasing sharply from 2009 to 2012. However, the hotspots of HFRS in Zibo remained unclear. Thus, a spatial analysis was conducted with the aim to explore the spatial, spatial-temporal and seasonal patterns of HFRS in Zibo from 2009 to 2012, and to provide guidance for formulating regional prevention and control strategies. Methods The study was based on the reported cases of HFRS from the National Notifiable Disease Surveillance System. Annualized incidence maps and seasonal incidence maps were produced to analyze the spatial and seasonal distribution of HFRS in Zibo City. Then spatial scan statistics and space-time scan statistics were conducted to identify clusters of HFRS. Results There were 200 cases reported in Zibo City during the 4-year study period. One most likely cluster and one secondary cluster for high incidence of HFRS were identified by the space-time analysis. The most likely cluster was found to exist at Yiyuan County in October to December 2012. The human infections in the fall and winter reflected a seasonal characteristic pattern of Hantaan virus (HTNV) transmission. The secondary cluster was detected at the center of Zibo in May to June 2009, presenting a seasonal characteristic of Seoul virus (SEOV) transmission. Conclusion To control and prevent HFRS in Zibo city, the comprehensive preventive strategy should be implemented in the southern areas of Zibo in autumn and in the northern areas of Zibo in spring. PMID:23840719
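As a rough illustration of how a space-time scan statistic ranks candidate clusters, the sketch below scores every window (a set of neighbouring areas over a run of consecutive periods) with the Poisson log-likelihood ratio commonly used for high-rate clusters, and returns the best one. The area names, counts, populations and coordinates are invented, and the exhaustive search is a simplification; it is not the software or parameter set used in the study, and the Monte Carlo step normally used to assess significance is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases
    out of `total` cases overall; zero unless the window has excess cases."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside

def most_likely_cluster(cases, pop, coords, max_units=2):
    """cases[u]: counts per period; pop[u]: population; coords[u]: (x, y).
    Returns (llr, areas, (first_period, last_period)) for the best window."""
    units = list(cases)
    periods = len(next(iter(cases.values())))
    total = sum(sum(v) for v in cases.values())
    total_pop = sum(pop.values())
    best = (0.0, None, None)
    for center in units:
        by_dist = sorted(units, key=lambda u: math.dist(coords[center], coords[u]))
        for size in range(1, max_units + 1):
            zone = by_dist[:size]
            zone_share = sum(pop[u] for u in zone) / total_pop
            for start in range(periods):
                for end in range(start, periods):
                    c = sum(sum(cases[u][start:end + 1]) for u in zone)
                    # Null model: cases proportional to population and window length.
                    e = total * zone_share * (end - start + 1) / periods
                    llr = poisson_llr(c, e, total)
                    if llr > best[0]:
                        best = (llr, tuple(sorted(zone)), (start, end))
    return best

# Invented example: three areas, four periods, an excess in "south" in the last period.
cases = {"north": [1, 1, 1, 1], "centre": [1, 1, 1, 1], "south": [1, 1, 1, 9]}
pop = {"north": 1000, "centre": 1000, "south": 1000}
coords = {"north": (0, 2), "centre": (0, 1), "south": (0, 0)}
print(most_likely_cluster(cases, pop, coords))
```

The window with the largest ratio plays the role of the "most likely cluster" reported in the abstract; secondary clusters would be the next-best windows that do not overlap it.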
Functional feature embedded space mapping of fMRI data.

PubMed

Hu, Jin; Tian, Jie; Yang, Lei

2006-01-01

We have proposed a new method for fMRI data analysis which is called Functional Feature Embedded Space Mapping (FFESM). Our work mainly focuses on the experimental design with periodic stimuli which can be described by a number of Fourier coefficients in the frequency domain. A nonlinear dimension reduction technique Isomap is applied to the high dimensional features obtained from frequency domain of the fMRI data for the first time. Finally, the presence of activated time series is identified by the clustering method in which the information theoretic criterion of minimum description length (MDL) is used to estimate the number of clusters. The feasibility of our algorithm is demonstrated by real human experiments. Although we focus on analyzing periodic fMRI data, the approach can be extended to analyze non-periodic fMRI data (event-related fMRI) by replacing the Fourier analysis with a wavelet analysis.
Electronic medical records and physician stress in primary care: results from the MEMO Study

PubMed Central

Babbott, Stewart; Manwell, Linda Baier; Brown, Roger; Montague, Enid; Williams, Eric; Schwartz, Mark; Hess, Erik; Linzer, Mark

2014-01-01

Background Little has been written about physician stress that may be associated with electronic medical records (EMR). Objective We assessed relationships between the number of EMR functions, primary care work conditions, and physician satisfaction, stress and burnout. Design and participants 379 primary care physicians and 92 managers at 92 clinics from New York City and the upper Midwest participating in the 2001–5 Minimizing Error, Maximizing Outcome (MEMO) Study. A latent class analysis identified clusters of physicians within clinics with low, medium and high EMR functions. Main measures We assessed physician-reported stress, burnout, satisfaction, and intent to leave the practice, and predictors including time pressure during visits. We used a two-level regression model to estimate the mean response for each physician cluster to each outcome, adjusting for physician age, sex, specialty, work hours and years using the EMR. Effect sizes (ES) of these relationships were considered small (0.14), moderate (0.39), and large (0.61). Key results Compared to the low EMR cluster, physicians in the moderate EMR cluster reported more stress (ES 0.35, p=0.03) and lower satisfaction (ES −0.45, p=0.006). Physicians in the high EMR cluster indicated lower satisfaction than low EMR cluster physicians (ES −0.39, p=0.01). Time pressure was associated with significantly more burnout, dissatisfaction and intent to leave only within the high EMR cluster. Conclusions Stress may rise for physicians with a moderate number of EMR functions. Time pressure was associated with poor physician outcomes mainly in the high EMR cluster. Work redesign may address these stressors. PMID:24005796
Low-level processing for real-time image analysis

NASA Technical Reports Server (NTRS)

Eskenazi, R.; Wilf, J. M.

1979-01-01

A system that detects object outlines in television images in real time is described. A high-speed pipeline processor transforms the raw image into an edge map and a microprocessor, which is integrated into the system, clusters the edges, and represents them as chain codes. Image statistics, useful for higher level tasks such as pattern recognition, are computed by the microprocessor. Peak intensity and peak gradient values are extracted within a programmable window and are used for iris and focus control. The algorithms implemented in hardware and the pipeline processor architecture are described. The strategy for partitioning functions in the pipeline was chosen to make the implementation modular. The microprocessor interface allows flexible and adaptive control of the feature extraction process. The software algorithms for clustering edge segments, creating chain codes, and computing image statistics are also discussed. A strategy for real time image analysis that uses this system is given.
Cluster analysis of word frequency dynamics

NASA Astrophysics Data System (ADS)

Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

2015-01-01

This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.
Three-dimensional visualization of cultural clusters in the 1878 yellow fever epidemic of New Orleans

PubMed Central

Curtis, Andrew J

2008-01-01

Background An epidemic may exhibit different spatial patterns with a change in geographic scale, with each scale having different conduits and impediments to disease spread. Mapping disease at each of these scales often reveals different cluster patterns. This paper will consider this change of geographic scale in an analysis of yellow fever deaths for New Orleans in 1878. Global clustering for the whole city, will be followed by a focus on the French Quarter, then clusters of that area, and finally street-level patterns of a single cluster. The three-dimensional visualization capabilities of a GIS will be used as part of a cluster creation process that incorporates physical buildings in calculating mortality-to-mortality distance. Including nativity of the deceased will also capture cultural connection. Results Twenty-two yellow fever clusters were identified for the French Quarter. These generally mirror the results of other global cluster and density surfaces created for the entire epidemic in New Orleans. However, the addition of building-distance, and disease specific time frame between deaths reveal that disease spread contains a cultural component. Same nativity mortality clusters emerge in a similar time frame irrespective of proximity. Italian nativity mortalities were far more densely grouped than any of the other cohorts. A final examination of mortalities for one of the nativity clusters reveals that further sub-division is present, and that this pattern would only be revealed at this scale (street level) of investigation. Conclusion Disease spread in an epidemic is complex resulting from a combination of geographic distance, geographic distance with specific connection to the built environment, disease-specific time frame between deaths, impediments such as herd immunity, and social or cultural connection. 
This research has shown that cultural connection may be more important than simple proximity, which in turn might mean traditional quarantine measures should be re-evaluated. PMID:18721469
Three-dimensional visualization of cultural clusters in the 1878 yellow fever epidemic of New Orleans.

PubMed

Curtis, Andrew J

2008-08-22

An epidemic may exhibit different spatial patterns with a change in geographic scale, with each scale having different conduits and impediments to disease spread. Mapping disease at each of these scales often reveals different cluster patterns. This paper will consider this change of geographic scale in an analysis of yellow fever deaths for New Orleans in 1878. Global clustering for the whole city, will be followed by a focus on the French Quarter, then clusters of that area, and finally street-level patterns of a single cluster. The three-dimensional visualization capabilities of a GIS will be used as part of a cluster creation process that incorporates physical buildings in calculating mortality-to-mortality distance. Including nativity of the deceased will also capture cultural connection. Twenty-two yellow fever clusters were identified for the French Quarter. These generally mirror the results of other global cluster and density surfaces created for the entire epidemic in New Orleans. However, the addition of building-distance, and disease specific time frame between deaths reveal that disease spread contains a cultural component. Same nativity mortality clusters emerge in a similar time frame irrespective of proximity. Italian nativity mortalities were far more densely grouped than any of the other cohorts. A final examination of mortalities for one of the nativity clusters reveals that further sub-division is present, and that this pattern would only be revealed at this scale (street level) of investigation. Disease spread in an epidemic is complex resulting from a combination of geographic distance, geographic distance with specific connection to the built environment, disease-specific time frame between deaths, impediments such as herd immunity, and social or cultural connection. 
This research has shown that cultural connection may be more important than simple proximity, which in turn might mean traditional quarantine measures should be re-evaluated.
A Proposal to Investigate Outstanding Problems in Astronomy

NASA Technical Reports Server (NTRS)

Ford, Holland

2003-01-01

During the past year the ACS science team has concentrated on analyzing ACS observations, writing papers, and disseminating our results to the astronomy community at conferences and workshops around the world. We also have put considerable effort into getting our results to the public via public lectures and through press releases. Taking a very broad view of our program, we are investigating the evolution of galaxies and clusters of galaxies from their birth, approximately one billion years after the beginning of the Universe, to the present. We have found and characterized a population of galaxies that are no more than 1.4 billion years old. These may well be the Universe's first generation of infant galaxies. Looking at the Universe 500,000 years later, we see what appears to be a cluster of galaxies just beginning to form (a proto-cluster) around a luminous radio galaxy. Moving forward in time and closer to the present, we are studying clusters of galaxies that are less than half the age of the Universe. Our observations and analysis lead us to the important conclusion that the elliptical galaxies in these clusters must have had their last significant star formation some three billion years earlier, which is about the time when the proto-cluster was forming. Coming still closer to home, we are observing nearby massive clusters of galaxies that are approximately 12 billion years old. The gravity from these large aggregates of dark and luminous matter is so strong it warps space-time itself, and makes the cluster act as a cosmic telescope that magnifies the distant galaxies behind the cluster. We used the magnified (or lensed) galaxies to map the distribution of the dominant matter within the clusters, which is the so-called dark matter (the matter is invisible, and its nature is unknown). We also are using these cosmic telescopes to study the distant lensed galaxies that would otherwise be too small and too faint to be seen even by Hubble and the ACS.
Clustering of health-related behaviors, health outcomes and demographics in Dutch adolescents: a cross-sectional study.

PubMed

Busch, Vincent; Van Stel, Henk F; Schrijvers, Augustinus J P; de Leeuw, Johannes R J

2013-12-04

Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were derived via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Four distinct behavioral patterns describe the analyzed individual behaviors: 1- risk-prone behavior, 2- bully behavior, 3- problematic screen time use, and 4- sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. 
These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning.

Clustering of health-related behaviors, health outcomes and demographics in Dutch adolescents: a cross-sectional study

PubMed Central

2013-01-01

Background Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Methods Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were derived via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Results Four distinct behavioral patterns describe the analyzed individual behaviors: 1- risk-prone behavior, 2- bully behavior, 3- problematic screen time use, and 4- sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. Conclusions The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. 
In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning. PMID:24305509
The Membership and Distance of the Open Cluster Collinder 419

DTIC Science & Technology

2010-09-01

distance based upon new spectral classifications of the brighter members, UBV photometry, and an analysis of astrometric and photometric data from the...photometry of the fainter cluster members in Section 4. Our results are summarized in Section 5. 2. SPECTROSCOPY AND REDDENING OF THE BRIGHTER STARS...
Statistical indicators of collective behavior and functional clusters in gene networks of yeast

NASA Astrophysics Data System (ADS)

Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

2006-03-01

We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
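Illustrative sketch (not taken from the record above): one way to build a correlation network and measure its largest connected component, as the abstract describes. The Pearson measure, the use of absolute correlation, the threshold values, and the toy `profiles` data are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def giant_component_size(series, threshold):
    """Link genes whose |correlation| >= threshold; return the largest component size."""
    parent = list(range(len(series)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(series)), 2):
        if abs(pearson(series[i], series[j])) >= threshold:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(series)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical toy expression profiles (rows are genes, columns are time points).
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.1, 5.9, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 1.0, 3.0],
]
strict = giant_component_size(profiles, 0.9)
loose = giant_component_size(profiles, 0.4)
```

Lowering the threshold and watching the largest component grow is a simple way to look for the kind of percolation point the abstract mentions.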
Remarkable Second-Order Optical Nonlinearity of Nano-Sized Au Cluster: A TDDFT Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Kechen; Li, Jun; Lin, Chensheng

2004-04-21

The dipole polarizability, static first hyperpolarizability, and UV-vis spectrum of the recently identified nano-sized tetrahedral cluster of Au have been investigated by using time-dependent density functional response theory. We have discovered that the Au cluster possesses remarkably large molecular second-order optical nonlinearity with the first hyperpolarizability (xyz) calculated to be 14.3 x 10 electrostatic unit (esu). The analysis of the low-energy absorption band suggests that the charge transfer from the edged gold atoms to the vertex ones plays the key role in nonlinear optical (NLO) response of Au.
Clustering XCO2 temporal change to assess CO2 exchanging strength of biosphere-atmosphere with GOSAT observations

NASA Astrophysics Data System (ADS)

He, Zhonghua; Lei, Liping; Bie, Nian; Yang, Shaoyuan; Wu, Changjiang; Zeng, Zhao-Cheng

2017-04-01

The temporal change of atmospheric carbon dioxide (CO2) concentration, which is closely related to local CO2 uptake and emission, including biospheric exchange and anthropogenic emission, is an important piece of information for identifying regions that act as carbon sources and sinks. Satellite observations of CO2 have long been used to detect changes in CO2 concentration. In this study, we used gridded data of column-averaged CO2 dry air mole fraction (XCO2) with a spatial resolution of 1 degree and a temporal resolution of 3 days, from 1 June 2009 to 31 May 2014 over the land area of 30° - 60° N, to cluster the temporal change characteristics of Greenhouse Gases Observing Satellite (GOSAT) XCO2 retrievals. The gridded data are derived using a gap-filling method based on spatio-temporal geostatistics. The clustering method is an adjusted K-means for time-series data containing gaps. The types and number of clusters are specified from the temporal characteristics of XCO2 using the optimal clustering parameters. The biospheric absorption and surface emission of atmospheric CO2 are discussed by analyzing the yearly increase and seasonal amplitude of XCO2 in each cluster, combined with correlation analysis against the vegetation index from the Moderate-resolution Imaging Spectroradiometer (MODIS) and fossil fuel CO2 emission data from the Open-source Data Inventory for Anthropogenic CO2 (ODIAC). Regions of strong or weak biosphere-atmosphere exchange, or of significant disturbance from anthropogenic activities, can be identified. In conclusion, gap-filled XCO2 from satellite observations can support analysis of atmospheric CO2, which results from coupled biosphere-atmosphere processes, through its spatio-temporal characteristics and its relationship with other remote sensing parameters, e.g. MODIS products related to biospheric photosynthesis or respiration.
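Illustrative sketch (not taken from the record above): the abstract does not say how its K-means was adjusted for gaps, so the version below is only one plausible reading. It assumes gaps are stored as `None`, that distances are averaged over the observed time steps only, and that every series has at least one observation. The toy values and the starting centroids are hypothetical.

```python
def masked_distance(series, centroid):
    """Mean squared difference over the time steps observed in the series."""
    pairs = [(s, c) for s, c in zip(series, centroid) if s is not None]
    return sum((s - c) ** 2 for s, c in pairs) / len(pairs)


def masked_kmeans(data, centroids, iterations=10):
    """K-means where gaps (None) are skipped in distances and centroid updates."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: nearest centroid using observed values only.
        labels = [
            min(range(len(centroids)), key=lambda k: masked_distance(row, centroids[k]))
            for row in data
        ]
        # Update step: average the observed values of each cluster at each time step.
        for k in range(len(centroids)):
            members = [row for row, lab in zip(data, labels) if lab == k]
            for t in range(len(centroids[k])):
                seen = [row[t] for row in members if row[t] is not None]
                if seen:  # keep the old value if no member was observed at time t
                    centroids[k][t] = sum(seen) / len(seen)
    return labels, centroids


# Hypothetical XCO2-like series (ppm) for four grid cells, with gaps.
data = [
    [390.0, 388.0, None, 391.0],
    [390.5, None, 389.0, 391.5],
    [395.0, 396.0, 397.0, None],
    [None, 396.5, 397.5, 398.0],
]
start = [[390.0, 388.0, 389.0, 391.0], [395.0, 396.0, 397.0, 398.0]]
labels, centers = masked_kmeans(data, start)
```

Skipping gaps instead of imputing them keeps the example short; with gap-filled input, as in the study, a standard K-means would apply directly.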
Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury.

PubMed

Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole

2010-06-09

Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability.
Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

PubMed Central

2010-01-01

Background Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability. PMID:20534130
Phylodynamic Analysis Reveals CRF01_AE Dissemination between Japan and Neighboring Asian Countries and the Role of Intravenous Drug Use in Transmission

PubMed Central

Shiino, Teiichiro; Hattori, Junko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

2014-01-01

Background One major circulating HIV-1 subtype in Southeast Asian countries is CRF01_AE, but little is known about its epidemiology in Japan. We conducted a molecular phylodynamic study of patients newly diagnosed with CRF01_AE from 2003 to 2010. Methods Plasma samples from patients registered in Japanese Drug Resistance HIV-1 Surveillance Network were analyzed for protease-reverse transcriptase sequences; all sequences underwent subtyping and phylogenetic analysis using distance-matrix-based, maximum likelihood and Bayesian coalescent Markov Chain Monte Carlo (MCMC) phylogenetic inferences. Transmission clusters were identified using interior branch test and depth-first searches for sub-tree partitions. Times of most recent common ancestor (tMRCAs) of significant clusters were estimated using Bayesian MCMC analysis. Results Among 3618 patients registered in our network, 243 were infected with CRF01_AE. The majority of individuals with CRF01_AE were Japanese, predominantly male, and reported heterosexual contact as their risk factor. We found 5 large clusters with ≥5 members and 25 small clusters consisting of pairs of individuals with highly related CRF01_AE strains. The earliest cluster showed a tMRCA of 1996, and consisted of individuals with their known risk as heterosexual contacts. The other four large clusters showed later tMRCAs between 2000 and 2002 with members including intravenous drug users (IVDU) and non-Japanese, but not men who have sex with men (MSM). In contrast, small clusters included a high frequency of individuals reporting MSM risk factors. Phylogenetic analysis also showed that some individuals were infected with HIV strains that spread in East and South-eastern Asian countries. Conclusions Introduction of CRF01_AE viruses into Japan is estimated to have occurred in the 1990s. CRF01_AE spread via heterosexual behavior, then among persons connected with non-Japanese, IVDU, and MSM.
Phylogenetic analysis demonstrated that some viral variants are largely restricted to Japan, while others have a broad geographic distribution. PMID:25025900
Direct on-strip analysis of size- and time-resolved aerosol impactor samples using laser induced fluorescence spectra excited at 263 and 351 nm.

PubMed

Wang, Chuji; Pan, Yong-Le; James, Deryck; Wetmore, Alan E; Redding, Brandon

2014-04-11

We report a novel atmospheric aerosol characterization technique, in which dual wavelength UV laser induced fluorescence (LIF) spectrometry marries an eight-stage rotating drum impactor (RDI), namely UV-LIF-RDI, to achieve size- and time-resolved analysis of aerosol particles on-strip. The UV-LIF-RDI technique measured LIF spectra via direct laser beam illumination onto the particles that were impacted on a RDI strip with a spatial resolution of 1.2 mm, equivalent to an averaged time resolution in the aerosol sampling of 3.6 h. Excited by a 263 nm or 351 nm laser, more than 2000 LIF spectra within a 3-week aerosol collection time period were obtained from the eight individual RDI strips that collected particles in eight different sizes ranging from 0.09 to 10 μm in Djibouti. Based on the known fluorescence database from atmospheric aerosols in the US, the LIF spectra obtained from the Djibouti aerosol samples were found to be dominated by fluorescence clusters 2, 5, and 8 (peaked at 330, 370, and 475 nm) when excited at 263 nm and by fluorescence clusters 1, 2, 5, and 6 (peaked at 390 and 460 nm) when excited at 351 nm. Size- and time-dependent variations of the fluorescence spectra revealed some size and time evolution behavior of organic and biological aerosols from the atmosphere in Djibouti. Moreover, this analytical technique could locate the possible sources and chemical compositions contributing to these fluorescence clusters. Advantages, limitations, and future developments of this new aerosol analysis technique are also discussed. Published by Elsevier B.V.
A Lagrangian analysis of cold cloud clusters and their life cycles with satellite observations

PubMed Central

Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong

2018-01-01

Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi-coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Niño. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics. PMID:29744257
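Illustrative sketch (not taken from the record above): the core idea of area-overlap tracking is to link a cluster in one image to the cluster in the previous image with which it shares the most pixels. The function name, the pixel-set representation, the `min_fraction` parameter, and the toy frames are assumptions for the example. The study's actual algorithm may treat splits, mergers, and overlap thresholds differently.

```python
def link_by_overlap(previous, current, min_fraction=0.0):
    """Map each current cluster id to the previous cluster id it overlaps most.

    previous, current: dicts of cluster id -> set of (row, col) pixels.
    A current cluster with no qualifying overlap maps to None (a new track).
    """
    links = {}
    for cur_id, cur_pixels in current.items():
        best_id, best_overlap = None, 0
        for prev_id, prev_pixels in previous.items():
            overlap = len(cur_pixels & prev_pixels)
            if overlap > best_overlap and overlap >= min_fraction * len(prev_pixels):
                best_id, best_overlap = prev_id, overlap
        links[cur_id] = best_id
    return links


# Hypothetical cold-cloud masks at two consecutive time steps.
frame_t0 = {"A": {(0, 0), (0, 1), (1, 0), (1, 1)}, "B": {(5, 5), (5, 6)}}
frame_t1 = {"X": {(0, 1), (1, 1), (1, 2)}, "Y": {(9, 9)}}
links = link_by_overlap(frame_t0, frame_t1)
```

Chaining these links frame by frame yields a trajectory per cluster, from which lifetime, size, and temperature histories can be read off.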
Descriptive epidemiology of typhoid fever during an epidemic in Harare, Zimbabwe, 2012.

PubMed

Polonsky, Jonathan A; Martínez-Pino, Isabel; Nackers, Fabienne; Chonzi, Prosper; Manangazira, Portia; Van Herp, Michel; Maes, Peter; Porten, Klaudia; Luquero, Francisco J

2014-01-01

Typhoid fever remains a significant public health problem in developing countries. In October 2011, a typhoid fever epidemic was declared in Harare, Zimbabwe - the fourth enteric infection epidemic since 2008. To orient control activities, we described the epidemiology and spatiotemporal clustering of the epidemic in Dzivaresekwa and Kuwadzana, the two most affected suburbs of Harare. A typhoid fever case-patient register was analysed to describe the epidemic. To explore clustering, we constructed a dataset comprising GPS coordinates of case-patient residences and randomly sampled residential locations (spatial controls). The scale and significance of clustering were explored with Ripley K functions. Cluster locations were determined by a random labelling technique and confirmed using Kulldorff's spatial scan statistic. We analysed data from 2570 confirmed and suspected case-patients, and found significant spatiotemporal clustering of typhoid fever in two non-overlapping areas, which appeared to be linked to environmental sources. Peak relative risk was more than six times greater than in areas lying outside the cluster ranges. Clusters were identified in similar geographical ranges by both random labelling and Kulldorff's spatial scan statistic. The spatial scale at which typhoid fever clustered was highly localised, with significant clustering at distances up to 4.5 km and peak levels at approximately 3.5 km. The epicentre of infection transmission shifted from one cluster to the other during the course of the epidemic. This study demonstrated highly localised clustering of typhoid fever during an epidemic in an urban African setting, and highlights the importance of spatiotemporal analysis for making timely decisions about targeting prevention and control activities and reinforcing treatment during epidemics. This approach should be integrated into existing surveillance systems to facilitate early detection of epidemics and identify their spatial range.
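Illustrative sketch (not taken from the record above): a naive estimate of Ripley's K function at a single distance, to show what the statistic measures. It omits the edge correction and the case-versus-control comparison that a real analysis such as the one described would need. The coordinates, the study-area size, and the function name are hypothetical.

```python
from math import hypot, pi


def ripley_k(points, radius, area):
    """Naive Ripley K estimate (no edge correction) at one distance."""
    n = len(points)
    close_pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and hypot(xi - xj, yi - yj) <= radius
    )
    # Ordered pairs within the radius, scaled by area / n^2.
    return area * close_pairs / (n * n)


# Hypothetical residence coordinates in km within a 10 km x 10 km study area.
clustered = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (8.0, 8.0), (8.1, 7.9), (8.2, 8.2)]
k_value = ripley_k(clustered, radius=1.0, area=100.0)
csr_value = pi * 1.0 ** 2  # expected K under complete spatial randomness
```

A K value well above pi * r^2 at a given radius suggests clustering at that scale; evaluating it over a range of radii shows the distances at which clustering is strongest.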
Descriptive Epidemiology of Typhoid Fever during an Epidemic in Harare, Zimbabwe, 2012

PubMed Central

Polonsky, Jonathan A.; Martínez-Pino, Isabel; Nackers, Fabienne; Chonzi, Prosper; Manangazira, Portia; Van Herp, Michel; Maes, Peter; Porten, Klaudia; Luquero, Francisco J.

2014-01-01

Background Typhoid fever remains a significant public health problem in developing countries. In October 2011, a typhoid fever epidemic was declared in Harare, Zimbabwe - the fourth enteric infection epidemic since 2008. To orient control activities, we described the epidemiology and spatiotemporal clustering of the epidemic in Dzivaresekwa and Kuwadzana, the two most affected suburbs of Harare. Methods A typhoid fever case-patient register was analysed to describe the epidemic. To explore clustering, we constructed a dataset comprising GPS coordinates of case-patient residences and randomly sampled residential locations (spatial controls). The scale and significance of clustering were explored with Ripley K functions. Cluster locations were determined by a random labelling technique and confirmed using Kulldorff's spatial scan statistic. Principal Findings We analysed data from 2570 confirmed and suspected case-patients, and found significant spatiotemporal clustering of typhoid fever in two non-overlapping areas, which appeared to be linked to environmental sources. Peak relative risk was more than six times greater than in areas lying outside the cluster ranges. Clusters were identified in similar geographical ranges by both random labelling and Kulldorff's spatial scan statistic. The spatial scale at which typhoid fever clustered was highly localised, with significant clustering at distances up to 4.5 km and peak levels at approximately 3.5 km. The epicentre of infection transmission shifted from one cluster to the other during the course of the epidemic. Conclusions This study demonstrated highly localised clustering of typhoid fever during an epidemic in an urban African setting, and highlights the importance of spatiotemporal analysis for making timely decisions about targeting prevention and control activities and reinforcing treatment during epidemics.
This approach should be integrated into existing surveillance systems to facilitate early detection of epidemics and identify their spatial range. PMID:25486292
A Lagrangian analysis of cold cloud clusters and their life cycles with satellite observations.

PubMed

Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong

2016-10-16

Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi-coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Niño. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics.
A Lagrangian Analysis of Cold Cloud Clusters and Their Life Cycles With Satellite Observations

NASA Technical Reports Server (NTRS)

Esmaili, Rebekah Bradley; Tian, Yudong; Vila, Daniel Alejandro; Kim, Kyu-Myong

2016-01-01

Cloud movement and evolution signify the complex water and energy transport in the atmosphere-ocean-land system. Detecting, clustering, and tracking clouds as semi-coherent cluster objects enables study of their evolution which can complement climate model simulations and enhance satellite retrieval algorithms, where there are large gaps between overpasses. Using an area-overlap cluster tracking algorithm, in this study we examine the trajectories, horizontal extent, and brightness temperature variations of millions of individual cloud clusters over their lifespan, from infrared satellite observations at 30-minute, 4-km resolution, for a period of 11 years. We found that the majority of cold clouds were both small and short-lived and that their frequency and location are influenced by El Niño. More importantly, this large sample of individually tracked clouds shows their horizontal size and temperature evolution. Longer lived clusters tended to achieve their temperature and size maturity milestones at different times, while these stages often occurred simultaneously in shorter lived clusters. On average, clusters with this lag also exhibited a greater rainfall contribution than those where minimum temperature and maximum size stages occurred simultaneously. Furthermore, by examining the diurnal cycle of cluster development over Africa and the Indian subcontinent, we observed differences in the local timing of the maximum occurrence at different life cycle stages. Over land there was a strong diurnal peak in the afternoon while over the ocean there was a semi-diurnal peak composed of longer-lived clusters in the early morning hours and shorter-lived clusters in the afternoon. Building on regional specific work, this study provides a long-term, high-resolution, and global survey of object-based cloud characteristics.
Seismic facies analysis based on self-organizing map and empirical mode decomposition

NASA Astrophysics Data System (ADS)

Du, Hao-kun; Cao, Jun-xing; Xue, Ya-juan; Wang, Xing-jian

2015-01-01

Seismic facies analysis plays an important role in seismic interpretation and reservoir model building by offering an effective way to identify changes in geofacies between wells. The selection of input seismic attributes and their time window has an obvious effect on the validity of classification and requires iterative experimentation and prior knowledge. In general, cluster analysis is sensitive to noise when the waveform serves as the input data, especially with a narrow window. To overcome this limitation, the Empirical Mode Decomposition (EMD) method is introduced into waveform classification based on the self-organizing map (SOM). We first de-noise the seismic data using EMD and then cluster the data using a 1D grid SOM. The main advantages of this method are resolution enhancement and noise reduction. 3D seismic data from the western Sichuan basin, China, are collected for validation. The application results show that seismic facies analysis can be improved and can better support interpretation. Its strong tolerance for noise makes the proposed method a better seismic facies analysis tool than the classical 1D grid SOM method, especially for waveform clustering with a narrow window.
The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience.

PubMed

Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R; Bock, Davi D; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R Clay; Smith, Stephen J; Szalay, Alexander S; Vogelstein, Joshua T; Vogelstein, R Jacob

2013-01-01

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
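Illustrative sketch (not taken from the record above): the abstract says only that data are distributed "by partitioning a spatial index". The example below assumes a Morton (Z-order) curve as that index and contiguous key ranges as the partitions. The function names, the `bits` parameter, and the node count are hypothetical.

```python
def morton3(x, y, z, bits=10):
    """Interleave the bits of three non-negative integers into one Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (3 * b)
        key |= ((y >> b) & 1) << (3 * b + 1)
        key |= ((z >> b) & 1) << (3 * b + 2)
    return key


def node_for_cuboid(x, y, z, num_nodes, bits=10):
    """Assign a cuboid to a node by splitting the key range into contiguous slices."""
    total_keys = 1 << (3 * bits)
    return morton3(x, y, z, bits) * num_nodes // total_keys


# Cuboid grid coordinates on a hypothetical 4 x 4 x 4 grid (bits=2), 4 nodes.
first = node_for_cuboid(0, 0, 0, num_nodes=4, bits=2)
last = node_for_cuboid(3, 3, 3, num_nodes=4, bits=2)
```

Because nearby cuboids tend to have nearby keys on a space-filling curve, a contiguous key range keeps most spatial neighbours on the same node, which is the property a spatial partitioning scheme is after.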
Characteristics of voxel prediction power in full-brain Granger causality analysis of fMRI data

NASA Astrophysics Data System (ADS)

Garg, Rahul; Cecchi, Guillermo A.; Rao, A. Ravishankar

2011-03-01

Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.
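Illustrative sketch (not taken from the record above): a lag-1 Granger-style model in which one time series is predicted from the previous values of all series, with an L1 penalty so that most coefficients shrink to zero. The abstract names only "sparse regression"; the lasso, the coordinate-descent solver, the single lag, the penalty value, and the synthetic data are all assumptions.

```python
import random


def soft_threshold(value, penalty):
    """Shrink a value toward zero by the penalty; values inside the band become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_lag1(series, target, penalty=0.1, sweeps=100):
    """Predict series[target][t] from all series at t-1 with an L1 penalty."""
    n = len(series[0]) - 1
    p = len(series)
    y = series[target][1:]
    columns = [s[:-1] for s in series]  # lag-1 predictors
    weights = [0.0] * p
    residual = list(y)                  # y - X w, with w = 0 initially
    for _ in range(sweeps):
        for j in range(p):
            col = columns[j]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue
            # Covariance of column j with the residual that excludes its own term.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, penalty) / scale
            delta = new_w - weights[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights


# Synthetic "voxels": the follower copies the driver with a one-step delay.
rng = random.Random(0)
driver = [rng.gauss(0, 1) for _ in range(300)]
noise = [rng.gauss(0, 1) for _ in range(300)]
follower = [0.0] + [0.9 * driver[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 300)]
weights = lasso_lag1([driver, noise, follower], target=2)
```

In this reading, a voxel's "prediction power" would be summarised from the nonzero weights it receives across all targets; the penalty trades sparsity against shrinkage of the true coefficients.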
The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

PubMed Central

Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R.; Bock, Davi D.; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C.; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R. Clay; Smith, Stephen J.; Szalay, Alexander S.; Vogelstein, Joshua T.; Vogelstein, R. Jacob

2013-01-01

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization. PMID:24401992
Inductive Approaches to Improving Diagnosis and Design for Diagnosability

NASA Technical Reports Server (NTRS)

Fisher, Douglas H. (Principal Investigator)

1995-01-01

The first research area under this grant addresses the problem of classifying time series according to their morphological features in the time domain. A supervised learning system called CALCHAS, which induces a classification procedure for signatures from preclassified examples, was developed. For each of several signature classes, the system infers a model that captures the class's morphological features using Bayesian model induction and the minimum message length approach to assign priors. After induction, a time series (signature) is classified in one of the classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. A second area of research assumes two sources of information about a system: a model or domain theory that encodes aspects of the system under study and data from actual system operations over time. A model, when it exists, represents strong prior expectations about how a system will perform. Our work with a diagnostic model of the RCS (Reaction Control System) of the Space Shuttle motivated the development of SIG, a system which combines information from a model (or domain theory) and data. As it tracks RCS behavior, the model computes quantitative and qualitative values. Induction is then performed over the data represented by both the 'raw' features and the model-computed high-level features. Finally, work on clustering for operating mode discovery motivated some important extensions to the clustering strategy we had used. One modification appends an iterative optimization technique onto the clustering system; this optimization strategy appears to be novel in the clustering literature. A second modification improves the noise tolerance of the clustering system. 
In particular, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus making post-clustering analysis easier.
Effects of Slow-Stroke Back Massage on Symptom Cluster in Adult Patients With Acute Leukemia: Supportive Care in Cancer Nursing.

PubMed

Miladinia, Mojtaba; Baraz, Shahram; Shariati, Abdolali; Malehi, Amal Saki

Patients with acute leukemia usually experience pain, fatigue, and sleep disorders, which affect their quality of life. Massage therapy, as a nondrug approach, can be useful in controlling such problems. However, very few studies have been conducted on the effects of massage therapy on the complications of leukemia. The aim of this study was to examine the effects of slow-stroke back massage (SSBM) on the symptom cluster in acute leukemia adult patients undergoing chemotherapy. In this randomized controlled trial, 60 patients with acute leukemia were allocated randomly to either the intervention or control group. The intervention group received SSBM 3 times a week (every other day for 10 minutes) for 4 weeks. The pain, fatigue, and sleep disorder intensities were measured using the numeric rating scale. The sleep quality was measured using the Pittsburgh Sleep Quality Index. The χ² test, t test, and repeated-measures analysis of variance were used for data analysis. Results showed that the SSBM intervention significantly reduced the progressive sleep disorder, pain, and fatigue, and improved sleep quality over time. Slow-stroke back massage, as a simple, noninvasive, and cost-effective approach, along with routine nursing care, can be used to improve the symptom cluster of pain, fatigue, and sleep disorders in leukemia patients. Oncology nurses can increase their knowledge regarding this symptom cluster and work to diminish the cluster components by using SSBM in adult leukemia patients.

Using exploratory data analysis to identify and predict patterns of human Lyme disease case clustering within a multistate region, 2010-2014.

PubMed

Hendricks, Brian; Mark-Carew, Miguella

2017-02-01

Lyme disease is the most commonly reported vectorborne disease in the United States. The objective of our study was to identify patterns of Lyme disease reporting after multistate inclusion to mitigate potential border effects. County-level human Lyme disease surveillance data were obtained from Kentucky, Maryland, Ohio, Pennsylvania, Virginia, and West Virginia state health departments. Rate smoothing and Local Moran's I were performed to identify clusters of reporting activity and identify spatial outliers. A logistic generalized estimating equation was fitted to identify significant associations in disease clustering over time. Resulting analyses identified statistically significant (P=0.05) clusters of high reporting activity and trends over time. High reporting activity aggregated near border counties in high incidence states, while low reporting aggregated near shared county borders in non-high incidence states. Findings highlight the need for exploratory surveillance approaches to describe the extent to which state level reporting affects accurate estimation of Lyme disease progression. Copyright © 2017 Elsevier Ltd. All rights reserved.
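As an illustration of the Local Moran's I statistic mentioned in this abstract, the following minimal sketch computes it for a handful of areas. The rates, the adjacency structure, and the function name `local_morans_i` are hypothetical and are not taken from the study; weights are assumed to be row-standardized, and the permutation-based significance testing normally paired with the statistic is omitted.

```python
from statistics import mean


def local_morans_i(values, neighbors):
    """Return Local Moran's I for every area.

    values:    list of rates, one per area.
    neighbors: dict mapping an area index to the indices of its neighbors.
    Weights are row-standardized: each neighbor of area i gets weight 1/len(neighbors[i]).
    """
    n = len(values)
    mu = mean(values)
    z = [v - mu for v in values]            # deviations from the overall mean
    m2 = sum(d * d for d in z) / n          # second moment used for scaling
    scores = []
    for i in range(n):
        nbrs = neighbors.get(i, [])
        # Spatial lag: average deviation of the neighbors (0 if the area has none).
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        scores.append(z[i] * lag / m2)
    return scores


# Hypothetical rates for five counties arranged in a line: 0-1-2-3-4.
rates = [10.0, 12.0, 11.0, 2.0, 1.0]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = local_morans_i(rates, adjacency)
```

Positive scores mark areas whose neighbors have similar values (high-high or low-low clusters); negative scores mark potential spatial outliers, such as the middle county in this toy example.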
Risk profiles for poor treatment response to internet-delivered CBT in people with social anxiety disorder.

PubMed

Tillfors, Maria; Furmark, Tomas; Carlbring, Per; Andersson, Gerhard

2015-06-01

In social anxiety disorder (SAD) co-morbid depressive symptoms as well as avoidance behaviors have been shown to predict insufficient treatment response. It is likely that subgroups of individuals with different profiles of risk factors for poor treatment response exist. This study aimed to identify subgroups of social avoidance and depressive symptoms in a clinical sample (N = 167) with SAD before and after guided internet-delivered CBT, and to compare these groups on diagnostic status and social anxiety. We further examined individual movement between subgroups over time. Using cluster analysis we identified four subgroups, including a high-problem cluster at both time-points. Individuals in this cluster showed less remission after treatment, exhibited higher levels of social anxiety at both assessments, and typically remained in the high-problem cluster after treatment. Thus, in individuals with SAD, high levels of social avoidance and depressive symptoms constitute a risk profile for poor treatment response. Copyright © 2015 Elsevier Ltd. All rights reserved.
Including foreshocks and aftershocks in time-independent probabilistic seismic hazard analyses

USGS Publications Warehouse

Boyd, Oliver S.

2012-01-01

Time‐independent probabilistic seismic‐hazard analysis treats each source as being temporally and spatially independent; hence foreshocks and aftershocks, which are both spatially and temporally dependent on the mainshock, are removed from earthquake catalogs. Yet, intuitively, these earthquakes should be considered part of the seismic hazard, capable of producing damaging ground motions. In this study, I consider the mainshock and its dependents as a time‐independent cluster, each cluster being temporally and spatially independent from any other. The cluster has a recurrence time of the mainshock; and, by considering the earthquakes in the cluster as a union of events, dependent events have an opportunity to contribute to seismic ground motions and hazard. Based on the methods of the U.S. Geological Survey for a high‐hazard site, the inclusion of dependent events causes ground motions that are exceeded at probability levels of engineering interest to increase by about 10% but could be as high as 20% if variations in aftershock productivity can be accounted for reliably.
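One way to read the "union of events" idea in this abstract is sketched below: a cluster exceeds a ground-motion level if at least one of its earthquakes does. The independence of exceedances within a cluster, the per-event probabilities, and both function names are assumptions made for illustration only; they are not the computation used in the study.

```python
import math


def cluster_exceedance_probability(event_probs):
    """P(at least one event in the cluster exceeds the ground-motion level).

    Assumes, purely for illustration, that exceedances by individual events
    are independent once the cluster occurs.
    """
    p_none = 1.0
    for p in event_probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


def poisson_exceedance(rate_per_year, p_exceed, years):
    """Probability of at least one exceedance in `years`, with clusters
    arriving as a Poisson process at the mainshock recurrence rate."""
    return 1.0 - math.exp(-rate_per_year * p_exceed * years)


# Hypothetical exceedance probabilities: mainshock, then two dependent events.
mainshock_only = cluster_exceedance_probability([0.30])
with_dependents = cluster_exceedance_probability([0.30, 0.05, 0.02])

# Hypothetical recurrence rate of 1 cluster per 500 years, 50-year exposure.
hazard_mainshock = poisson_exceedance(1 / 500, mainshock_only, 50)
hazard_cluster = poisson_exceedance(1 / 500, with_dependents, 50)
```

Under these assumptions, adding dependent events can only raise the exceedance probability, which is consistent in direction with the increase the abstract reports.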
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.

PubMed

Tan, Qihua; Thomassen, Mads; Burton, Mark; Mose, Kristian Fredløv; Andersen, Klaus Ejner; Hjelmborg, Jacob; Kruse, Torben

2017-06-06

Modeling complex time-course patterns is a challenging issue in microarray studies due to complex gene expression patterns in response to the time-course experiment. We introduce the generalized correlation coefficient and propose a combinatory approach for detecting, testing and clustering the heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature of the generalized correlation analysis could be a useful and efficient tool for analyzing microarray time-course data and for exploring the complex relationships in the omics data for studying their association with disease and health.
Spatio-Temporal Clustering of Monitoring Network

NASA Astrophysics Data System (ADS)

Hussain, I.; Pilz, J.

2009-04-01

Seasonal variation in Pakistan differs widely between locations. Some areas are in deserts and remain very hot and waterless; coastal areas along the Arabian Sea, for example, have a very warm season and little rainfall. Other areas are mountainous, with very low temperatures and heavy rainfall, for instance the Karakoram range. The most important variables that have an impact on the climate are temperature, precipitation, humidity, wind speed and elevation. Furthermore, it is hard to find homogeneous regions in Pakistan with respect to climate variation. Identification of homogeneous regions in Pakistan can be useful in many respects. It can be helpful for prediction of the climate in the sub-regions and for optimizing the number of monitoring sites. In the earlier literature no one tried to identify homogeneous regions of Pakistan with respect to climate variation. There are only a few papers about spatio-temporal clustering of monitoring networks. Steinhaus (1956) presented the well-known K-means clustering method. It identifies a predefined number of clusters by iteratively assigning observations to the nearest cluster centroid and updating the centroids. Castro et al. (1997) developed a genetic heuristic algorithm to solve medoid-based clustering. Their method is based on genetic recombination upon random assorting recombination. The suggested method is appropriate for clustering attributes which have genetic characteristics. Sap and Awan (2005) presented a robust weighted kernel K-means algorithm incorporating spatial constraints for clustering climate data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data. Soltani and Modarres (2006) used hierarchical and divisive cluster analysis to categorize patterns of rainfall in Iran. 
They only considered rainfall at twenty-eight monitoring sites and concluded that eight clusters existed. Soltani and Modarres (2006) classified the sites by using only the average rainfall of the sites; they did not consider time replications and spatial coordinates. Kerby et al. (2007) proposed a spatial clustering method based on likelihood. They took account of the geographic locations through the variance-covariance matrix. Their proposed method works like hierarchical clustering methods. Moreover, it is inappropriate for time-replicated data and does not perform well for a large number of sites. Tuia et al. (2008) used scan statistics for identifying spatio-temporal clusters of fire sequences in the Tuscany region in Italy. The scan statistics clustering method was developed by Kulldorff et al. (1997) to detect spatio-temporal clusters in epidemiology and assess their significance. The proposed scan statistics method is used only for univariate discrete random variables. In this paper we make use of a very simple approach for spatio-temporal clustering which can create separable and homogeneous clusters. Most of the clustering methods are based on Euclidean distances. It is well known that geographic coordinates are spherical coordinates and that estimating Euclidean distances from spherical coordinates is inappropriate. As a transformation from geographic coordinates to rectangular (D-plane) coordinates we use the Lambert projection method. The partitioning around medoids clustering method is applied to the data, including the D-plane coordinates. Ordinary kriging is used as a validity measure for the precipitation data. The kriging results for clusters are more accurate and have less variation compared to the complete monitoring network precipitation data. References: Castro, V. E. and Murray, A. T. (1997). Spatial Clustering with Data Mining with Genetic Algorithms. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.8573 Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley Series in Probability and Mathematical Statistics, New York. Kulldorff, M. (1997). A spatial scan statistic. Commun. Stat.-Theor. Meth. 26(6), 1481-1496. Kerby, A., Marx, D., Samal, A. and Adamchuck, V. (2007). Spatial Clustering Using the Likelihood Function. Seventh IEEE International Conference on Data Mining - Workshops. Steinhaus, H. (1956). Sur la division des corps matériels en parties. Bull. Acad. Polon. Sci., Cl. III, vol. IV: 801-804. Snyder, J. P. (1987). Map Projection: A Working Manual. U. S. Geological Survey Professional Paper 1395. Washington, DC: U. S. Government Printing Office, pp. 104-110. Sap, M. N. and Awan, A. M. (2005). Finding Spatio-Temporal Patterns in Climate Data Using Clustering. Proceedings of the International Conference on Cyberworlds (CW'05). Soltani, S. and Modarres, R. (2006). Classification of Spatio-Temporal Pattern of Rainfall in Iran: Using Hierarchical and Divisive Cluster Analysis. Journal of Spatial Hydrology Vol. 6, No. 2. Tuia, D., Ratle, F., Lasaponara, R., Telesca, L. and Kanevski, M. (2008). Scan Statistics Analysis for Forest Fire Clusters. Commun. in Nonlinear Science and Numerical Simulation 13, 1689-1694.
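The workflow described in this abstract (project geographic coordinates to a plane, then cluster around medoids) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses the spherical form of the Lambert conformal conic projection from Snyder (1987), hypothetical standard parallels and origin, made-up site coordinates, and a simplified alternating k-medoids loop rather than the full swap-based partitioning around medoids procedure. It clusters on location only, whereas the study also includes climate variables.

```python
import math

R_KM = 6371.0  # mean Earth radius (spherical approximation)


def lambert_conformal_conic(lat, lon, lat1, lat2, lat0, lon0):
    """Spherical Lambert conformal conic projection; degrees in, km out.

    lat1, lat2 are the standard parallels; (lat0, lon0) is the projection origin.
    """
    def t(p):
        return math.tan(math.pi / 4 + p / 2)

    phi, lam = math.radians(lat), math.radians(lon)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)
    n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = R_KM * f / t(phi) ** n
    rho0 = R_KM * f / t(phi0) ** n
    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


def k_medoids(points, medoids, max_iter=50):
    """Simplified alternating k-medoids (not the full PAM swap search).

    points:  list of (x, y) tuples; medoids: initial medoid indices (distinct points).
    """
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(len(medoids)),
                      key=lambda k: math.dist(p, points[medoids[k]]))
                  for p in points]
        # Update step: the new medoid minimizes total distance to cluster members.
        updated = []
        for k in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            updated.append(min(members, key=lambda i: sum(
                math.dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return labels, medoids


# Hypothetical monitoring sites (lat, lon): three southern, three northern.
sites = [(24.9, 67.0), (25.1, 66.7), (25.4, 68.4),
         (35.9, 74.3), (36.3, 74.7), (35.3, 75.6)]
# Hypothetical projection parameters: parallels 26N and 34N, origin 30N 70E.
plane = [lambert_conformal_conic(la, lo, 26.0, 34.0, 30.0, 70.0) for la, lo in sites]
labels, medoids = k_medoids(plane, medoids=[0, 3])
origin_xy = lambert_conformal_conic(30.0, 70.0, 26.0, 34.0, 30.0, 70.0)
```

Working in projected kilometres means the Euclidean distances used by the clustering step approximate ground distances, which is the motivation the abstract gives for projecting first.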
Spatial Analysis of Great Lakes Regional Icing Cloud Liquid Water Content

NASA Technical Reports Server (NTRS)

Ryerson, Charles C.; Koenig, George G.; Melloh, Rae A.; Meese, Debra A.; Reehorst, Andrew L.; Miller, Dean R.

2003-01-01

Clustering of cloud microphysical conditions, such as liquid water content (LWC) and drop size, can affect the rate and shape of ice accretion and the airworthiness of aircraft. Clustering may also degrade the accuracy of cloud LWC measurements from radars and microwave radiometers being developed by the government for remotely mapping icing conditions ahead of aircraft in flight. This paper evaluates spatial clustering of LWC in icing clouds using measurements collected during NASA research flights in the Great Lakes region. We used graphical and analytical approaches to describe clustering. The analytical approach involves determining the average size of clusters and computing a clustering intensity parameter. We analyzed flight data composed of 1-s-frequency LWC measurements for 12 periods ranging from 17.4 minutes (73 km) to 45.3 minutes (190 km) in duration. Graphically some flight segments showed evidence of consistency with regard to clustering patterns. Cluster intensity varied from 0.06, indicating little clustering, to a high of 2.42. Cluster lengths ranged from 0.1 minutes (0.6 km) to 4.1 minutes (17.3 km). Additional analyses will allow us to determine if clustering climatologies can be developed to characterize cluster conditions by region, time period, or weather condition.
The clustering of diet, physical activity and sedentary behavior in children and adolescents: a review.

PubMed

Leech, Rebecca M; McNaughton, Sarah A; Timperio, Anna

2014-01-22

Diet, physical activity (PA) and sedentary behavior are important, yet modifiable, determinants of obesity. Recent research into the clustering of these behaviors suggests that children and adolescents have multiple obesogenic risk factors. This paper reviews studies using empirical, data-driven methodologies, such as cluster analysis (CA) and latent class analysis (LCA), to identify clustering patterns of diet, PA and sedentary behavior among children or adolescents and their associations with socio-demographic indicators, and overweight and obesity. A literature search of electronic databases was undertaken to identify studies which have used data-driven methodologies to investigate the clustering of diet, PA and sedentary behavior among children and adolescents aged 5-18 years old. Eighteen studies (62% of potential studies) were identified that met the inclusion criteria, of which eight examined the clustering of PA and sedentary behavior and eight examined diet, PA and sedentary behavior. Studies were mostly cross-sectional and conducted in older children and adolescents (≥ 9 years). Findings from the review suggest that obesogenic cluster patterns are complex with a mixed PA/sedentary behavior cluster observed most frequently, but healthy and unhealthy patterning of all three behaviors was also reported. Cluster membership was found to differ according to age, gender and socio-economic status (SES). The tendency for older children/adolescents, particularly females, to comprise clusters defined by low PA was the most robust finding. Findings to support an association between obesogenic cluster patterns and overweight and obesity were inconclusive, with longitudinal research in this area limited. Diet, PA and sedentary behavior cluster together in complex ways that are not well understood. Further research, particularly in younger children, is needed to understand how cluster membership differs according to socio-demographic profile. 
Longitudinal research is also essential to establish how different cluster patterns track over time and their influence on the development of overweight and obesity.
The clustering of diet, physical activity and sedentary behavior in children and adolescents: a review

PubMed Central

2014-01-01

Diet, physical activity (PA) and sedentary behavior are important, yet modifiable, determinants of obesity. Recent research into the clustering of these behaviors suggests that children and adolescents have multiple obesogenic risk factors. This paper reviews studies using empirical, data-driven methodologies, such as cluster analysis (CA) and latent class analysis (LCA), to identify clustering patterns of diet, PA and sedentary behavior among children or adolescents and their associations with socio-demographic indicators, and overweight and obesity. A literature search of electronic databases was undertaken to identify studies which have used data-driven methodologies to investigate the clustering of diet, PA and sedentary behavior among children and adolescents aged 5–18 years old. Eighteen studies (62% of potential studies) were identified that met the inclusion criteria, of which eight examined the clustering of PA and sedentary behavior and eight examined diet, PA and sedentary behavior. Studies were mostly cross-sectional and conducted in older children and adolescents (≥9 years). Findings from the review suggest that obesogenic cluster patterns are complex with a mixed PA/sedentary behavior cluster observed most frequently, but healthy and unhealthy patterning of all three behaviors was also reported. Cluster membership was found to differ according to age, gender and socio-economic status (SES). The tendency for older children/adolescents, particularly females, to comprise clusters defined by low PA was the most robust finding. Findings to support an association between obesogenic cluster patterns and overweight and obesity were inconclusive, with longitudinal research in this area limited. Diet, PA and sedentary behavior cluster together in complex ways that are not well understood. Further research, particularly in younger children, is needed to understand how cluster membership differs according to socio-demographic profile. 
Longitudinal research is also essential to establish how different cluster patterns track over time and their influence on the development of overweight and obesity. PMID:24450617
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.

PubMed

Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue

2018-05-02

Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed is transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely, clustering and classification, we compared SSAW against the state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These results make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
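The pipeline outlined in this abstract (k-mers, complex-number encoding, stationary wavelet transform, feature vector) might be sketched as below. Everything specific here is an assumption for illustration: the nucleotide-to-complex mapping, the position-weighted k-mer encoding, the single-level undecimated Haar transform with periodic extension, and the two summary features are hypothetical stand-ins and should not be read as the SSAW definitions.

```python
import math

# Hypothetical mapping of nucleotides onto the complex unit circle.
NUCLEOTIDE = {"A": 1 + 0j, "C": 1j, "G": -1 + 0j, "T": -1j}


def kmers(seq, k):
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_to_complex(kmer):
    """Hypothetical encoding: position-weighted sum of nucleotide values."""
    return sum(NUCLEOTIDE[c] / (2 ** i) for i, c in enumerate(kmer))


def haar_swt_level1(signal):
    """One level of an undecimated (stationary) Haar transform.

    No downsampling is applied, so both outputs keep the input length;
    the signal is extended periodically at the right edge.
    """
    n = len(signal)
    s = math.sqrt(2)
    approx = [(signal[i] + signal[(i + 1) % n]) / s for i in range(n)]
    detail = [(signal[i] - signal[(i + 1) % n]) / s for i in range(n)]
    return approx, detail


def features(seq, k=3):
    """Fixed-length feature vector: mean magnitude of each coefficient band."""
    signal = [kmer_to_complex(m) for m in kmers(seq, k)]
    approx, detail = haar_swt_level1(signal)
    return [sum(abs(c) for c in approx) / len(approx),
            sum(abs(c) for c in detail) / len(detail)]


vec_a = features("ACGTACGTTGCA")
vec_b = features("AAAAAAAAAAAA")
```

Because the resulting vectors have the same length regardless of sequence length, they can be passed to any standard clustering or classification routine, which is the property the abstract emphasizes.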
Space-time patterns of Campylobacter spp. colonization in broiler flocks, 2002-2006.

PubMed

Jonsson, M E; Norström, M; Sandberg, M; Ersbøll, A K; Hofshagen, M

2010-09-01

This study was performed to investigate space-time patterns of Campylobacter spp. colonization in broiler flocks in Norway. Data on the Campylobacter spp. status at the time of slaughter of 16 054 broiler flocks from 580 farms between 2002 and 2006 were included in the study. Spatial relative risk maps together with maps of space-time clustering were generated, the latter by using spatial scan statistics. These maps identified the same areas almost every year where there was a higher risk for a broiler flock to test positive for Campylobacter spp. during the summer months. A modified K-function analysis showed significant clustering at distances between 2.5 and 4 km within different years. The identification of geographical areas with higher risk for Campylobacter spp. colonization in broilers indicates that there are risk factors associated with Campylobacter spp. colonization in broiler flocks varying with region and time, e.g. climate, landscape or geography. These need to be further explored. The results also showed clustering at shorter distances, indicating that there are risk factors for Campylobacter spp. acting on a narrower scale as well.
Investigating the usefulness of a cluster-based trend analysis to detect visual field progression in patients with open-angle glaucoma.

PubMed

Aoki, Shuichiro; Murata, Hiroshi; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Hirasawa, Kazunori; Shoji, Nobuyuki; Asaoka, Ryo

2017-12-01

To investigate the usefulness of the Octopus (Haag-Streit) EyeSuite's cluster trend analysis in glaucoma. Ten visual fields (VFs) with the Humphrey Field Analyzer (Carl Zeiss Meditec), spanning 7.7 years on average, were obtained from 728 eyes of 475 primary open angle glaucoma patients. Mean total deviation (mTD) trend analysis and EyeSuite's cluster trend analysis were performed on various series of VFs (from 1st to 10th: VF1-10 to 6th to 10th: VF6-10). The results of the cluster-based trend analysis, based on different lengths of VF series, were compared against mTD trend analysis. Cluster-based trend analysis and mTD trend analysis results were significantly associated in all clusters and with all lengths of VF series. Between 21.2% and 45.9% (depending on VF series length and location) of clusters were deemed to progress when the mTD trend analysis suggested no progression. On the other hand, 4.8% of eyes were observed to progress using the mTD trend analysis when cluster trend analysis suggested no progression in any two (or more) clusters. Whole field trend analysis can miss local VF progression. Cluster trend analysis appears as robust as mTD trend analysis and useful to assess both sectorial and whole field progression. Cluster-based trend analyses, in particular the definition of two or more progressing clusters, may help clinicians to detect glaucomatous progression in a timelier manner than using a whole field trend analysis, without significantly compromising specificity. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
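The contrast this abstract draws between whole-field and cluster-based trend analysis can be illustrated with a small regression sketch. The cluster names, deviation values, and the equal-weight averaging of clusters into a whole-field mean are hypothetical simplifications; the significance testing used by clinical software is not reproduced.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


# Hypothetical follow-up: five yearly visual fields, mean deviation (dB) per cluster.
years = [0, 1, 2, 3, 4]
cluster_td = {
    "superior": [-2.0, -3.0, -4.0, -5.0, -6.0],   # steadily worsening
    "inferior": [-1.0, -1.0, -1.0, -1.0, -1.0],   # stable
    "central": [-0.5, -0.5, -0.5, -0.5, -0.5],    # stable
}

# Trend per cluster (dB/year).
cluster_slopes = {name: ols_slope(years, td) for name, td in cluster_td.items()}

# Whole-field trend, here approximated as the equal-weight mean of the clusters.
mean_td = [sum(td[i] for td in cluster_td.values()) / len(cluster_td)
           for i in range(len(years))]
whole_field_slope = ols_slope(years, mean_td)
```

In this toy series the superior cluster declines by 1 dB per year while the whole-field slope is only a third of that, which shows how averaging over the field can dilute a local change.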
Constructing storyboards based on hierarchical clustering analysis

NASA Astrophysics Data System (ADS)

Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

2005-07-01

There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
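A rough sketch of the idea in this abstract follows: frames are merged bottom-up until the requested number of clusters remains, and one representative frame is kept per cluster. The centroid-distance merge rule, the choice of the frame nearest each centroid, the function name `storyboard`, and the two-dimensional feature vectors are all assumptions for illustration; the paper derives its features from wavelet coefficients.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def storyboard(features, n_keyframes):
    """Pick n_keyframes frame indices by agglomerative clustering.

    features: one feature vector per frame, computed once and reused.
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_keyframes:
        cents = [centroid([features[i] for i in c]) for c in clusters]
        # Merge the pair of clusters whose centroids are closest.
        a, b = min(((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
                   key=lambda xy: math.dist(cents[xy[0]], cents[xy[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    keyframes = []
    for c in clusters:
        cen = centroid([features[i] for i in c])
        # The keyframe is the member frame closest to its cluster centroid.
        keyframes.append(min(c, key=lambda i: math.dist(features[i], cen)))
    return sorted(keyframes)


# Hypothetical per-frame feature vectors: two three-frame shots and a single-frame shot.
frame_features = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
                  [50.0, 50.0], [51.0, 50.0], [52.0, 50.0],
                  [90.0, 0.0]]
keyframes = storyboard(frame_features, n_keyframes=3)
```

Since the feature vectors are computed once, asking for a different number of keyframes only repeats the cheap clustering step, not the parsing of the video.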
Diary Data Subjected to Cluster Analysis of Intake/Output/Void Habits with Resulting Clusters Compared by Continence Status, Age, Race

PubMed Central

Miller, Janis M; Guo, Ying; Rodseth, Sarah Becker

2011-01-01

Background: Data that incorporate the full complexity of healthy beverage intake and voiding frequency do not exist; therefore, clinicians reviewing bladder habits or voiding diaries for continence care must rely on expert opinion recommendations. Objective: To use data-driven cluster analyses to reduce complex voiding diary variables into discrete patterns or data cluster profiles, descriptively name the clusters, and perform validity testing. Method: Participants were 352 community women who filled out a 3-day voiding diary. Six variables (void frequency during daytime hours, void frequency during nighttime hours, modal output, total output, total intake, and body mass index) were entered into cluster analyses. The clusters were analyzed for differences by continence status, age, race (Black women, n = 196; White women, n = 156), and for those who were incontinent, by leakage episode severity. Results: Three clusters emerged, labeled descriptively as Conventional, Benchmark, and Superplus. The Conventional cluster (68% of the sample) demonstrated mean daily intake of 45 ± 13 ounces, mean daily output of 37 ± 15 ounces, mean daily voids 5 ± 2 times, mean modal daytime output 10 ± 0.5 ounces, and mean nighttime voids 1 ± 1 times. The Superplus cluster (7% of the sample) showed double or triple these values across the 5 variables, and the Benchmark cluster (25%) showed values consistent with current popular recommendations on intake and output (e.g., meeting or exceeding the 8 × 8 fluid intake rule of thumb). The clusters differed significantly (p < .05) by age, race, amount of irritating beverages consumed, and incontinence status. Discussion: Identification of three discrete clusters provides a potentially parsimonious but data-driven means of classifying individuals for additional epidemiological or clinical study. The clinical utility rests with the potential for intervening to move an individual from a high-risk to a low-risk cluster with regard to incontinence. PMID:21317828
Depth data research of GIS based on clustering analysis algorithm

NASA Astrophysics Data System (ADS)

Xiong, Yan; Xu, Wenli

2018-03-01

The data of GIS have spatial distribution. Geographic data has both spatial characteristics and attribute characteristics, and also changes with time. Therefore, the amount of data is very large. Nowadays, many industries and departments in the society are using GIS. However, without proper data analysis and mining scheme, GIS will not exert its maximum effectiveness and will waste a lot of data. In this paper, we use the geographic information demand of a national security department as the experimental object, combining the characteristics of GIS data, taking into account the characteristics of time, space, attributes and so on, and using cluster analysis algorithm. We further study the mining scheme for depth data, and get the algorithm model. This algorithm can automatically classify sample data, and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information from the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.
Seismic Data Analysis through Multi-Class Classification.

NASA Astrophysics Data System (ADS)

Anderson, P.; Kappedal, R. D.; Magana-Zook, S. A.

2017-12-01

In this research, we conducted twenty experiments of varying time and frequency bands on 5000 seismic signals with the intent of finding a method to classify signals as either an explosion or an earthquake in an automated fashion. We used a multi-class approach by clustering the data through various techniques. Dimensional reduction was examined through the use of wavelet transforms with the coiflet mother wavelet and various coefficients to explore possible computational time vs accuracy dependencies. Three and four classes were generated from the clustering techniques and examined, with the three-class approach producing the most accurate and realistic results.
Conformational Clusters of Phosphorylated Tyrosine.

PubMed

Abdelrasoul, Maha; Ponniah, Komala; Mao, Alice; Warden, Meghan S; Elhefnawy, Wessam; Li, Yaohang; Pascal, Steven M

2017-12-06

Tyrosine phosphorylation plays an important role in many cellular and intercellular processes including signal transduction, subcellular localization, and regulation of enzymatic activity. In 1999, Blom et al., using the limited number of protein data bank (PDB) structures available at that time, reported that the side chain structures of phosphorylated tyrosine (pY) are partitioned into two conserved conformational clusters (Blom, N.; Gammeltoft, S.; Brunak, S. J. Mol. Biol. 1999, 294, 1351-1362). We have used the spectral clustering algorithm to cluster the growing number of protein structures with pY sites, and have found that the pY residues cluster into three distinct side chain conformations. Two of these pY conformational clusters associate strongly with a narrow range of tyrosine backbone conformation. The novel cluster also highly correlates with the identity of the n + 1 residue, and is strongly associated with a sequential pYpY conformation which places two adjacent pY side chains in a specific relative orientation. Further analysis shows that the three pY clusters are associated with distinct distributions of cognate protein kinases.
On the Analysis of Clustering in an Irradiated Low Alloy Reactor Pressure Vessel Steel Weld.

PubMed

Lindgren, Kristina; Stiller, Krystyna; Efsing, Pål; Thuvander, Mattias

2017-04-01

Radiation-induced clustering affects the mechanical properties, that is, the ductile-to-brittle transition temperature (DBTT), of reactor pressure vessel (RPV) steel in nuclear power plants. The combination of low Cu and high Ni used in some RPV welds is known to further enhance the DBTT shift during long time operation. In this study, RPV weld samples containing 0.04 at% Cu and 1.6 at% Ni were irradiated to 2.0 and 6.4 × 10²³ n/m² in the Halden test reactor. Atom probe tomography (APT) was applied to study clustering of Ni, Mn, Si, and Cu. As the clusters are in the nanometer range, APT is a very suitable technique for this type of study. From APT analyses, information about size distribution, number density, and composition of the clusters can be obtained. However, the quantification of these attributes is not trivial. The maximum separation method (MSM) has been used to characterize the clusters, and a detailed study of the influence of the choice of MSM cluster parameters, primarily on the cluster number density, has been undertaken.
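To make the parameter sensitivity discussed in this abstract concrete, here is a minimal sketch of the core step of the maximum separation method: solute atoms closer than a maximum distance are linked into the same cluster, and clusters below a minimum size are discarded. The parameter names `d_max` and `n_min`, the atom coordinates, and the units (nm) are illustrative assumptions; the later steps of the method that assign surrounding matrix atoms to clusters are omitted.

```python
import math


def maximum_separation_clusters(solute_positions, d_max, n_min):
    """Group solute atoms linked by distances <= d_max; keep groups of >= n_min atoms."""
    n = len(solute_positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of solute atoms separated by at most d_max.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(solute_positions[i], solute_positions[j]) <= d_max:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) >= n_min]


# Hypothetical solute atom positions (nm): a chain of three, a pair, and one isolated atom.
atoms = [(0.0, 0.0, 0.0), (0.4, 0.0, 0.0), (0.8, 0.0, 0.0),
         (5.0, 5.0, 5.0), (5.25, 5.0, 5.0),
         (10.0, 0.0, 0.0)]

strict = maximum_separation_clusters(atoms, d_max=0.5, n_min=3)
lenient = maximum_separation_clusters(atoms, d_max=0.5, n_min=2)
tight = maximum_separation_clusters(atoms, d_max=0.1, n_min=2)
```

The same six atoms yield one, two, or zero clusters depending on the two parameters, which is why the number density reported from such an analysis depends on how they are chosen.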
Mapping of terrain by computer clustering techniques using multispectral scanner data and using color aerial film

NASA Technical Reports Server (NTRS)

Smedes, H. W.; Linnerud, H. J.; Woolaver, L. B.; Su, M. Y.; Jayroe, R. R.

1972-01-01

Two clustering techniques were used for terrain mapping by computer of test sites in Yellowstone National Park. One test was made with multispectral scanner data using a composite technique which consists of (1) a strictly sequential statistical clustering, which is a sequential variance analysis, and (2) a generalized K-means clustering. In this composite technique, the output of (1) is a first approximation of the cluster centers. This is the input to (2), which consists of steps to improve the determination of cluster centers by iterative procedures. Another test was made using the three emulsion layers of color-infrared aerial film as a three-band spectrometer. Relative film densities were analyzed using a simple clustering technique in three-color space. Important advantages of the clustering technique over conventional supervised computer programs are that (1) human intervention, preparation time, and manipulation of data are reduced, (2) the computer map gives an unbiased indication of where best to select the reference ground control data, (3) easy-to-obtain, inexpensive film can be used, and (4) the geometric distortions can be easily rectified by simple standard photogrammetric techniques.
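The second stage of the composite technique in this abstract, where rough cluster centers are improved iteratively, can be sketched with a plain K-means refinement loop. This is a generic illustration under stated assumptions: the initial centers stand in for the output of the sequential first stage, and the two-band pixel values and the function name `kmeans_refine` are made up.

```python
import math


def kmeans_refine(points, centers, max_iter=100):
    """Refine initial cluster centers with standard K-means iterations."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest current center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Move each center to the mean of its members (keep it if it has none).
        updated = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(old)
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Hypothetical two-band pixel values for six pixels.
pixels = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
          [8.0, 8.0], [9.0, 8.0], [8.0, 9.0]]
# Rough first-approximation centers, as a first-stage clustering might supply.
initial_centers = [[0.0, 0.0], [5.0, 5.0]]
labels, centers = kmeans_refine(pixels, initial_centers)
```

Starting from reasonable first-approximation centers typically lets the iterative stage converge in a few passes, which fits the division of labor the abstract describes.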
Hierarchical Star Formation in Turbulent Media: Evidence from Young Star Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grasha, K.; Calzetti, D.; Elmegreen, B. G.

We present an analysis of the positions and ages of young star clusters in eight local galaxies to investigate the connection between the age difference and separation of cluster pairs. We find that star clusters do not form uniformly but instead are distributed so that the age difference increases with the cluster pair separation to the 0.25–0.6 power, and that the maximum size over which star formation is physically correlated ranges from ∼200 pc to ∼1 kpc. The observed trends between age difference and separation suggest that cluster formation is hierarchical both in space and time: clusters that are close to each other are more similar in age than clusters born further apart. The temporal correlations between stellar aggregates have slopes that are consistent with predictions of turbulence acting as the primary driver of star formation. The velocity associated with the maximum size is proportional to the galaxy’s shear, suggesting that the galactic environment influences the maximum size of the star-forming structures.
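The power-law relation between age difference and pair separation reported in this abstract can be estimated, in the simplest case, as the slope of a straight-line fit in log-log space. The sketch below uses synthetic, noise-free values generated with an exponent of 0.5 (inside the reported 0.25–0.6 range); the data and the function name `power_law_exponent` are illustrative and do not come from the study.

```python
import math


def power_law_exponent(separations, age_differences):
    """Least-squares slope of log10(age difference) against log10(separation)."""
    xs = [math.log10(r) for r in separations]
    ys = [math.log10(t) for t in age_differences]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic cluster-pair separations (pc) and age differences (Myr) with exponent 0.5.
separations_pc = [10.0, 30.0, 100.0, 300.0, 1000.0]
age_diff_myr = [2.0 * r ** 0.5 for r in separations_pc]
alpha = power_law_exponent(separations_pc, age_diff_myr)
```

With real measurements the fit would be restricted to separations below the maximum correlated size, beyond which the abstract indicates the relation no longer holds.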
Combinatoric analysis of heterogeneous stochastic self-assembly.

PubMed

D'Orsogna, Maria R; Zhao, Bingyu; Berenji, Bijan; Chou, Tom

2013-09-28

We analyze a fully stochastic model of heterogeneous nucleation and self-assembly in a closed system with a fixed total particle number M, and a fixed number of seeds Ns. Each seed can bind a maximum of N particles. A discrete master equation for the probability distribution of the cluster sizes is derived and the corresponding cluster concentrations are found using kinetic Monte-Carlo simulations in terms of the density of seeds, the total mass, and the maximum cluster size. In the limit of slow detachment, we also find new analytic expressions and recursion relations for the cluster densities at intermediate times and at equilibrium. Our analytic and numerical findings are compared with those obtained from classical mass-action equations and the discrepancies between the two approaches analyzed.

Messier 35 (NGC 2168) DANCe. I. Membership, proper motions, and multiwavelength photometry

NASA Astrophysics Data System (ADS)

Bouy, H.; Bertin, E.; Barrado, D.; Sarro, L. M.; Olivares, J.; Moraux, E.; Bouvier, J.; Cuillandre, J.-C.; Ribas, Á.; Beletsky, Y.

2015-03-01

Context. Messier 35 (NGC 2168) is an important young nearby cluster. Its age, richness and relative proximity make it an ideal target for stellar evolution studies. The Kepler K2 mission recently observed it and provided a high accuracy photometric time series of a large number of sources in this area of the sky. Identifying the cluster's members is therefore of high importance to optimize the interpretation and analysis of the Kepler K2 data. Aims: We aim to identify the cluster's members by deriving membership probabilities for the sources within 1° of the cluster's center, which is farther away than equivalent previous studies. Methods: We measure accurate proper motions and multiwavelength (optical and near-infrared) photometry using ground-based archival images of the cluster. We use these measurements to compute membership probabilities. The list of candidate members from the literature is used as a training set to identify the cluster's locus in a multidimensional space made of proper motions, luminosities, and colors. Results: The final catalog includes 338 892 sources with multiwavelength photometry. Approximately half (194 452) were detected at more than two epochs and we measured their proper motion and used it to derive membership probability. A total of 4349 candidate members with membership probabilities greater than 50% are found in this sample in the luminosity range between 10 mag and 22 mag. The slow proper motion of the cluster and the overlap of its sequence with the field and background sequences in almost all color-magnitude and color-color diagrams complicate the analysis and the contamination level is expected to be significant. Our study, nevertheless, provides a coherent and quantitative membership analysis of Messier 35 based on a large fraction of the best ground-based data sets obtained over the past 18 years. As such, it represents a valuable input for follow-up studies using, in particular, the Kepler K2 photometric time series. 
Table 3 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/575/A120
Strong-lensing analysis of A2744 with MUSE and Hubble Frontier Fields images

NASA Astrophysics Data System (ADS)

Mahler, G.; Richard, J.; Clément, B.; Lagattuta, D.; Schmidt, K.; Patrício, V.; Soucail, G.; Bacon, R.; Pello, R.; Bouwens, R.; Maseda, M.; Martinez, J.; Carollo, M.; Inami, H.; Leclercq, F.; Wisotzki, L.

2018-01-01

We present an analysis of Multi Unit Spectroscopic Explorer (MUSE) observations obtained on the massive Frontier Fields (FFs) cluster A2744. This new data set covers the entire multiply imaged region around the cluster core. The combined catalogue consists of 514 spectroscopic redshifts (with 414 new identifications). We use this redshift information to perform a strong-lensing analysis revising multiple images previously found in the deep FF images, and add three new MUSE-detected multiply imaged systems with no obvious Hubble Space Telescope counterpart. The combined strong-lensing constraints include a total of 60 systems producing 188 images altogether, out of which 29 systems and 83 images are spectroscopically confirmed, making A2744 one of the most well-constrained clusters to date. Thanks to the large amount of spectroscopic redshifts, we model the influence of substructures at larger radii, using a parametrization including two cluster-scale components in the cluster core and several group scale in the outskirts. The resulting model accurately reproduces all the spectroscopic multiple systems, reaching an rms of 0.67 arcsec in the image plane. The large number of MUSE spectroscopic redshifts gives us a robust model, which we estimate reduces the systematic uncertainty on the 2D mass distribution by up to ∼2.5 times the statistical uncertainty in the cluster core. In addition, from a combination of the parametrization and the set of constraints, we estimate the relative systematic uncertainty to be up to 9 per cent at 200 kpc.
Classification of frailty using the Kihon checklist: A cluster analysis of older adults in urban areas.

PubMed

Kera, Takeshi; Kawai, Hisashi; Yoshida, Hideyo; Hirano, Hirohiko; Kojima, Motonaga; Fujiwara, Yoshinori; Ihara, Kazushige; Obuchi, Shuichi

2017-01-01

Frailty is an important predictor of the need for long-term care and hospitalization. Our aim was to categorize frailty in community-dwelling older adults. The present study was carried out in 2011-2013, and consisted of 1380 individuals over 65 years of age. Participants completed the Kihon checklist, which is widely used to assess frailty in Japan, and their physical, cognitive and social function was evaluated. Non-hierarchical cluster analysis was used to statistically categorize frailty. The optimum number of clusters was determined as the point at which the external reference values (instrumental activity of daily living score, grip power, 10-m walk time, body mass index, portable fall risk index, occlusal force and Mini-Mental State Examination score) differed. According to the Kihon checklist, 369 (26.7%) of the 1380 study participants were considered frail. When the cluster number was increased from two to six, the scores in each subdomain of the Kihon checklist significantly differed. The estimated minimum number of clusters was five, and each of the five cluster groups had distinct characteristics. The numbers of participants in cluster groups 1-5 were 105, 78, 62, 71 and 53, respectively. We identified five types of frailty in community-dwelling older adults in Japan: "experience of falling," "pre-frailty," "oral frailty," "housebound" and "severe frailty." Geriatr Gerontol Int 2017; 17: 69-77. © 2016 Japan Geriatrics Society.
Clustering of lifestyle risk behaviours among residents of forty deprived neighbourhoods in London: lessons for targeting public health interventions.

PubMed

Watts, P; Buck, D; Netuveli, G; Renton, A

2016-06-01

Clustering of lifestyle risk behaviours is very important in predicting premature mortality. Understanding the extent to which risk behaviours are clustered in deprived communities is vital to most effectively target public health interventions. We examined co-occurrence and associations between risk behaviours (smoking, alcohol consumption, poor diet, low physical activity and high sedentary time) reported by adults living in deprived London neighbourhoods. Associations between sociodemographic characteristics and clustered risk behaviours were examined. Latent class analysis was used to identify underlying clustering of behaviours. Over 90% of respondents reported at least one risk behaviour. Reporting specific risk behaviours predicted reporting of further risk behaviours. Latent class analyses revealed four underlying classes. Membership of a maximal risk behaviour class was more likely for young, white males who were unable to work. Compared with recent national level analysis, there was a weaker relationship between education and clustering of behaviours and a very high prevalence of clustering of risk behaviours in those unable to work. Young, white men who report difficulty managing on income were at high risk of reporting multiple risk behaviours. These groups may be an important target for interventions to reduce premature mortality caused by multiple risk behaviours. © The Author 2015. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Optical spectroscopy and velocity dispersions of galaxy clusters from the SPT-SZ survey

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruel, J.; Bayliss, M.; Bazin, G.

2014-09-01

We present optical spectroscopy of galaxies in clusters detected through the Sunyaev-Zel'dovich (SZ) effect with the South Pole Telescope (SPT). We report our own measurements of 61 spectroscopic cluster redshifts, and 48 velocity dispersions each calculated with more than 15 member galaxies. This catalog also includes 19 dispersions of SPT-observed clusters previously reported in the literature. The majority of the clusters in this paper are SPT-discovered; of these, most have been previously reported in other SPT cluster catalogs, and five are reported here as SPT discoveries for the first time. By performing a resampling analysis of galaxy velocities, we find that unbiased velocity dispersions can be obtained from a relatively small number of member galaxies (≲ 30), but with increased systematic scatter. We use this analysis to determine statistical confidence intervals that include the effect of membership selection. We fit scaling relations between the observed cluster velocity dispersions and mass estimates from SZ and X-ray observables. In both cases, the results are consistent with the scaling relation between velocity dispersion and mass expected from dark-matter simulations. We measure a ∼30% log-normal scatter in dispersion at fixed mass, and a ∼10% offset in the normalization of the dispersion-mass relation when compared to the expectation from simulations, which is within the expected level of systematic uncertainty.
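As a rough illustration of the resampling idea mentioned above, the sketch below bootstraps the velocity dispersion of a small member sample. It is not the survey's pipeline: the velocities are synthetic, the function name is hypothetical, and the plain sample standard deviation stands in for whichever dispersion estimator the authors actually used.

```python
import random
import statistics

def bootstrap_dispersion(velocities, n_resamples=1000, seed=0):
    """Return (dispersion, lower, upper): sample std and an approximate 68% bootstrap interval."""
    rng = random.Random(seed)
    n = len(velocities)
    estimates = []
    for _ in range(n_resamples):
        # Draw n velocities with replacement and recompute the dispersion.
        resample = [velocities[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.stdev(resample))
    estimates.sort()
    lower = estimates[int(0.16 * n_resamples)]
    upper = estimates[int(0.84 * n_resamples) - 1]
    return statistics.stdev(velocities), lower, upper

# Synthetic rest-frame member velocities in km/s (hypothetical, not survey data).
_rng = random.Random(42)
member_velocities = [_rng.gauss(0.0, 1000.0) for _ in range(25)]
sigma, sigma_low, sigma_high = bootstrap_dispersion(member_velocities)
```

The width of the interval grows as the number of members shrinks, which is the scatter the abstract refers to for samples of 30 galaxies or fewer.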
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

NASA Astrophysics Data System (ADS)

Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard

2014-09-01

Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.
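The step of building a Markov state model as a discrete-time random walk between clusters can be sketched as counting transitions between cluster labels at a fixed lag and normalizing each row. This is a simplified illustration with crisp (hard) labels; it does not implement the fuzzy C-means or transition-based assignment described in the abstract, and the trajectory below is hypothetical.

```python
from collections import Counter

def transition_matrix(states, n_states, lag=1):
    """Row-normalized transition counts between cluster labels separated by `lag` steps."""
    counts = Counter(zip(states[:-lag], states[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # State never left within the trajectory: leave the row empty.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix

# Hypothetical cluster labels along a trajectory (0 = coil-like, 1 = helix-like).
trajectory = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
T = transition_matrix(trajectory, n_states=2)
```

Each row of `T` estimates the probability of moving from one cluster to each of the others after one lag time.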
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G., E-mail: yannis@princeton.edu, E-mail: gerhard.hummer@biophys.mpg.de

Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space.
Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions

PubMed Central

Nedialkova, Lilia V.; Amat, Miguel A.; Kevrekidis, Ioannis G.; Hummer, Gerhard

2014-01-01

Using the helix-coil transitions of alanine pentapeptide as an illustrative example, we demonstrate the use of diffusion maps in the analysis of molecular dynamics simulation trajectories. Diffusion maps and other nonlinear data-mining techniques provide powerful tools to visualize the distribution of structures in conformation space. The resulting low-dimensional representations help in partitioning conformation space, and in constructing Markov state models that capture the conformational dynamics. In an initial step, we use diffusion maps to reduce the dimensionality of the conformational dynamics of Ala5. The resulting pretreated data are then used in a clustering step. The identified clusters show excellent overlap with clusters obtained previously by using the backbone dihedral angles as input, with small—but nontrivial—differences reflecting torsional degrees of freedom ignored in the earlier approach. We then construct a Markov state model describing the conformational dynamics in terms of a discrete-time random walk between the clusters. We show that by combining fuzzy C-means clustering with a transition-based assignment of states, we can construct robust Markov state models. This state-assignment procedure suppresses short-time memory effects that result from the non-Markovianity of the dynamics projected onto the space of clusters. In a comparison with previous work, we demonstrate how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space. PMID:25240340
Intracluster light at the Frontier - II. The Frontier Fields Clusters

NASA Astrophysics Data System (ADS)

Montes, Mireia; Trujillo, Ignacio

2018-02-01

Multiwavelength deep observations are a key tool to understand the origin of the diffuse light in clusters of galaxies: the intracluster light (ICL). For this reason, we take advantage of the Hubble Frontier Fields (HFF) survey to investigate the properties of the stellar populations of the ICL of its six massive intermediate redshift (0.3 < z < 0.6) clusters. We carry out this analysis down to a radial distance of ∼120 kpc from the brightest cluster galaxy. We found that the average metallicity of the ICL is [Fe/H]_ICL ∼ −0.5, compatible with the value of the outskirts of the Milky Way. The mean stellar ages of the ICL are between 2 and 6 Gyr younger than the most massive galaxies of the clusters. Those results suggest that the ICL of these massive (>10^15 M⊙) clusters is formed by the stripping of MW-like objects that have been accreted at z < 1, in agreement with current simulations. We do not find any significant increase in the fraction of light of the ICL with cosmic time, although the redshift range explored is too narrow to draw any strong conclusion. When exploring the slope of the stellar mass density profile, we found that the ICL of the HFF clusters follows the shape of their underlying dark matter haloes, in agreement with the idea that the ICL is the result of the stripping of galaxies at recent times.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Zhaoying; Liu, Bingwen; Zhao, Evan

For the first time, an argon cluster ion sputtering source has been demonstrated to outperform traditional oxygen and cesium ion sputtering sources for ToF-SIMS depth profiling of insulating materials. The superior performance has been attributed to effective alleviation of surface charging. A simulated nuclear waste glass, SON68, and layered hole-perovskite oxide thin films were selected as model systems due to their fundamental and practical significance. Our study shows that if the size of the analysis areas is the same, the highest sputter rate of argon cluster sputtering can be 2-3 times faster than the highest sputter rates of oxygen or cesium sputtering. More importantly, high quality data and high sputter rates can be achieved simultaneously for argon cluster sputtering, while this is not the case for cesium and oxygen sputtering. Therefore, for deep depth profiling of insulating samples, the measurement efficiency of argon cluster sputtering can be about 6-15 times better than traditional cesium and oxygen sputtering. Moreover, for a SrTiO3/SrCrO3 bi-layer thin film on a SrTiO3 substrate, the true 18O/16O isotopic distribution at the interface is better revealed when using the argon cluster sputtering source. Therefore, the implementation of an argon cluster sputtering source can significantly improve the measurement efficiency of insulating materials, and thus can expand the application of ToF-SIMS to the study of glass corrosion, perovskite oxide thin films, and many other potential systems.
Prediction of chemotherapeutic response in bladder cancer using k-means clustering of DCE-MRI pharmacokinetic parameters

PubMed Central

Nguyen, Huyen T.; Jia, Guang; Shah, Zarine K.; Pohar, Kamal; Mortazavi, Amir; Zynger, Debra L.; Wei, Lai; Yang, Xiangyu; Clark, Daniel; Knopp, Michael V.

2015-01-01

Purpose: To apply k-means clustering of two pharmacokinetic parameters derived from 3T DCE-MRI to predict chemotherapeutic response in bladder cancer at the mid-cycle time-point. Materials and Methods: With the pre-determined number of 3 clusters, k-means clustering was performed on non-dimensionalized Amp and kep estimates of each bladder tumor. Three cluster volume fractions (VFs) were calculated for each tumor at baseline and mid-cycle. The changes of three cluster VFs from baseline to mid-cycle were correlated with the tumor’s chemotherapeutic response. Receiver-operating-characteristics curve analysis was used to evaluate the performance of each cluster VF change as a biomarker of chemotherapeutic response in bladder cancer. Results: k-means clustering partitioned each bladder tumor into cluster 1 (low kep and low Amp), cluster 2 (low kep and high Amp), cluster 3 (high kep and low Amp). The changes of all three cluster VFs were found to be associated with bladder tumor response to chemotherapy. The VF change of cluster 2 presented with the highest area-under-the-curve value (0.96) and the highest sensitivity/specificity/accuracy (96%/100%/97%) with a selected cutoff value. Conclusion: k-means clustering of the two DCE-MRI pharmacokinetic parameters can characterize the complex microcirculatory changes within a bladder tumor to enable early prediction of the tumor’s chemotherapeutic response. PMID:24943272
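The clustering and volume-fraction steps described above can be sketched in a few lines. This is an illustrative toy, not the study's implementation: the voxel values, the initial centers, and the function names are all hypothetical, and plain Lloyd iterations stand in for whatever k-means variant the authors used.

```python
import math

def kmeans(points, centers, n_iter=50):
    """Plain Lloyd iterations from the given initial centers; returns (labels, centers)."""
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def volume_fractions(labels, k):
    """Fraction of voxels assigned to each of the k clusters."""
    return [labels.count(c) / len(labels) for c in range(k)]

# Hypothetical (kep, Amp) pairs per voxel, already scaled to be dimensionless.
voxels = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),   # low kep, low Amp
          (0.2, 0.9), (0.1, 0.8),                 # low kep, high Amp
          (0.9, 0.1), (0.8, 0.2), (0.85, 0.15)]   # high kep, low Amp
initial_centers = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
labels, centers = kmeans(voxels, initial_centers)
vf = volume_fractions(labels, 3)
```

Running the same procedure on baseline and mid-cycle scans and subtracting the two `vf` lists would give the per-cluster VF changes that the abstract evaluates as biomarkers.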
Phenotypes of comorbidity in OSAS patients: combining categorical principal component analysis with cluster analysis.

PubMed

Vavougios, George D; George D, George; Pastaka, Chaido; Zarogiannis, Sotirios G; Gourgoulianis, Konstantinos I

2016-02-01

Phenotyping obstructive sleep apnea syndrome's comorbidity has been attempted for the first time only recently. The aim of our study was to determine phenotypes of comorbidity in obstructive sleep apnea syndrome patients employing a data-driven approach. Data from 1472 consecutive patient records were recovered from our hospital's database. Categorical principal component analysis and two-step clustering were employed to detect distinct clusters in the data. Univariate comparisons between clusters included one-way analysis of variance with Bonferroni correction and chi-square tests. Predictors of pairwise cluster membership were determined via a binary logistic regression model. The analyses revealed six distinct clusters: A, 'healthy, reporting sleeping related symptoms'; B, 'mild obstructive sleep apnea syndrome without significant comorbidities'; C1, 'moderate obstructive sleep apnea syndrome, obesity, without significant comorbidities'; C2, 'moderate obstructive sleep apnea syndrome with severe comorbidity, obesity and the exclusive inclusion of stroke'; D1, 'severe obstructive sleep apnea syndrome and obesity without comorbidity and a 33.8% prevalence of hypertension'; and D2, 'severe obstructive sleep apnea syndrome with severe comorbidities, along with the highest Epworth Sleepiness Scale score and highest body mass index'. Clusters differed significantly in apnea-hypopnea index, oxygen desaturation index, arousal index, age, body mass index, minimum oxygen saturation and daytime oxygen saturation (one-way analysis of variance P < 0.0001). Binary logistic regression indicated that older age, greater body mass index, lower daytime oxygen saturation and hypertension were associated independently with an increased risk of belonging to a comorbid cluster. Six distinct phenotypes of obstructive sleep apnea syndrome and its comorbidities were identified. Mapping the heterogeneity of the obstructive sleep apnea syndrome may help the early identification of at-risk groups. Finally, determining predictors of comorbidity for the moderate and severe strata of these phenotypes implies a need to take these factors into account when considering obstructive sleep apnea syndrome treatment options. © 2015 The Authors. Journal of Sleep Research published by John Wiley & Sons Ltd on behalf of European Sleep Research Society.
Real Time Intelligent Target Detection and Analysis with Machine Vision

NASA Technical Reports Server (NTRS)

Howard, Ayanna; Padgett, Curtis; Brown, Kenneth

2000-01-01

We present an algorithm for detecting a specified set of targets for an Automatic Target Recognition (ATR) application. ATR involves processing images for detecting, classifying, and tracking targets embedded in a background scene. We address the problem of discriminating between targets and nontarget objects in a scene by evaluating 40x40 image blocks belonging to an image. Each image block is first projected onto a set of templates specifically designed to separate images of targets embedded in a typical background scene from those background images without targets. These filters are found using directed principal component analysis which maximally separates the two groups. The projected images are then clustered into one of n classes based on a minimum distance to a set of n cluster prototypes. These cluster prototypes have previously been identified using a modified clustering algorithm based on prior sensed data. Each projected image pattern is then fed into the associated cluster's trained neural network for classification. A detailed description of our algorithm will be given in this paper. We outline our methodology for designing the templates, describe our modified clustering algorithm, and provide details on the neural network classifiers. Evaluation of the overall algorithm demonstrates that our detection rates approach 96% with a false positive rate of less than 0.03%.
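The minimum-distance assignment of a projected image block to one of n cluster prototypes can be sketched as follows. The prototype coordinates and the two-dimensional projection are hypothetical, and Euclidean distance is an assumption; the abstract does not state which distance measure is used.

```python
import math

def assign_to_prototype(vector, prototypes):
    """Return the index of the prototype closest to `vector` (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(vector, prototypes[i]))

# Hypothetical prototypes in a 2-D projection space (real template projections
# would have one coordinate per template).
prototypes = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
projected_block = (4.2, 4.9)
cluster_index = assign_to_prototype(projected_block, prototypes)
```

In the pipeline described above, `cluster_index` would then select which cluster-specific neural network classifies the block.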
Impact of SZ cluster residuals in CMB maps and CMB-LSS cross-correlations

NASA Astrophysics Data System (ADS)

Chen, T.; Remazeilles, M.; Dickinson, C.

2018-06-01

Residual foreground contamination in cosmic microwave background (CMB) maps, such as the residual contamination from thermal Sunyaev-Zeldovich (SZ) effect in the direction of galaxy clusters, can bias the cross-correlation measurements between CMB and large-scale structure optical surveys. It is thus essential to quantify those residuals and, if possible, to null out SZ cluster residuals in CMB maps. We quantify for the first time the amount of SZ cluster contamination in the released Planck 2015 CMB maps through (i) the stacking of CMB maps in the direction of the clusters, and (ii) the computation of cross-correlation power spectra between CMB maps and the SDSS-IV large-scale structure data. Our cross-power spectrum analysis yields a 30σ detection at the cluster scale (ℓ = 1500-2500) and a 39σ detection on larger scales (ℓ = 500-1500) due to clustering of SZ clusters, giving an overall 54σ detection of SZ cluster residuals in the Planck CMB maps. The Planck 2015 NILC CMB map is shown to have 44 ± 4% of thermal SZ foreground emission left in it. Using the 'Constrained ILC' component separation technique, we construct an alternative Planck CMB map, the 2D-ILC map, which is shown to have negligible SZ contamination, at the cost of being slightly more contaminated by Galactic foregrounds and noise. We also discuss the impact of the SZ residuals in CMB maps on the measurement of the ISW effect, which is shown to be negligible based on our analysis.
Cosmology from galaxy clusters as observed by Planck

NASA Astrophysics Data System (ADS)

Pierpaoli, Elena

We propose to use current all-sky data on galaxy clusters in the radio/infrared bands in order to constrain cosmology. This will be achieved by performing parameter estimation with number counts and power spectra for galaxy clusters detected by Planck through their Sunyaev-Zeldovich signature. The ultimate goal of this proposal is to use clusters as tracers of matter density in order to provide information about fundamental properties of our Universe, such as the law of gravity on large scales, early Universe phenomena, structure formation and the nature of dark matter and dark energy. We will leverage the availability of a larger and deeper cluster catalog from the latest Planck data release in order to include, for the first time, the cluster power spectrum in the cosmological parameter determination analysis. Furthermore, we will extend the cluster analysis to cosmological models not yet investigated by the Planck collaboration. These aims require a diverse set of activities, ranging from the characterization of the clusters' selection function, the choice of the cosmological cluster sample to be used for parameter estimation, the construction of mock samples in the various cosmological models with correct correlation properties in order to produce reliable selection functions and noise covariance matrices, and finally the construction of the appropriate likelihood for number counts and power spectra. We plan to make the final code available to the community and compatible with the most widely used cosmological parameter estimation code. This research makes use of data from the NASA satellites Planck and, less directly, Chandra, in order to constrain cosmology; and therefore perfectly fits the NASA objectives and the specifications of this solicitation.
Groundwater Quality: Analysis of Its Temporal and Spatial Variability in a Karst Aquifer.

PubMed

Pacheco Castro, Roger; Pacheco Ávila, Julia; Ye, Ming; Cabrera Sansores, Armando

2018-01-01

This study develops an approach based on hierarchical cluster analysis for investigating the spatial and temporal variation of the processes governing water quality. The water quality data used in this study were collected in the karst aquifer of Yucatan, Mexico, the only source of drinking water for a population of nearly two million people. Hierarchical cluster analysis was applied to the quality data of all the sampling periods lumped together. This was motivated by the observation that, if water quality does not vary significantly in time, two samples from the same sampling site will belong to the same cluster. The resulting distribution maps of clusters and box-plots of the major chemical components reveal the spatial and temporal variability of groundwater quality. Principal component analysis was used to verify the results of cluster analysis and to derive the variables that explained most of the variation of the groundwater quality data. Results of this work increase the knowledge about how precipitation and human contamination impact groundwater quality in Yucatan. Spatial variability of groundwater quality in the study area is caused by: a) seawater intrusion and groundwater rich in sulfates in the west and on the coast, b) water-rock interactions and the average annual precipitation in the middle and east zones, respectively, and c) human contamination present in two localized zones. Changes in the amount and distribution of precipitation cause temporal variation by diluting groundwater in the aquifer. This approach allows the variation of the processes controlling groundwater quality to be analyzed efficiently and simultaneously. © 2017, National Ground Water Association.
Subtypes of female juvenile offenders: a cluster analysis of the Millon Adolescent Clinical Inventory.

PubMed

Stefurak, Tres; Calhoun, Georgia B

2007-01-01

The current study sought to explore subtypes of adolescents within a sample of female juvenile offenders. Using the Millon Adolescent Clinical Inventory with 101 female juvenile offenders, a two-step cluster analysis was performed beginning with a Ward's method hierarchical cluster analysis followed by a K-Means iterative partitioning cluster analysis. The results suggest an optimal three-cluster solution, with cluster profiles leading to the following group labels: Externalizing Problems, Depressed/Interpersonally Ambivalent, and Anxious Prosocial. Analyses along the factors of age, race, offense typology and offense chronicity were conducted to further understand the nature of the clusters found. Only the effect for race was significant, with the Anxious Prosocial and Depressed/Interpersonally Ambivalent clusters appearing disproportionately comprised of African American girls. To establish external validity, clusters were compared across scales of the Behavioral Assessment System for Children - Self Report of Personality, and corroborative distinctions between clusters were found here.
Wildfire cluster detection using space-time scan statistics

NASA Astrophysics Data System (ADS)

Tonini, M.; Tuia, D.; Ratle, F.; Kanevski, M.

2009-04-01

The aim of the present study is to identify spatio-temporal clusters of fire sequences using space-time scan statistics. These statistical methods are specifically designed to detect clusters and assess their significance. Scan statistics work by comparing the set of events occurring inside a scanning window (or a space-time cylinder for spatio-temporal data) with those that lie outside. Windows of increasing size scan the zone across space and time; a likelihood ratio is calculated for each window (comparing the ratio of observed to expected cases inside and outside), and the window with the maximum value is assumed to be the most likely cluster, the next highest the second most likely, and so on. Under the null hypothesis of spatial and temporal randomness, these events are distributed according to a known discrete-state random process (Poisson or Bernoulli) whose parameters can be estimated. Given this assumption, it is possible to test whether or not the null hypothesis holds in a specific area. To deal with fire data, the space-time permutation scan statistic was applied, since it does not require explicit specification of the population at risk in each cylinder. The case study uses daily fire detections in Florida from the Moderate Resolution Imaging Spectroradiometer (MODIS) active fire product for the period 2003-2006. As a result, statistically significant clusters were identified. When the analyses were performed over the entire period, three of the five most likely clusters were identified in the forest areas in the north of the state; the other two clusters cover a large zone in the south, corresponding to agricultural land and the prairies in the Everglades. Furthermore, the analyses were performed separately for each of the four years to determine whether the wildfires recur each year during the same period. It emerges that clusters of forest fires are more frequent in the hot seasons (spring and summer), while in the southern areas they are present throughout the year. Analysing the distribution of fires to evaluate whether they are statistically more frequent in some areas and/or in some periods of the year can help support fire management and focus prevention measures.
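The window comparison described in this abstract can be illustrated with a minimal sketch. It assumes the usual Poisson-type log-likelihood ratio and computes the expected count from zone and day totals, as a space-time permutation model does. The function names, the `(zone, day)` event encoding and the toy data are hypothetical and are not taken from the study; significance testing by Monte Carlo replication is not shown.

```python
import math
from collections import Counter

def expected_in_cylinder(events, zones, days):
    """Expected count in a cylinder under the space-time permutation model.

    events: list of (zone, day) tuples, one per detection (hypothetical encoding).
    zones, days: sets giving the cylinder's spatial base and time span.
    """
    total = len(events)
    by_zone = Counter(z for z, _ in events)
    by_day = Counter(d for _, d in events)
    # Sum over cells (z, d) of zone_total * day_total / grand_total.
    return sum(by_zone[z] for z in zones) * sum(by_day[d] for d in days) / total

def log_likelihood_ratio(observed, expected, total):
    """Poisson-type log-likelihood ratio; zero unless the window shows an excess."""
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    rest = total - observed
    outside = rest * math.log(rest / (total - expected)) if rest > 0 else 0.0
    return inside + outside

# Toy data: two zones, three days (illustrative only).
events = [("A", 1)] * 4 + [("A", 2)] + [("B", 1)] + [("B", 2)] * 2 + [("B", 3)] * 2
zones, days = {"A"}, {1}
observed = sum(1 for z, d in events if z in zones and d in days)
expected = expected_in_cylinder(events, zones, days)
llr = log_likelihood_ratio(observed, expected, len(events))
print(observed, expected, round(llr, 4))
```

In a full scan, this ratio would be evaluated for every candidate cylinder and the maximum retained as the most likely cluster.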
Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations and numerical simulations

NASA Technical Reports Server (NTRS)

Darouzet, Fabien; DeKeyser, Johan; Decreau, Pierrette; Gallagher, Dennis; Pierrard, Viviane; Lemaire, Joseph; Dandouras, Iannis; Matsui, Hiroshi; Dunlop, Malcolm; Andre, Mats

2005-01-01

Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles can be derived from the plasma frequency and/or from the spacecraft potential (note that the electron spectrometer is usually not operating inside the plasmasphere); ion velocity is also measured onboard these satellites (but ion density is not reliable because of instrumental limitations). The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 RE every 10 minutes; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations for 3 plume events and compare CLUSTER in-situ data (panel A) with global images of the plasmasphere obtained from IMAGE (panel B), and with numerical simulations for the formation of plumes based on a model that includes the interchange instability mechanism (panel C). In particular, we study the geometry and the orientation of plasmaspheric plumes by using a four-point analysis method, the spatial gradient. We also compare several aspects of their motion as determined by different methods: (i) inner and outer plume boundary velocity calculated from time delays of this boundary observed by the wave experiment WHISPER on the four spacecraft, (ii) ion velocity derived from the ion spectrometer CIS onboard CLUSTER, (iii) drift velocity measured by the electron drift instrument EDI onboard CLUSTER and (iv) global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.
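As a rough illustration of method (i), a boundary velocity can be estimated from the crossing times at four spacecraft if one assumes a planar boundary moving at constant speed along its normal. That assumption, the function names and the numbers below are illustrative and not taken from the paper. The sketch solves (r_i - r_0) · m = t_i - t_0 for the slowness vector m = n / V.

```python
import math

def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def boundary_velocity(positions, times):
    """Return (speed, unit normal) from four positions (km) and crossing times (s)."""
    r0, t0 = positions[0], times[0]
    rows = [[p[k] - r0[k] for k in range(3)] for p in positions[1:]]
    rhs = [t - t0 for t in times[1:]]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("spacecraft are coplanar; the system is singular")
    # Cramer's rule: replace column k with the time delays.
    m = []
    for k in range(3):
        replaced = [row[:k] + [rhs[i]] + row[k + 1:] for i, row in enumerate(rows)]
        m.append(det3(replaced) / d)
    norm = math.sqrt(sum(c * c for c in m))
    if norm == 0.0:
        raise ValueError("all crossings are simultaneous; speed is undefined")
    return 1.0 / norm, tuple(c / norm for c in m)

# Synthetic check: boundary with normal (0.6, 0.8, 0) moving at 20 km/s.
positions = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 0.0, 100.0)]
times = [0.0, 3.0, 4.0, 0.0]
speed, normal = boundary_velocity(positions, times)
print(speed, normal)
```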
Artificial neural network modeling and cluster analysis for organic facies and burial history estimation using well log data: A case study of the South Pars Gas Field, Persian Gulf, Iran

NASA Astrophysics Data System (ADS)

Alizadeh, Bahram; Najjari, Saeid; Kadkhodaie-Ilkhchi, Ali

2012-08-01

Intelligent and statistical techniques were used to extract the hidden organic facies from well log responses in the giant South Pars Gas Field, Persian Gulf, Iran. Data from the Kazhdomi Formation (Mid-Cretaceous) and the Kangan-Dalan Formations (Permo-Triassic) were used for this purpose. Initially, GR, SGR, CGR, THOR, POTA, NPHI and DT logs were used to model the relationship between wireline logs and Total Organic Carbon (TOC) content using Artificial Neural Networks (ANN). The correlation coefficient (R2) between the measured and ANN-predicted TOC is 89%. The performance of the model is measured by the Mean Squared Error function, which does not exceed 0.0073. Using cluster analysis and a binary hierarchical cluster tree, the constructed TOC column of each formation was clustered into 5 organic facies according to their geochemical similarity. Later, a second ANN model with an accuracy of 84% was created to determine the specified clusters (facies) directly from well logs, for quick cluster recognition in other wells of the studied field. Each facies was correlated to its appropriate burial history curve. Hence each facies of a formation could be scrutinized separately and directly from its well logs, indicating the time and depth of oil or gas generation. Therefore the potential production zones of the Kazhdomi probable source rock and the Kangan-Dalan reservoir formation could be identified while well logging operations (especially in LWD cases) were in progress. This could reduce uncertainty, save considerable time and cost for the oil industry, and aid in the successful implementation of exploration and exploitation plans.
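The clustering step can be pictured with a small agglomerative sketch: every TOC sample starts as its own cluster and the two closest clusters are merged repeatedly, which builds a binary tree, until five groups remain. The abstract does not state the distance or linkage used, so average linkage on one-dimensional TOC values is an assumption, and the sample values are invented for illustration.

```python
from itertools import combinations

def average_linkage(a, b):
    """Mean absolute difference between all cross-cluster pairs."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

def agglomerate(values, k):
    """Merge the two closest clusters until only k remain."""
    clusters = [[v] for v in values]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical TOC values (wt%), not data from the study.
toc = [0.2, 0.25, 0.3, 1.0, 1.1, 2.0, 2.1, 2.2, 3.5, 3.6, 5.0]
facies = agglomerate(toc, 5)
for label, members in enumerate(facies, start=1):
    print(label, members)
```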

[Cluster analysis in biomedical researches].

PubMed

Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

2013-01-01

Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping individual observations according to their degree of similarity. The review defines the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.
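Of the algorithms named in this review, k-means is the simplest to sketch. The version below is a minimal illustration, not code from the review: it alternates between assigning each point to its nearest centroid and recomputing centroids as cluster means. The starting centroids are fixed by hand so the result is reproducible, and the data points are invented.

```python
def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
        for p in points
    ]

def kmeans(points, centroids, max_iter=100):
    """Iterate assignment and mean updates until the centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            )
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels)
```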
Bruker biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry system for identification of Nocardia, Rhodococcus, Kocuria, Gordonia, Tsukamurella, and Listeria species.

PubMed

Hsueh, Po-Ren; Lee, Tai-Fen; Du, Shin-Hei; Teng, Shih-Hua; Liao, Chun-Hsing; Sheng, Wang-Hui; Teng, Lee-Jene

2014-07-01

We evaluated whether the Bruker Biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) system provides accurate species-level identifications of 147 isolates of aerobically growing Gram-positive rods (GPRs). The bacterial isolates included Nocardia (n = 74), Listeria (n = 39), Kocuria (n = 15), Rhodococcus (n = 10), Gordonia (n = 7), and Tsukamurella (n = 2) species, which had all been identified by conventional methods, molecular methods, or both. In total, 89.7% of Listeria monocytogenes, 80% of Rhodococcus species, 26.7% of Kocuria species, and 14.9% of Nocardia species (n = 11, all N. nova and N. otitidiscaviarum) were correctly identified to the species level (score values, ≥ 2.0). A clustering analysis of spectra generated by the Bruker Biotyper identified six clusters of Nocardia species, i.e., cluster 1 (N. cyriacigeorgica), cluster 2 (N. brasiliensis), cluster 3 (N. farcinica), cluster 4 (N. puris), cluster 5 (N. asiatica), and cluster 6 (N. beijingensis), based on the six peaks generated by ClinProTools with the genetic algorithm, i.e., m/z 2,774.477 (cluster 1), m/z 5,389.792 (cluster 2), m/z 6,505.720 (cluster 3), m/z 5,428.795 (cluster 4), m/z 6,525.326 (cluster 5), and m/z 16,085.216 (cluster 6). Two clusters of L. monocytogenes spectra were also found according to the five peaks, i.e., m/z 5,594.85, m/z 6,184.39, and m/z 11,187.31, for cluster 1 (serotype 1/2a) and m/z 5,601.21 and m/z 11,199.33 for cluster 2 (serotypes 1/2b and 4b). The Bruker Biotyper system was unable to accurately identify Nocardia (except for N. nova and N. otitidiscaviarum), Tsukamurella, or Gordonia species. Continuous expansion of the MALDI-TOF MS databases to include more GPRs is necessary. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Mass Profile Decomposition of the Frontier Fields Cluster MACS J0416-2403: Insights on the Dark-matter Inner Profile

NASA Astrophysics Data System (ADS)

Annunziatella, M.; Bonamigo, M.; Grillo, C.; Mercurio, A.; Rosati, P.; Caminha, G.; Biviano, A.; Girardi, M.; Gobat, R.; Lombardi, M.; Munari, E.

2017-12-01

We present a high-resolution dissection of the two-dimensional total mass distribution in the core of the Hubble Frontier Fields galaxy cluster MACS J0416.1-2403, at z = 0.396. We exploit HST/WFC3 near-IR (F160W) imaging, VLT/Multi Unit Spectroscopic Explorer spectroscopy, and Chandra data to separate the stellar, hot gas, and dark-matter mass components in the inner 300 kpc of the cluster. We combine the recent results of our refined strong lensing analysis, which includes the contribution of the intracluster gas, with the modeling of the surface brightness and stellar mass distributions of 193 cluster members, of which 144 are spectroscopically confirmed. We find that, moving from 10 to 300 kpc from the cluster center, the stellar to total mass fraction decreases from 12% to 1% and the hot gas to total mass fraction increases from 3% to 9%, resulting in a baryon fraction of approximately 10% at the outermost radius. We measure that the stellar component represents ∼30%, near the cluster center, and 15%, at larger clustercentric distances, of the total mass in the cluster substructures. We subtract the baryonic mass component from the total mass distribution and conclude that within 30 kpc (∼3 times the effective radius of the brightest cluster galaxy) from the cluster center the surface mass density profiles of the total mass and global (cluster plus substructures) dark-matter are steeper and that of the diffuse (cluster) dark-matter is shallower than an NFW profile. Our current analysis does not point to a significant offset between the cluster stellar and dark-matter components. This detailed and robust reconstruction of the inner dark-matter distribution in a larger sample of galaxy clusters will set a new benchmark for different structure formation scenarios.
A Review of Subsequence Time Series Clustering

PubMed Central

Zolhavarieh, Seyedjamal; Aghabozorgi, Saeed; Teh, Ying Wah

2014-01-01

Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies. PMID:25140332
Spatial-temporal clustering of tornadoes

NASA Astrophysics Data System (ADS)

Malamud, Bruce D.; Turcotte, Donald L.; Brooks, Harold E.

2016-12-01

The standard measure of the intensity of a tornado is the Enhanced Fujita scale, which is based qualitatively on the damage caused by a tornado. An alternative measure of tornado intensity is the tornado path length, L. Here we examine the spatial-temporal clustering of severe tornadoes, which we define as having path lengths L ≥ 10 km. Of particular concern are tornado outbreaks, when a large number of severe tornadoes occur in a day in a restricted region. We apply a spatial-temporal clustering analysis developed for earthquakes. We take all pairs of severe tornadoes in observed and modelled outbreaks, and for each pair plot the spatial lag (distance between touchdown points) against the temporal lag (time between touchdown points). We apply our spatial-temporal lag methodology to the intense tornado outbreaks in the central United States on 26 and 27 April 2011, which resulted in over 300 fatalities and produced 109 severe (L ≥ 10 km) tornadoes. The patterns of spatial-temporal lag correlations that we obtain for the 2 days are strikingly different. On 26 April 2011, there were 45 severe tornadoes and our clustering analysis is dominated by a complex sequence of linear features. We associate the linear patterns with the tornadoes generated in either a single cell thunderstorm or a closely spaced cluster of single cell thunderstorms moving at a near-constant velocity. Our study of a derecho tornado outbreak of six severe tornadoes on 4 April 2011 along with modelled outbreak scenarios confirms this association. On 27 April 2011, there were 64 severe tornadoes and our clustering analysis is predominantly random with virtually no embedded linear patterns. We associate this pattern with a large number of interacting supercell thunderstorms generating tornadoes randomly in space and time. In order to better understand these associations, we also applied our approach to the Great Plains tornado outbreak of 3 May 1999. Careful studies by others have associated individual tornadoes with specified supercell thunderstorms. Our analysis of the 3 May 1999 tornado outbreak directly associated linear features in the largely random spatial-temporal analysis with several supercell thunderstorms, which we then confirmed using model scenarios of synthetic tornado outbreaks. We suggest that it may be possible to develop a semi-automated modelling of tornado touchdowns to match the type of observations made on the 3 May 1999 outbreak.
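The pairwise lag construction in this abstract can be sketched as follows. For every pair of touchdowns, the sketch computes the great-circle distance between them and the time between them. The haversine distance, the `(time, lat, lon)` record layout and the synthetic touchdowns are assumptions made for illustration. Touchdowns laid down by a storm moving at constant velocity give a constant ratio of spatial to temporal lag, which is what appears as a linear feature in a lag plot.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def lag_pairs(touchdowns):
    """touchdowns: list of (time_h, lat, lon). Returns (temporal lag, spatial lag) per pair."""
    return [
        (abs(t2 - t1), haversine_km(la1, lo1, la2, lo2))
        for (t1, la1, lo1), (t2, la2, lo2) in combinations(touchdowns, 2)
    ]

# Synthetic storm moving east along the equator at constant speed (illustrative only).
touchdowns = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (2.0, 0.0, 1.0)]
pairs = lag_pairs(touchdowns)
speeds = [dist / dt for dt, dist in pairs]
print(pairs)
print(speeds)
```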
Development of an automated energy audit protocol for office buildings

NASA Astrophysics Data System (ADS)

Deb, Chirag

This study aims to enhance the building energy audit process and to reduce the time and cost required to conduct a full physical audit. To this end, 5 Energy Service Companies in Singapore collaborated and provided energy audit reports for 62 office buildings. Several statistical techniques are adopted to analyse these reports, comprising cluster analysis and the development of models to predict energy savings for buildings. The cluster analysis shows that there are 3 clusters of buildings experiencing different levels of energy savings. To understand the effect of building variables on the change in energy use intensity (EUI), a robust iterative process for selecting the appropriate variables is developed. The results show that 4 variables should be used for clustering: gross floor area (GFA), non-air-conditioning energy consumption, average chiller plant efficiency, and installed chiller capacity. This analysis is extended to the development of prediction models using linear regression and artificial neural networks (ANN). An exhaustive variable selection algorithm is developed to select the input variables for the two energy saving prediction models. The results show that the ANN prediction model can predict the energy saving potential of a given building with an accuracy of +/-14.8%.
Statistical design and analysis plan for an impact evaluation of an HIV treatment and prevention intervention for female sex workers in Zimbabwe: a study protocol for a cluster randomised controlled trial.

PubMed

Hargreaves, James R; Fearon, Elizabeth; Davey, Calum; Phillips, Andrew; Cambiano, Valentina; Cowan, Frances M

2016-01-05

Pragmatic cluster-randomised trials should seek to make unbiased estimates of effect and be reported according to CONSORT principles, and the study population should be representative of the target population. This is challenging when conducting trials amongst 'hidden' populations without a sample frame. We describe a pair-matched cluster-randomised trial of a combination HIV-prevention intervention to reduce the proportion of female sex workers (FSW) with a detectable HIV viral load in Zimbabwe, recruiting via respondent driven sampling (RDS). We will cross-sectionally survey approximately 200 FSW at baseline and at endline to characterise each of 14 sites. RDS is a variant of chain referral sampling and has been adapted to approximate random sampling. Primary analysis will use the 'RDS-2' method to estimate cluster summaries and will adapt Hayes and Moulton's '2-step' method to adjust effect estimates for individual-level confounders and further adjust for cluster baseline prevalence. We will adapt CONSORT to accommodate RDS. In the absence of observable refusal rates, we will compare the recruitment process between matched pairs. We will need to investigate whether cluster-specific recruitment or the intervention itself affects the accuracy of the RDS estimation process, potentially causing differential biases. To do this, we will calculate RDS-diagnostic statistics for each cluster at each time point and compare these statistics within matched pairs and time points. Sensitivity analyses will assess the impact of potential biases arising from assumptions made by the RDS-2 estimation. We are not aware of any other completed pragmatic cluster RCTs that are recruiting participants using RDS. Our statistical design and analysis approach seeks to transparently document participant recruitment and allow an assessment of the representativeness of the study to the target population, a key aspect of pragmatic trials. The challenges we have faced in the design of this trial are likely to be shared in other contexts aiming to serve the needs of legally and/or socially marginalised populations for which no sampling frame exists and especially when the social networks of participants are both the target of intervention and the means of recruitment. The trial was registered at Pan African Clinical Trials Registry (PACTR201312000722390) on 9 December 2013.
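A cluster summary of the kind this protocol describes can be sketched with an inverse-degree-weighted proportion. The sketch assumes that 'RDS-2' refers to the Volz-Heckathorn estimator, in which each respondent is weighted by the reciprocal of her reported network size. The function name and the toy data are hypothetical, and the protocol's adjustments for confounders and baseline prevalence are not shown.

```python
def rds2_proportion(outcomes, degrees):
    """Inverse-degree-weighted proportion (assumed Volz-Heckathorn RDS-II form).

    outcomes: 1 if the respondent has the outcome, else 0.
    degrees: reported personal network size of each respondent (must be > 0).
    """
    if len(outcomes) != len(degrees):
        raise ValueError("outcomes and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("network sizes must be positive")
    weights = [1.0 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)

# Toy site: respondents with larger networks are down-weighted (illustrative only).
outcomes = [1, 1, 0, 0]
degrees = [10, 10, 2, 5]
naive = sum(outcomes) / len(outcomes)
weighted = rds2_proportion(outcomes, degrees)
print(naive, round(weighted, 4))
```

Here the unweighted proportion is 0.5, while the weighted estimate is lower because the two respondents with the outcome report the largest networks and are therefore more likely to have been recruited.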
Recent TB transmission, clustering and predictors of large clusters in London, 2010–2012: results from first 3 years of universal MIRU-VNTR strain typing

PubMed Central

Hamblion, Esther L; Le Menach, Arnaud; Anderson, Laura F; Lalor, Maeve K; Brown, Tim; Abubakar, Ibrahim; Anderson, Charlotte; Maguire, Helen; Anderson, Sarah R

2016-01-01

Background: The incidence of TB has doubled in the last 20 years in London. A better understanding of risk groups for recent transmission is required to effectively target interventions. We investigated the molecular epidemiological characteristics of TB cases to estimate the proportion of cases due to recent transmission, and identify predictors for belonging to a cluster. Methods: The study population included all culture-positive TB cases in London residents, notified between January 2010 and December 2012, strain typed using 24-loci multiple interspersed repetitive units-variable number tandem repeats. Multivariable logistic regression analysis was performed to assess the risk factors for clustering using sociodemographic and clinical characteristics of cases and for cluster size based on the characteristics of the first two cases. Results: There were 10 147 cases of which 5728 (57%) were culture confirmed and 4790 isolates (84%) were typed. 2194 (46%) were clustered in 570 clusters, and the estimated proportion attributable to recent transmission was 34%. Clustered cases were more likely to be UK born, have pulmonary TB, a previous diagnosis, a history of substance abuse or alcohol abuse and imprisonment, be of white, Indian, black-African or Caribbean ethnicity. The time between notification of the first two cases was more likely to be <90 days in large clusters. Conclusions: Up to a third of TB cases in London may be due to recent transmission. Resources should be directed to the timely investigation of clusters involving cases with risk factors, particularly those with a short period between the first two cases, to interrupt onward transmission of TB. PMID:27417280
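The 34% figure is consistent with the commonly used "n minus one" calculation, in which one case per cluster is treated as the source and the rest as recent transmission. The abstract does not name the method, so its use here is an assumption; the counts are those reported above.

```python
def recent_transmission_share(clustered_cases, clusters, typed_cases):
    """'n minus one' estimate: one case per cluster is assumed to be the source."""
    return (clustered_cases - clusters) / typed_cases

typed = 4790       # isolates strain typed
clustered = 2194   # cases belonging to any cluster
n_clusters = 570

clustered_share = clustered / typed
recent_share = recent_transmission_share(clustered, n_clusters, typed)
print(f"clustered: {clustered_share:.1%}, recent transmission: {recent_share:.1%}")
```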
Highly dynamically evolved intermediate-age open clusters

NASA Astrophysics Data System (ADS)

Piatti, Andrés E.; Dias, Wilton S.; Sampedro, Laura M.

2017-04-01

We present a comprehensive UBVRI and Washington CT1T2 photometric analysis of seven catalogued open clusters, namely: Ruprecht 3, 9, 37, 74, 150, ESO 324-15 and 436-2. The multiband photometric data sets, in combination with 2MASS photometry and Gaia astrometry for the brighter stars, were used to estimate their structural parameters and fundamental astrophysical properties. We found that Ruprecht 3 and ESO 436-2 do not show self-consistent evidence of being physical systems. The remaining objects are open clusters of intermediate age (9.0 ≤ log(t/yr) ≤ 9.6), of relatively small size (r_cls ∼ 0.4-1.3 pc) and located between 0.6 and 2.9 kpc from the Sun. We analysed the relationships between core, half-mass, tidal and Jacobi radii, as well as half-mass relaxation times, and conclude that the studied clusters are in an evolved dynamical stage. The total cluster masses, obtained by summing those of the observed cluster stars, turned out to be ∼10-15 per cent of the masses of open clusters of similar age located closer than 2 kpc from the Sun. We found that cluster stars occupy volumes as large as those for tidally filled clusters.
Fascioliasis risk factors and space-time clusters in domestic ruminants in Bangladesh.

PubMed

Rahman, A K M Anisur; Islam, S K Shaheenur; Talukder, Md Hasanuzzaman; Hassan, Md Kumrul; Dhand, Navneet K; Ward, Michael P

2017-05-08

A retrospective observational study was conducted to identify fascioliasis hotspots, clusters, potential risk factors and to map fascioliasis risk in domestic ruminants in Bangladesh. Cases of fascioliasis in cattle, buffalo, sheep and goats from all districts in Bangladesh between 2011 and 2013 were identified via secondary surveillance data from the Department of Livestock Services' Epidemiology Unit. From each case report, date of report, species affected and district data were extracted. The total number of domestic ruminants in each district was used to calculate fascioliasis cases per ten thousand animals at risk per district, and this was used for cluster and hotspot analysis. Clustering was assessed with Moran's spatial autocorrelation statistic, hotspots with the local indicator of spatial association (LISA) statistic and space-time clusters with the scan statistic (Poisson model). The association between district fascioliasis prevalence and climate (temperature, precipitation), elevation, land cover and water bodies was investigated using a spatial regression model. A total of 1,723,971 cases of fascioliasis were reported in the three-year study period in cattle (1,164,560), goats (424,314), buffalo (88,924) and sheep (46,173). A total of nine hotspots were identified; one of these persisted in each of the three years. Only two local clusters were found. Five space-time clusters located within 22 districts were also identified. Annual risk maps of fascioliasis cases correlated with the hotspots and clusters detected. Cultivated and managed (P < 0.001) and artificial surface (P = 0.04) land cover areas, and elevation (P = 0.003) were positively and negatively associated with fascioliasis in Bangladesh, respectively. Results indicate that due to land use characteristics some areas of Bangladesh are at greater risk of fascioliasis. The potential risk factors, hot spots and clusters identified in this study can be used to guide science-based treatment and control decisions for fascioliasis in Bangladesh and in other similar geo-climatic zones throughout the world.
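Moran's spatial autocorrelation statistic, used in this study to assess clustering, can be sketched for district-level rates as follows. Binary adjacency weights, the four-district layout and the rate values are assumptions for illustration only; the study's own weighting scheme is not given in the abstract.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary adjacency weights.

    values: one rate per district (e.g. cases per ten thousand animals).
    neighbours: dict mapping district index -> list of adjacent district indices.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations over all adjacent (i, j) pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    weight_total = sum(len(js) for js in neighbours.values())
    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (cross / variance_sum)

# Four hypothetical districts in a row: 0 - 1 - 2 - 3.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
gradient = morans_i([1.0, 2.0, 3.0, 4.0], neighbours)     # similar values adjacent
alternating = morans_i([1.0, 4.0, 1.0, 4.0], neighbours)  # dissimilar values adjacent
print(round(gradient, 4), round(alternating, 4))
```

Positive values indicate that neighbouring districts tend to have similar rates and negative values that they tend to differ. In practice, significance is usually judged against a permutation distribution, which is not shown here.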
Cluster analysis of phytoplankton data collected from the National Stream Quality Accounting Network in the Tennessee River basin, 1974-81

USGS Publications Warehouse

Stephens, D.W.; Wangsgard, J.B.

1988-01-01

A computer program, Numerical Taxonomy System of Multivariate Statistical Programs (NTSYS), was used with interfacing software to perform cluster analyses of phytoplankton data stored in the biological files of the U.S. Geological Survey. The NTSYS software performs various types of statistical analyses and is capable of handling a large matrix of data. Cluster analyses were done on phytoplankton data collected from 1974 to 1981 at four National Stream Quality Accounting Network stations in the Tennessee River basin. Analysis of the changes in clusters of phytoplankton genera indicated possible changes in the water quality of the French Broad River near Knoxville, Tennessee. At this station, the most common diatom groups indicated a shift in dominant forms with some of the less common diatoms being replaced by green and blue-green algae. There was a reduction in genera variability between 1974-77 and 1979-81 sampling periods. Statistical analysis of chloride and dissolved solids confirmed that concentrations of these substances were smaller in 1974-77 than in 1979-81. At Pickwick Landing Dam, the furthest downstream station used in the study, there was an increase in the number of genera of 'rare' organisms with time. The appearance of two groups of green and blue-green algae indicated that an increase in temperature or nutrient concentrations occurred from 1974 to 1981, but this could not be confirmed using available water quality data. Associations of genera forming the phytoplankton communities at three stations on the Tennessee River were found to be seasonal. Nodal analysis of combined data from all four stations used in the study did not identify any seasonal or temporal patterns during 1974-81. Cluster analysis using the NTSYS programs was effective in reducing the large phytoplankton data set to a manageable size and provided considerable insight into the structure of phytoplankton communities in the Tennessee River basin. Problems encountered using cluster analysis were the subjectivity introduced in the definition of meaningful clusters, and the lack of taxonomic identification to the species level. (Author's abstract)
Differentiation of Recurrent Glioblastoma from Delayed Radiation Necrosis by Using Voxel-based Multiparametric Analysis of MR Imaging Data.

PubMed

Yoon, Ra Gyoung; Kim, Ho Sung; Koh, Myeong Ju; Shim, Woo Hyun; Jung, Seung Chai; Kim, Sang Joon; Kim, Jeong Hoon

2017-10-01

Purpose: To assess a volume-weighted voxel-based multiparametric (MP) clustering method as an imaging biomarker to differentiate recurrent glioblastoma from delayed radiation necrosis. Materials and Methods: The institutional review board approved this retrospective study and waived the informed consent requirement. Seventy-five patients with pathologic analysis-confirmed recurrent glioblastoma (n = 42) or radiation necrosis (n = 33) who presented with enlarged contrast material-enhanced lesions at magnetic resonance (MR) imaging after they completed concurrent chemotherapy and radiation therapy were enrolled. The diagnostic performance of the total MP cluster score was determined by using the area under the receiver operating characteristic curve (AUC) with cross-validation and compared with those of single parameter measurements (10% histogram cutoffs of apparent diffusion coefficient [ADC10] or 90% histogram cutoffs of normalized cerebral blood volume and initial time-signal intensity AUC). Results: Receiver operating characteristic curve analysis showed that the AUC for differentiating recurrent glioblastoma from delayed radiation necrosis was highest for the total MP cluster score and lowest for ADC10 for both readers. The total MP cluster score had significantly better diagnostic accuracy than any single parameter (corrected P = .001-.039 for reader 1; corrected P = .005-.041 for reader 2). The total MP cluster score was the best predictor of recurrent glioblastoma (cross-validated AUCs, 0.942-0.946 for both readers), with a sensitivity of 95.2% for reader 1 and 97.6% for reader 2. Conclusion: Quantitative analysis with volume-weighted voxel-based MP clustering appears to be superior to the use of single imaging parameters to differentiate recurrent glioblastoma from delayed radiation necrosis. © RSNA, 2017. Online supplemental material is available for this article.
The Impact of Multilocus Variable-Number Tandem-Repeat Analysis on PulseNet Canada Escherichia coli O157:H7 Laboratory Surveillance and Outbreak Support, 2008-2012.

PubMed

Rumore, Jillian Leigh; Tschetter, Lorelee; Nadon, Celine

2016-05-01

The lack of pattern diversity among pulsed-field gel electrophoresis (PFGE) profiles for Escherichia coli O157:H7 in Canada does not consistently provide optimal discrimination, and therefore, differentiating temporally and/or geographically associated sporadic cases from potential outbreak cases can at times impede investigations. To address this limitation, DNA sequence-based methods such as multilocus variable-number tandem-repeat analysis (MLVA) have been explored. To assess the performance of MLVA as a supplemental method to PFGE from the Canadian perspective, a retrospective analysis of all E. coli O157:H7 isolated in Canada from January 2008 to December 2012 (inclusive) was conducted. A total of 2285 E. coli O157:H7 isolates and 63 clusters of cases (by PFGE) were selected for the study. Based on the qualitative analysis, the addition of MLVA improved the categorization of cases for 60% of clusters and no change was observed for ∼40% of clusters investigated. In such situations, MLVA serves to confirm PFGE results, but may not add further information per se. The findings of this study demonstrate that MLVA data, when used in combination with PFGE-based analyses, provide additional resolution to the detection of clusters lacking PFGE diversity as well as demonstrate good epidemiological concordance. In addition, MLVA is able to identify cluster-associated isolates with variant PFGE pattern combinations that may have been previously missed by PFGE alone. Optimal laboratory surveillance in Canada is achieved with the application of PFGE and MLVA in tandem for routine surveillance, cluster detection, and outbreak response.
Malaria control and prevention towards elimination: data from an eleven-year surveillance in Shandong Province, China.

PubMed

Kong, Xiangli; Liu, Xin; Tu, Hong; Xu, Yan; Niu, Jianbing; Wang, Yongbin; Zhao, Changlei; Kou, Jingxuan; Feng, Jun

2017-01-31

Shandong Province has experienced a declining trend in locally acquired malaria transmission, but increasing imported malaria remains a challenge. Therefore, understanding the epidemiological characteristics of malaria and the control and elimination strategy and interventions is needed for better planning to achieve the overall elimination goal in Shandong Province. A retrospective study was conducted and all individual cases from a web-based reporting system were reviewed and analysed to explore malaria-endemic characteristics in Shandong from 2005 to 2015. Annual malaria incidence data reported in 2005-2015 were geo-coded and matched to the county level. Spatial cluster analysis was performed to evaluate any identified spatial disease clusters for statistical significance. Space-time clusters with high rates were detected through retrospective space-time scanning using the discrete Poisson model. The overall malaria incidence decreased to a low level during 2005-2015. In total, 1564 confirmed malaria cases were reported, 27.1% of which (n = 424) were indigenous cases. Most of the indigenous cases (n = 339, 80.0%) occurred from June to October. However, the number and scale of imported cases increased, with no significant difference observed between months. Shandong is endemic for both Plasmodium vivax (n = 730) and Plasmodium falciparum (n = 674). The disease is mainly distributed in the Southern (n = 710) and Eastern regions (n = 424) of Shandong, such as Jinning (n = 214 [13.7%]), Weihai (n = 151 [9.7%]), and Yantai (n = 107 [6.8%]). Furthermore, the spatial cluster analysis of malaria cases from 2005 to 2015 indicated that the disease was not randomly distributed. For indigenous cases, a total of 15 and 2 high-risk counties were determined from 2005 to 2009 (control phase) and from 2010 to 2015 (elimination phase), respectively. For imported cases, a total of 26 and 29 high-risk counties were determined from 2005 to 2009 (control phase) and from 2010 to 2015 (elimination phase), respectively. The method of spatial scan statistics identified 13 different significant spatial clusters between 2005 and 2015. The space-time clustering analysis determined that the most likely cluster included 14 and 19 counties for indigenous and imported cases, respectively. To meet the requirements of the malaria elimination phase, the surveillance system should be strengthened, particularly in regions with frequent migration, together with effective multisectoral cooperation and coordination mechanisms. Specific response packages should be tailored to different types of cities, and capacity building should also be improved, focusing mainly on emergency response and case management. Funding for scientific research should be maintained during both the elimination and post-elimination phases to consolidate the achievements of malaria elimination.
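As a rough illustration of the discrete Poisson scan statistic mentioned in this abstract, the sketch below computes the log-likelihood ratio commonly used to score one candidate high-rate cluster. It is not the authors' code: the function name and the example counts are hypothetical, and the expected counts are assumed to be scaled so that they sum to the total number of cases.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate high-rate cluster (discrete Poisson model).

    cases_in    -- observed cases inside the candidate window
    expected_in -- expected cases inside the window under the null hypothesis
                   (assumed scaled so that expectations sum to total_cases)
    total_cases -- observed cases in the whole study region and period
    """
    c, e, C = cases_in, expected_in, total_cases
    # Only windows with more cases than expected count as high-rate clusters.
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the window.
    outside = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return inside + outside


# Hypothetical example: 20 observed cases where 10 were expected, out of 100 in total.
score = poisson_scan_llr(20, 10.0, 100)
```

In a full scan this score would be maximized over many candidate windows, and its significance is typically assessed by Monte Carlo replication under the null hypothesis.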
Rapid Disaster Damage Estimation

NASA Astrophysics Data System (ADS)

Vu, T. T.

2012-07-01

The experiences from recent disaster events showed that detailed information derived from high-resolution satellite images could accommodate the requirements from damage analysts and disaster management practitioners. Richer information contained in such high-resolution images, however, increases the complexity of image analysis. As a result, few image analysis solutions can be practically used under time pressure in the context of post-disaster and emergency responses. To fill the gap in employment of remote sensing in disaster response, this research develops a rapid high-resolution satellite mapping solution built upon a dual-scale contextual framework to support damage estimation after a catastrophe. The target objects are buildings (or building blocks) and their condition. On the coarse processing level, statistical region merging is deployed to group pixels into a number of coarse clusters. Based on a majority rule of vegetation index, water and shadow index, it is possible to eliminate the irrelevant clusters. The remaining clusters likely consist of building structures and others. On the fine processing level, within each cluster under consideration, smaller objects are formed using morphological analysis. Numerous indicators including spectral, textural and shape indices are computed to be used in a rule-based object classification. Computation time of raster-based analysis depends strongly on the image size, in other words on the number of processed pixels. Breaking the processing into two levels helps to reduce the number of processed pixels and the redundancy of processing irrelevant information. In addition, it allows a data- and task-based parallel implementation. The performance is demonstrated with QuickBird images of an area of Phanga, Thailand, affected by the 2004 Indian Ocean tsunami. The developed solution will be implemented in different platforms as well as a web processing service for operational uses.
Intersubject synchronisation analysis of brain activity associated with the instant effects of acupuncture: an fMRI study.

PubMed

Jin, Lingmin; Sun, Jinbo; Xu, Ziliang; Yang, Xuejuan; Liu, Peng; Qin, Wei

2018-02-01

To use a promising analytical method, namely intersubject synchronisation (ISS), to evaluate the brain activity associated with the instant effects of acupuncture and compare the findings with traditional general linear model (GLM) methods. 30 healthy volunteers were recruited for this study. Block-designed manual acupuncture stimuli were delivered at SP6, and de qi sensations were measured after acupuncture stimulation. All subjects underwent functional MRI (fMRI) scanning during the acupuncture stimuli. The fMRI data were separately analysed by ISS and traditional GLM methods. All subjects experienced de qi sensations. ISS analysis showed that the regions activated during acupuncture stimulation at SP6 were mainly divided into five clusters based on the time courses. The time courses of clusters 1 and 2 were in line with the acupuncture stimulation pattern, and the active regions were mainly involved in the sensorimotor system and salience network. Clusters 3, 4 and 5 displayed an almost contrary time course relative to the stimulation pattern. The brain regions activated included the default mode network, descending pain modulation pathway and visual cortices. GLM analysis indicated that the brain responses associated with the instant effects of acupuncture were largely implicated in sensory and motor processing and sensory integration. The ISS analysis considered the sustained effect of acupuncture and uncovered additional information not shown by GLM analysis. We suggest that ISS may be a suitable approach to investigate the brain responses associated with the instant effects of acupuncture. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Clustering Analysis of Antibiograms and Antibiogram Types of Streptococcus agalactiae Strains from Tilapia in China.

PubMed

Liu, Chan; Feng, Juan; Zhang, Defeng; Xie, Yundan; Li, Anxing; Wang, Jiangyong; Su, Youlu

2018-05-11

In view of the changing antibiotic-resistance profiles of Streptococcus agalactiae from tilapia in China, antimicrobial susceptibilities of 75 S. agalactiae strains were determined by the disc diffusion method, and cluster analyses of the antibiograms and antibiogram types were performed. All strains displayed multidrug resistance (MDR). The antimicrobial-resistance rates were highest (>90%) to aminoglycosides, sulfonamides, pipemidic acid, and norfloxacin, followed by penicillin, ampicillin, and ciprofloxacin (26.7-38.7%); those to furadantin, lincomycin, erythromycin, ofloxacin, tetracycline, and florfenicol were low (<10%), and no resistance to vancomycin, cefalexin, cefoxitin, amoxicillin, medemycin, doxitard, oxytetracycline, rifampin, chloramphenicol, or thiamphenicol was detected. Statistical analysis showed that the resistance rate to ciprofloxacin increased significantly in 2016 (p = 0.009), whereas that to trimethoprim/sulfamethoxazole decreased (p = 0.017). Cluster analyses identified that the strains had 23 antibiogram types (A-W) and clustered in five groups (Groups I-V). The strains with higher antimicrobial resistance mainly clustered in Groups I and II. Our results show that the antibiograms varied with time and by location and that antibiogram types are constantly updating and expanding. Effective measures must be taken to reduce the antimicrobial resistance and spread of MDR strains.
Density-cluster NMA: A new protein decomposition technique for coarse-grained normal mode analysis.

PubMed

Demerdash, Omar N A; Mitchell, Julie C

2012-07-01

Normal mode analysis has emerged as a useful technique for investigating protein motions on long time scales. This is largely due to the advent of coarse-graining techniques, particularly Hooke's Law-based potentials and the rotational-translational blocking (RTB) method for reducing the size of the force-constant matrix, the Hessian. Here we present a new method for domain decomposition for use in RTB that is based on hierarchical clustering of atomic density gradients, which we call Density-Cluster RTB (DCRTB). The method reduces the number of degrees of freedom by 85-90% compared with the standard blocking approaches. We compared the normal modes from DCRTB against standard RTB using 1-4 residues in sequence in a single block, with good agreement between the two methods. We also show that Density-Cluster RTB and standard RTB perform well in capturing the experimentally determined direction of conformational change. Significantly, we report superior correlation of DCRTB with B-factors compared with 1-4 residue per block RTB. Finally, we show significant reduction in computational cost for Density-Cluster RTB that is nearly 100-fold for many examples. Copyright © 2012 Wiley Periodicals, Inc.

Comparative study of cluster Ag17Cu2 by instantaneous normal mode analysis and by isothermal Brownian-type molecular dynamics simulation.

PubMed

Tang, Ping-Han; Wu, Ten-Ming; Yen, Tsung-Wen; Lai, S K; Hsu, P J

2011-09-07

We perform isothermal Brownian-type molecular dynamics simulations to obtain the velocity autocorrelation function and its time Fourier-transformed power spectral density for the metallic cluster Ag(17)Cu(2). The temperature dependences of these dynamical quantities from T = 0 to 1500 K were examined and across this temperature range the cluster melting temperature T(m), which we define to be the principal maximum position of the specific heat, is determined. The instantaneous normal mode analysis is then used to dissect the cluster dynamics by calculating the vibrational instantaneous normal mode density of states and hence its frequency integrated value I(j), which is an ensemble average of all vibrational projection operators for the jth atom in the cluster. In addition to comparing the results with simulation data, we look more closely at the entities I(j) of all atoms using the point group symmetry and diagnose their temperature variations. We find that I(j) exhibit features that may be used to deduce T(m), which turns out to agree very well with those inferred from the power spectral density and specific heat. © 2011 American Institute of Physics
Late Life Employment Histories and Their Association With Work and Family Formation During Adulthood: A Sequence Analysis Based on ELSA.

PubMed

Wahrendorf, Morten; Zaninotto, Paola; Hoven, Hanno; Head, Jenny; Carr, Ewan

2017-05-31

To extend research on workforce participation beyond age 50 by describing entire employment histories in later life and testing their links to prior life course conditions. We use data from the English Longitudinal Study of Ageing, with retrospective information on employment histories between age 50 and 70 for 1,103 men and 1,195 women (n = 2,298). We apply sequence analysis and group respondents into eight clusters with similar histories. Using multinomial regressions, we then test their links to labor market participation, partnership, and parenthood histories during early (age 20-34) and mid-adulthood (age 35-49). Three clusters include histories dominated by full-time employees but with varying age of retirement (before, at, and after age 60). One cluster is dominated by self-employment with comparatively later retirement. Remaining clusters include part-time work (retirement around age 60 or no retirement), continuous domestic work (mostly women), or other forms of nonemployment. Those who had strong attachments to the labor market during adulthood are more likely to have histories of full-time work up until and beyond age 60, especially men. Parenthood in early adulthood is related to later retirement (for men only). Continued domestic work was not linked to parenthood. Partnered women tend to work part-time or do domestic work. The findings remain consistent after adjusting for birth cohort, childhood adversity, life course health, and occupational position. Policies aimed at increasing the proportion of older workers not only need to address later stages of the life course but also early and mid-adulthood. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America.
Time-Series Monitoring of Open Star Clusters

NASA Astrophysics Data System (ADS)

Hojaev, A. S.; Semakov, D. G.

2006-08-01

Star clusters, especially compact ones (with diameters of a few to ten arcmin), are suitable targets for searching for light variability in an ensemble of stars by means of an ordinary Cassegrain telescope plus CCD system. Special patrolling with short fixed-time exposures and mmag accuracy could also be used to study stellar oscillations for a group of stars simultaneously. The latter can be carried out both separately from one site and within international campaigns. Detection and study of optical variability of X-ray sources, including X-ray binaries with compact objects, might also result from long-term monitoring of such clusters. We present the program of open star cluster monitoring with the Zeiss 1 meter RCC telescope of Maidanak observatory, which has recently been automated. In combination with quite good seeing at this observatory (see, e.g., Sarazin, M. 1999, URL http://www.eso.org/gen-fac/pubs/astclim/) the automatic telescope equipped with the available large-format (2K×2K) CCD camera AP-10 will allow homogeneous time series to be collected for analysis. We started this program in 2001 and obtained a set of patrol observations with the Zeiss 0.6 meter telescope and AP-10 camera in 2003. Seven compact open clusters in the Milky Way (NGC 7801, King1, King 13, King18, King20, Berkeley 55, IC 4996) have been monitored for stellar variability and some results of photometry will be presented. A few interesting variables were discovered and dozens were suspected of variability in these clusters for the first time. We have made steps to join the Whole-Earth Telescope effort in its future campaigns.
Is the non-isothermal double β-model incompatible with no time evolution of galaxy cluster gas mass fraction?

NASA Astrophysics Data System (ADS)

Holanda, R. F. L.

2018-05-01

In this paper, we propose a new method to obtain the depletion factor γ(z), the ratio by which the measured baryon fraction in galaxy clusters is depleted with respect to the universal mean. We use exclusively galaxy cluster data, namely, X-ray gas mass fraction (fgas) and angular diameter distance measurements from Sunyaev-Zel'dovich effect plus X-ray observations. The galaxy clusters are the same in both data sets and the non-isothermal spherical double β-model was used to describe their electron density and temperature profiles. In order to compare our results with those from recent cosmological hydrodynamical simulations, we suppose a possible time evolution for γ(z), such as γ(z) = γ0(1 + γ1 z). As main conclusions, we found that the γ0 value is in full agreement with the simulations. On the other hand, although the γ1 value found in our analysis is compatible with γ1 = 0 within 2σ c.l., our results show a non-negligible time evolution for the depletion factor, unlike the results of the simulations. However, we also put constraints on γ(z) by using the fgas measurements and angular diameter distances obtained from the flat ΛCDM model (Planck results) and from a sample of galaxy clusters described by an elliptical profile. For these cases no significant time evolution for γ(z) was found. Then, if a constant depletion factor is an inherent characteristic of these structures, our results show that the spherical double β-model used to describe the galaxy clusters considered does not affect the quality of their fgas measurements.
The Relationship of Dynamical Heterogeneity to the Adam-Gibbs and Random First-Order Transition Theories of Glass Formation

NASA Astrophysics Data System (ADS)

Starr, Francis; Douglas, Jack; Sastry, Srikanth

2013-03-01

We examine measures of dynamical heterogeneity for a bead-spring polymer melt and test how these scales compare with the scales hypothesized by the Adam and Gibbs (AG) and random first-order transition (RFOT) theories. We show that the time scale of the high-mobility clusters and strings is associated with a diffusive time scale, while the low-mobility particles' time scale relates to a structural relaxation time. The difference of the characteristic times naturally explains the decoupling of diffusion and structural relaxation time scales. We examine the appropriateness of identifying the size scales of mobile particle clusters or strings with the size of cooperatively rearranging regions (CRR) in the AG and RFOT theories. We find that the string size appears to be the most consistent measure of CRR for both the AG and RFOT models. Identifying strings or clusters with the "mosaic" length of the RFOT model relaxes the conventional assumption that the "entropic droplets" are compact. We also confirm the validity of the entropy formulation of the AG theory, constraining the exponent values of the RFOT theory. This constraint, together with the analysis of size scales, enables us to estimate the characteristic exponents of RFOT.
Autonomic specificity of basic emotions: evidence from pattern classification and cluster analysis.

PubMed

Stephens, Chad L; Christie, Israel C; Friedman, Bruce H

2010-07-01

Autonomic nervous system (ANS) specificity of emotion remains controversial in contemporary emotion research, and has received mixed support over decades of investigation. This study was designed to replicate and extend psychophysiological research, which has used multivariate pattern classification analysis (PCA) in support of ANS specificity. Forty-nine undergraduates (27 women) listened to emotion-inducing music and viewed affective films while a montage of ANS variables, including heart rate variability indices, peripheral vascular activity, systolic time intervals, and electrodermal activity, were recorded. Evidence for ANS discrimination of emotion was found via PCA with 44.6% of overall observations correctly classified into the predicted emotion conditions, using ANS variables (z=16.05, p<.001). Cluster analysis of these data indicated a lack of distinct clusters, which suggests that ANS responses to the stimuli were nomothetic and stimulus-specific rather than idiosyncratic and individual-specific. Collectively these results further confirm and extend support for the notion that basic emotions have distinct ANS signatures. Copyright © 2010 Elsevier B.V. All rights reserved.
Analysis of Transcriptional Regulation of the Human miR-17-92 Cluster; Evidence for Involvement of Pim-1

PubMed Central

Thomas, Maren; Lange-Grünweller, Kerstin; Hartmann, Dorothee; Golde, Lara; Schlereth, Julia; Streng, Dennis; Aigner, Achim; Grünweller, Arnold; Hartmann, Roland K.

2013-01-01

The human polycistronic miRNA cluster miR-17-92 is frequently overexpressed in hematopoietic malignancies and cancers. Its transcription is in part controlled by an E2F-regulated host gene promoter. An intronic A/T-rich region directly upstream of the miRNA coding region also contributes to cluster expression. Our deletion analysis of the A/T-rich region revealed a strong dependence on c-Myc binding to the functional E3 site. Yet, constructs lacking the 5′-proximal ~1.3 kb or 3′-distal ~0.1 kb of the 1.5 kb A/T-rich region still retained residual specific promoter activity, suggesting multiple transcription start sites (TSS) in this region. Furthermore, the protooncogenic kinase, Pim-1, its phosphorylation target HP1γ and c-Myc colocalize to the E3 region, as inferred from chromatin immunoprecipitation. Analysis of pri-miR-17-92 expression levels in K562 and HeLa cells revealed that silencing of E2F3, c-Myc or Pim-1 negatively affects cluster expression, with a synergistic effect caused by c-Myc/Pim-1 double knockdown in HeLa cells. Thus, we show, for the first time, that the protooncogene Pim-1 is part of the network that regulates transcription of the human miR-17-92 cluster. PMID:23749113
SCUD: fast structure clustering of decoys using reference state to remove overall rotation.

PubMed

Li, Hongzhi; Zhou, Yaoqi

2005-08-01

We developed a method for fast decoy clustering by using reference root-mean-squared distance (rRMSD) rather than commonly used pairwise RMSD (pRMSD) values. For 41 proteins with 2000 decoys each, the computing efficiency increases nine times without a significant change in the accuracy of near-native selections. Tests on additional protein decoys based on different reference conformations confirmed this result. Further analysis indicates that the pRMSD and rRMSD values are highly correlated (with an average correlation coefficient of 0.82) and the clusters obtained from pRMSD and rRMSD values are highly similar (the representative structures of the top five largest clusters from the two methods are 74% identical). SCUD (Structure ClUstering of Decoys) with an automatic cutoff value is available at http://theory.med.buffalo.edu. (c) 2005 Wiley Periodicals, Inc.
The interaction between atomic displacement cascades and tilt symmetrical grain boundaries in α-zirconium

NASA Astrophysics Data System (ADS)

Kapustin, P.; Svetukhin, V.; Tikhonchev, M.

2017-06-01

The atomic displacement cascade simulations near symmetric tilt grain boundaries (GBs) in hexagonal close packed-Zirconium were considered in this paper. Further defect structure analysis was conducted. Four symmetrical tilt GBs -∑14?, ∑14? with the axis of rotation [0 0 0 1] and ∑32?, ∑32? with the axis of rotation ? - were considered. The molecular dynamics method was used for atomic displacement cascades' simulation. A tendency of the point defects produced in the cascade to accumulate near the GB plane, which was an obstacle to the spread of the cascade, was discovered. The results of the point defects' clustering produced in the cascade were obtained. The clusters of both types were represented mainly by single point defects. At the same time, vacancies formed clusters of a large size (more than 20 vacancies per cluster), while self-interstitial atom clusters were small-sized.
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance

PubMed Central

Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao

2018-01-01

Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy. PMID:29795600
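The DTW distance at the core of this abstract can be sketched with the classic dynamic-programming recurrence. This is a minimal, unconstrained illustration, not the authors' incremental fuzzy clustering algorithm; the function name and the absolute-difference local cost are assumptions made for the example.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences.

    Runs in O(len(a) * len(b)) time with no warping-window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A time-stretched copy of a series aligns at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Because one element may be matched to several elements of the other series, two series that differ only by local stretching or compression receive a small distance, which is the property the abstract relies on.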
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance.

PubMed

Liu, Yongli; Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao

2018-01-01

Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy.
Technical structure of the global nanoscience and nanotechnology literature

NASA Astrophysics Data System (ADS)

Kostoff, Ronald N.; Koytcheff, Raymond G.; Lau, Clifford G. Y.

2007-10-01

Text mining was used to extract technical intelligence from the open source global nanotechnology and nanoscience research literature. An extensive nanotechnology/nanoscience-focused query was applied to the Science Citation Index/Social Science Citation Index (SCI/SSCI) databases. The nanotechnology/nanoscience research literature technical structure (taxonomy) was obtained using computational linguistics/document clustering and factor analysis. The infrastructure (prolific authors, key journals/institutions/countries, most cited authors/journals/documents) for each of the clusters generated by the document clustering algorithm was obtained using bibliometrics. Another novel addition was the use of phrase auto-correlation maps to show technical thrust areas based on phrase co-occurrence in Abstracts, and the use of phrase-phrase cross-correlation maps to show technical thrust areas based on phrase relations due to the sharing of common co-occurring phrases. The ˜400 most cited nanotechnology papers since 1991 were grouped, and their characteristics generated. Whereas the main analysis provided technical thrusts of all nanotechnology papers retrieved, analysis of the most cited papers allowed their characteristics to be displayed. Finally, most cited papers from selected time periods were extracted, along with all publications from those time periods, and the institutions and countries were compared based on their representation in the most cited documents list relative to their representation in the most publications list.
Super massive black hole in galactic nuclei with tidal disruption of stars

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhong, Shiyan; Berczik, Peter; Spurzem, Rainer

Tidal disruption of stars by super massive central black holes from dense star clusters is modeled by high-accuracy direct N-body simulation. The time evolution of the stellar tidal disruption rate, the effect of tidal disruption on the stellar density profile, and, for the first time, the detailed origin of tidally disrupted stars are carefully examined and compared with classic papers in the field. Up to 128k particles are used in simulation to model the star cluster around a super massive black hole, and we use the particle number and the tidal radius of the black hole as free parameters for a scaling analysis. The transition from full to empty loss-cone is analyzed in our data, and the tidal disruption rate scales with the particle number, N, in the expected way for both cases. For the first time in numerical simulations (under certain conditions) we can support the concept of a critical radius of Frank and Rees, which claims that most stars are tidally accreted on highly eccentric orbits originating from regions far outside the tidal radius. Due to the consumption of stars moving on radial orbits, a velocity anisotropy is found inside the cluster. Finally we estimate the real galactic center based on our simulation results and the scaling analysis.
Super Massive Black Hole in Galactic Nuclei with Tidal Disruption of Stars

NASA Astrophysics Data System (ADS)

Zhong, Shiyan; Berczik, Peter; Spurzem, Rainer

2014-09-01

Tidal disruption of stars by super massive central black holes from dense star clusters is modeled by high-accuracy direct N-body simulation. The time evolution of the stellar tidal disruption rate, the effect of tidal disruption on the stellar density profile, and, for the first time, the detailed origin of tidally disrupted stars are carefully examined and compared with classic papers in the field. Up to 128k particles are used in simulation to model the star cluster around a super massive black hole, and we use the particle number and the tidal radius of the black hole as free parameters for a scaling analysis. The transition from full to empty loss-cone is analyzed in our data, and the tidal disruption rate scales with the particle number, N, in the expected way for both cases. For the first time in numerical simulations (under certain conditions) we can support the concept of a critical radius of Frank & Rees, which claims that most stars are tidally accreted on highly eccentric orbits originating from regions far outside the tidal radius. Due to the consumption of stars moving on radial orbits, a velocity anisotropy is found inside the cluster. Finally we estimate the real galactic center based on our simulation results and the scaling analysis.
Analysis of Multi-Flight Common Routes for Traffic Flow Management

NASA Technical Reports Server (NTRS)

Sheth, Kapil; Clymer, Alexis; Morando, Alex; Shih, Fu-Tai

2016-01-01

This paper presents an approach for creating common weather avoidance reroutes for multiple flights and the associated benefits analysis, which is an extension of the single flight advisories generated using the Dynamic Weather Routes (DWR) concept. These multiple flight advisories are implemented in the National Airspace System (NAS) Constraint Evaluation and Notification Tool (NASCENT), a nation-wide simulation environment to generate time- and fuel-saving alternate routes for flights during severe weather events. These single flight advisories are clustered together in the same Center by considering parameters such as a common return capture fix. The clustering helps propose routes, called Multi-Flight Common Routes (MFCR), that avoid weather and other airspace constraints, and save time and fuel. It is expected that these routes would also provide lower workload for traffic managers and controllers since a common route is found for several flights, and presumably the route clearances would be easier and faster. This study was based on 30 days in each of 2014 and 2015 that had the most delays attributed to convective weather. The results indicate that many opportunities exist where individual flight routes can be clustered to fly along a common route to save a significant amount of time and fuel, and potentially reduce the amount of coordination needed.
Long-term analysis of health status and preventive behavior in music students across an entire university program.

PubMed

Spahn, Claudia; Nusseck, Manfred; Zander, Mark

2014-03-01

The aim of this investigation was to analyze longitudinal data concerning physical and psychological health, playing-related problems, and preventive behavior among music students across their complete 4- to 5-year study period. In a longitudinal, observational study, we followed students during their university training and measured their psychological and physical health status and preventive behavior using standardized questionnaires at four different times. The data were in accordance with previous findings. They demonstrated three groups of health characteristics observed in beginners of music study: healthy students (cluster 1), students with preclinical symptoms (cluster 2), and students who are clinically symptomatic (cluster 3). In total, 64% of all students remained in the same cluster group during their whole university training. About 10% of the students showed considerable health problems and belonged to the third cluster group. The three clusters of health characteristics found in this longitudinal study with music students necessitate that prevention programs for musicians must be adapted to the target audience.
Predicting Educational Outcomes and Psychological Well-Being in Adolescents Using Time Attitude Profiles

ERIC Educational Resources Information Center

Andretta, James R.; Worrell, Frank C.; Mello, Zena R.

2014-01-01

Using cluster analysis of Adolescent Time Attitude Scale (ATAS) scores in a sample of 300 adolescents ("M" age = 16 years; "SD" = 1.25; 60% male; 41% European American; 25.3% Asian American; 11% African American; 10.3% Latino), the authors identified five time attitude profiles based on positive and negative attitudes toward…
Cluster formation in laser-induced ablation and evaporation of solids observed by laser ionization time-of-flight mass spectrometry and scanning tunneling microscopy

NASA Astrophysics Data System (ADS)

Tench, R. J.; Balooch, M.; Bernardez, L.; Allen, Mike J.; Siekhaus, W. J.; Olander, D. R.; Wang, W.

1990-04-01

Laser ionization time-of-flight mass analysis (LIMA) used pulses (5 ns) of a frequency-quadrupled Nd-YAG laser (266 nm) focused onto spots of 4 to 100 microns diameter to ablate material, and a reflectron time-of-flight tube to mass-analyze the plume. The observed mass spectra for Si, Pt, SiC, and UO2 varied in the distribution of ablation products among atoms, molecules and clusters, depending on laser power density and target material. Cleaved surfaces of highly oriented pyrolytic graphite (HOPG) positioned at room temperature either 10 cm away from materials ablated at 10^-5 Torr by 1 to 3 excimer laser (308 nm) pulses of 20 ns duration or 1 m away from materials vaporized at 10^-8 Torr by 10 Nd-Glass laser pulses of 1 ms duration were analyzed by Scanning Tunneling Microscopy (STM) in air with angstrom resolution. Clusters up to 30 Å in diameter were observed.
EXPLORING FUNCTIONAL CONNECTIVITY IN FMRI VIA CLUSTERING.

PubMed

Venkataraman, Archana; Van Dijk, Koene R A; Buckner, Randy L; Golland, Polina

2009-04-01

In this paper we investigate the use of data driven clustering methods for functional connectivity analysis in fMRI. In particular, we consider the K-Means and Spectral Clustering algorithms as alternatives to the commonly used Seed-Based Analysis. To enable clustering of the entire brain volume, we use the Nyström Method to approximate the necessary spectral decompositions. We apply K-Means, Spectral Clustering and Seed-Based Analysis to resting-state fMRI data collected from 45 healthy young adults. Without placing any a priori constraints, both clustering methods yield partitions that are associated with brain systems previously identified via Seed-Based Analysis. Our empirical results suggest that clustering provides a valuable tool for functional connectivity analysis.
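As a rough illustration of the K-Means step described in this record, the sketch below clusters a handful of synthetic "voxel" time courses. It is not the authors' pipeline: the toy signals, the function names (`zscore`, `kmeans`, `voxel`), and the farthest-point initialization are all assumptions made for the example, and Spectral Clustering with the Nyström approximation is not shown. Z-scoring each time course first makes squared Euclidean distance equal to 2n(1 - r), so K-Means then groups voxels by temporal correlation r.

```python
import math

def zscore(ts):
    """Normalize a time course to zero mean and unit (population) variance."""
    n = len(ts)
    mean = sum(ts) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in ts) / n)
    return [(x - mean) / sd for x in ts]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from every center chosen so far.
        centers.append(list(max(points, key=lambda p: min(sq_dist(p, c) for c in centers))))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def voxel(period, i, n_t=40):
    """Hypothetical voxel signal: a shared oscillation plus a small perturbation."""
    return [(1 + i) * math.sin(2 * math.pi * t / period)
            + 0.2 * math.sin(2 * math.pi * t / 3 + i) + 10 * i
            for t in range(n_t)]

# Two synthetic "systems": three voxels share period 10, three share period 8.
series = [voxel(10, i) for i in range(3)] + [voxel(8, i) for i in range(3)]
labels = kmeans([zscore(s) for s in series], k=2)
print(labels)
```

Voxels that share an underlying oscillation end up with the same label regardless of their amplitude or baseline, which is the property that makes correlation-based clustering a plausible alternative to picking a seed region by hand.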
Multi-Spatiotemporal Patterns of Residential Burglary Crimes in Chicago: 2006-2016

NASA Astrophysics Data System (ADS)

Luo, J.

2017-10-01

This research attempts to explore the patterns of burglary crimes at multi-spatiotemporal scales in Chicago between 2006 and 2016. Two spatial scales are investigated: census block and police beat area. At each spatial scale, three temporal scales are integrated to make spatiotemporal slices: an hourly scale with a two-hour time step from 12:00am to the end of the day; a daily scale with a one-day step from Sunday to Saturday within a week; and a monthly scale with a one-month step from January to December. A total of six types of spatiotemporal slices will be created as the base for the analysis. Burglary crimes are aggregated to spatiotemporal slices based on where and when they occurred. For each type of spatiotemporal slice with burglary occurrences integrated, a spatiotemporal neighborhood will be defined and managed in a spatiotemporal matrix. Hot-spot analysis will identify spatiotemporal clusters for each type of spatiotemporal slice. Spatiotemporal trend analysis is conducted to indicate how the clusters shift in space and time. The analysis results will provide helpful information for better-targeted policing and crime prevention policy, such as police patrol scheduling regarding times and places covered.
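The aggregation of incidents into spatiotemporal slices can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the record layout (area identifier plus timestamp), the identifiers `beat_0111` and `beat_0222`, and the function name `slice_keys` are hypothetical. The same function would apply at the census block scale by passing block identifiers instead of beat identifiers.

```python
from collections import Counter
from datetime import datetime

def slice_keys(area_id, when):
    """Return the (temporal scale, area, time bin) keys for one incident."""
    return [
        ("hourly", area_id, when.hour // 2),            # 12 two-hour bins; 0 starts at 12:00am
        ("daily", area_id, (when.weekday() + 1) % 7),   # 0 = Sunday ... 6 = Saturday
        ("monthly", area_id, when.month),               # 1 = January ... 12 = December
    ]

# Hypothetical incidents: (police beat, date and time of occurrence).
incidents = [
    ("beat_0111", datetime(2016, 1, 3, 1, 30)),   # Sunday
    ("beat_0111", datetime(2016, 1, 10, 0, 15)),  # Sunday
    ("beat_0111", datetime(2016, 7, 4, 23, 5)),   # Monday
    ("beat_0222", datetime(2016, 7, 4, 14, 0)),   # Monday
]

# Count incidents per slice across all three temporal scales.
counts = Counter(key for area, when in incidents for key in slice_keys(area, when))
print(counts[("hourly", "beat_0111", 0)])
```

Each incident contributes to exactly one bin per temporal scale, so the resulting counter can feed a hot-spot statistic once a spatiotemporal neighborhood has been defined.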

Calculating the Motion and Direction of Flux Transfer Events with Cluster

NASA Technical Reports Server (NTRS)

Collado-Vega, Yaireska M.; Sibeck, David Gary

2011-01-01

We use multi-point timing analysis to determine the orientation and motion of flux transfer events (FTEs) detected by the four Cluster spacecraft on the high-latitude dayside and flank magnetopause during 2002 and 2003. During these years, the distances between the Cluster spacecraft were greater than 1000 km, providing the tetrahedral configuration needed to select events and determine velocities. Each velocity and location will be examined in detail and compared to the velocities and locations determined by the predictions of the component and antiparallel reconnection models for event formation, orientation, motion, and acceleration for a wide range of spacecraft locations and solar wind conditions.
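For readers unfamiliar with multi-point timing, the sketch below shows the standard four-spacecraft form of the idea under the assumption of a planar structure moving at constant speed V along its unit normal n. For each spacecraft relative to a reference one, (r_a - r_0) · n = V (t_a - t_0), which gives a 3×3 linear system for the vector m = n / V. The function names, positions, and crossing times are illustrative assumptions and are not taken from the paper.

```python
import math

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def timing_analysis(positions, times):
    """Return (speed, unit normal) of a planar structure from 4 crossings.

    positions: four (x, y, z) spacecraft positions; times: crossing times.
    Assumes constant velocity and a non-coplanar (tetrahedral) configuration.
    """
    r0, t0 = positions[0], times[0]
    D = [[p[i] - r0[i] for i in range(3)] for p in positions[1:]]
    dt = [t - t0 for t in times[1:]]
    d = det3(D)
    if d == 0:
        raise ValueError("spacecraft are coplanar; the system is singular")
    m = []
    for col in range(3):  # Cramer's rule for D m = dt
        Dc = [row[:] for row in D]
        for r in range(3):
            Dc[r][col] = dt[r]
        m.append(det3(Dc) / d)
    norm = math.sqrt(sum(x * x for x in m))
    if norm == 0:
        raise ValueError("all crossings are simultaneous; speed is undefined")
    return 1.0 / norm, [x / norm for x in m]

# Illustrative geometry: separations of 1000 km, times in seconds.
positions = [(0, 0, 0), (1000, 0, 0), (0, 1000, 0), (0, 0, 1000)]
times = [0.0, 12.0, 16.0, 0.0]
speed, normal = timing_analysis(positions, times)
print(speed, normal)
```

With these made-up inputs the structure moves at 50 km/s along (0.6, 0.8, 0). The singular-matrix check reflects why a tetrahedral configuration is needed: coplanar spacecraft cannot constrain the component of motion out of their plane.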
Determination of clusters and factors associated with dengue dispersion during the first epidemic related to Dengue virus serotype 4 in Vitória, Brazil

PubMed Central

Herbinger, Karl-Heinz; Cerutti Junior, Crispim; Malta Romano, Camila; de Souza Areias Cabidelle, Aline; Fröschl, Günter

2017-01-01

Dengue occurrence is partially influenced by the immune status of the population. Consequently, the introduction of a new Dengue virus serotype can trigger explosive epidemics in susceptible populations. The determination of clusters in this scenario can help to identify hotspots and understand the disease dispersion regardless of the influence of the population herd immunity. The present study evaluated the pattern and factors associated with dengue dispersion during the first epidemic related to Dengue virus serotype 4 in Vitória, Espírito Santo state, Brazil. Data on 18,861 dengue cases reported in Vitória from September 2012 to June 2013 were included in the study. The analysis of spatial variation in temporal trend was performed to detect clusters that were compared by their respective relative risk, house index, population density, and income in an ecological study. Overall, 11 clusters were detected. The time trend increase of dengue incidence in the overall study population was 636%. The five clusters that showed a lower time trend increase than the overall population presented a higher incidence in the beginning of the epidemic and, compared to the six clusters with higher time trend increase, they presented higher relative risk for their inhabitants to acquire dengue infection (P-value = 0.02) and a lower income (P-value <0.01). House index and population density did not differ between the clusters. Early increase of dengue incidence and higher relative risk for acquiring dengue infection were favored in low-income areas. Preventive actions and improvement of infrastructure in low-income areas should be prioritized in order to diminish the magnitude of dengue dispersion after the introduction of a new serotype. PMID:28388694
A Massive Cluster in its Youth: the Fundamental Plane, Kinematics, and Ages for Cluster Galaxies at z = 1.80 in JKCS 041

NASA Astrophysics Data System (ADS)

Prichard, Laura Jane; Davies, Roger L.; Beifiori, Alessandra; Chan, Jeffrey C. C.; Cappellari, Michele; Houghton, Ryan C. W.; Mendel, Trevor; Bender, Ralf; Galametz, Audrey; Saglia, Roberto P.; Smith, Russell; Stott, John P.; Wilman, David J.; Lewis, Ian J.; Sharples, Ray; Wegner, Michael

2018-01-01

Galaxy clusters are the largest gravitationally bound structures in the Universe, and we know that early type galaxies (ETGs) are more common towards their centers. Clusters of galaxies are increasingly rare at early times, but are essential for understanding the formation of these massive structures and how they alter the fate of their member galaxies. However, long integration times are required to constrain the stellar properties of these distant cluster ETGs. Now with the advent of the multiplexed near-infrared integral field instrument, the K-band Multi-Object Spectrograph (KMOS) on the Very Large Telescope, we can target the ETGs in these valuable high-redshift clusters more efficiently than ever. The KMOS guaranteed observing program, the KMOS Cluster Survey (KCS; P.I.s Bender & Davies), has enabled a study of cluster galaxies in overdensities spanning z=1-2 through absorption-line spectroscopy obtained from 20-hour integrations. We will present spectra for 16 galaxies in the furthest KCS overdensity, JKCS 041, an ETG-rich cluster at z=1.80. We measured seven velocity dispersions from the quiescent galaxy spectra, expanding the sample of like measurements in the literature at or above z=1.80 by more than 40%. Through the analysis of Hubble Space Telescope photometry and deep absorption-line spectroscopy, we were able to construct the highest redshift fundamental plane (FP) within a single system for galaxies in JKCS 041. From the redshift evolution of the FP zero-point, we derived a mean age of the galaxies in this cluster of 1.4 +/- 0.2 Gyrs. We determined relative velocities of the galaxies to study the three-dimensional structure of this overdensity. We noticed from the dynamics of JKCS 041 that a group of galaxies was infalling towards the cluster center. When measuring FP ages for the infalling group, we found these galaxies had significantly younger mean ages (0.3 +/- 0.2 Gyrs) than the other galaxies in the cluster (2.0 +0.3/-0.1 Gyrs). 
Based on the galaxy dynamics, cluster morphology, and galaxy stellar age results, we concluded that JKCS 041 is in formation and consists of two merging groups of galaxies. This could link galaxy ages to large-scale structure for the first time at this redshift.
Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods

PubMed Central

Liang, Xianrui; Ma, Meiling; Su, Weike

2013-01-01

Background: A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Materials and Methods: Ten batches of Hibiscus mutabilis L. leaf samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. Results: The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in the precision, repeatability and stability tests were less than 3%, and the method of fingerprint analysis was validated as suitable for Hibiscus mutabilis L. leaves. Conclusions: Similarity analysis, based on the correlation coefficients between each pair of fingerprints, showed qualitatively that the chemical constituents of the 10 batches of Hibiscus mutabilis L. leaf samples from different locations were highly diverse. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent with the SA result to some extent.
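The combination of similarity analysis and hierarchical clustering described in this record can be sketched in a few lines. The peak-area values, the batch names S1 to S4, and the choice of average linkage on the distance 1 - r are assumptions for illustration only; the abstract does not state which linkage or software the authors used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two fingerprints."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def hca(fingerprints, n_clusters):
    """Average-linkage agglomerative clustering on the distance 1 - r."""
    names = list(fingerprints)
    dist = {(a, b): 1 - pearson(fingerprints[a], fingerprints[b])
            for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Hypothetical relative peak areas for five characteristic peaks per batch.
fingerprints = {
    "S1": [1.00, 0.50, 0.20, 0.80, 0.30],
    "S2": [1.10, 0.45, 0.25, 0.75, 0.30],
    "S3": [0.20, 0.90, 1.00, 0.10, 0.60],
    "S4": [0.25, 1.00, 0.90, 0.15, 0.55],
}
clusters = hca(fingerprints, n_clusters=2)
print(clusters)
```

Recording the order of merges instead of stopping at a fixed number of clusters would give the information needed to draw a dendrogram like the one the abstract mentions.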
ClusterViz: A Cytoscape APP for Cluster Analysis of Biological Network.

PubMed

Wang, Jianxin; Zhong, Jiancheng; Chen, Gang; Li, Min; Wu, Fang-xiang; Pan, Yi

2015-01-01

Cluster analysis of biological networks is one of the most important approaches for identifying functional modules and predicting protein functions. Furthermore, visualization of clustering results is crucial to uncover the structure of biological networks. In this paper, ClusterViz, an APP of Cytoscape 3 for cluster analysis and visualization, has been developed. In order to reduce complexity and enable extensibility, we designed the architecture of ClusterViz based on the framework of the Open Services Gateway Initiative. According to the architecture, the implementation of ClusterViz is partitioned into three modules: the interface of ClusterViz, clustering algorithms, and visualization and export. ClusterViz facilitates the comparison of the results of different algorithms for further related analysis. Three commonly used clustering algorithms, FAG-EC, EAGLE and MCODE, are included in the current version. Because the clustering algorithms module adopts an abstract algorithm interface, more clustering algorithms can be included in the future. To illustrate the usability of ClusterViz, we provided three examples with detailed steps from important scientific articles, which show that our tool has helped several research teams do their research work on the mechanism of biological networks.
Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.

PubMed

Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino

2017-01-10

In this article, we present a clustering method of atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. For these properties, our method stands as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the approach proposed.
Analysis of the Seismicity Preceding Large Earthquakes

NASA Astrophysics Data System (ADS)

Stallone, A.; Marzocchi, W.

2016-12-01

The most common earthquake forecasting models assume that the magnitude of the next earthquake is independent of the past. This feature is probably one of the most severe limitations on the capability to forecast large earthquakes. In this work, we investigate this specific aspect empirically, exploring whether spatial-temporal variations in seismicity encode some information on the magnitude of future earthquakes. For this purpose, and to verify the universality of the findings, we consider seismic catalogs covering quite different space-time-magnitude windows, such as the Alto Tiberina Near Fault Observatory (TABOO) catalogue and the California and Japanese seismic catalogs. Our method is inspired by the statistical methodology proposed by Zaliapin (2013) to distinguish triggered and background earthquakes, using nearest-neighbor clustering analysis in a two-dimensional plane defined by rescaled time and space. In particular, we generalize the metric based on the nearest neighbor to a metric based on k-nearest-neighbors clustering analysis, which allows us to consider the overall space-time-magnitude distribution of the k earthquakes (k-foreshocks) that precede one target event (the mainshock); we then analyze the statistical properties of the clusters identified in this rescaled space. In essence, the main goal of this study is to verify whether different classes of mainshock magnitudes are characterized by distinctive k-foreshock distributions. The final step is to show how the findings of this work may (or may not) improve the skill of existing earthquake forecasting models.
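The rescaled time and space coordinates mentioned in this record are not spelled out in the abstract. The sketch below uses the form commonly given in the nearest-neighbor literature, stated here as an assumption: for an earlier event i and a later event j, T = t_ij · 10^(-q b m_i), R = r_ij^d_f · 10^(-(1-q) b m_i), and the proximity is η = T · R. The parameter values (b = 1.0, d_f = 1.6, q = 0.5), the toy catalog, and the way the k nearest earlier events are selected are illustrative choices, not the authors' settings.

```python
import math

def eta(parent, child, b=1.0, d_f=1.6, q=0.5):
    """Return (eta, T, R) for an earlier `parent` and a later `child`.

    Events are tuples (time_days, x_km, y_km, magnitude). The proximity is
    infinite when `parent` does not precede `child`.
    """
    dt = child[0] - parent[0]
    if dt <= 0:
        return math.inf, math.inf, math.inf
    r = math.hypot(child[1] - parent[1], child[2] - parent[2])
    m = parent[3]
    T = dt * 10 ** (-q * b * m)                # rescaled time
    R = r ** d_f * 10 ** (-(1 - q) * b * m)    # rescaled distance
    return T * R, T, R

def k_nearest_parents(catalog, target_index, k):
    """Indices of the k earlier events closest to the target in eta."""
    target = catalog[target_index]
    scored = [(eta(ev, target)[0], i) for i, ev in enumerate(catalog) if i != target_index]
    scored = [s for s in scored if math.isfinite(s[0])]
    return [i for _, i in sorted(scored)[:k]]

# Toy catalog; the last event plays the role of the mainshock.
catalog = [
    (0.0, 50.0, 50.0, 3.0),
    (10.0, 100.0, 0.0, 2.5),
    (19.0, 1.0, 0.0, 4.0),
    (19.9, 0.5, 0.5, 2.0),
    (20.0, 0.0, 0.0, 5.5),
]
foreshocks = k_nearest_parents(catalog, target_index=4, k=2)
print(foreshocks)
```

Plotting T against R for the selected events gives the two-dimensional plane in which clusters are examined; comparing those distributions across mainshock magnitude classes is the step the abstract describes.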
Visceral leishmaniasis in the state of Sao Paulo, Brazil: spatial and space-time analysis

PubMed Central

Cardim, Marisa Furtado Mozini; Guirado, Marluci Monteiro; Dibo, Margareth Regina; Chiaravalloti, Francisco

2016-01-01

ABSTRACT OBJECTIVE To perform both space and space-time evaluations of visceral leishmaniasis in humans in the state of Sao Paulo, Brazil. METHODS The population considered in the study comprised autochthonous cases of visceral leishmaniasis and deaths resulting from it in Sao Paulo, between 1999 and 2013. The analysis considered the western region of the state as its studied area. Thematic maps were created to show visceral leishmaniasis dissemination in humans in the municipality. Spatial analysis tools Kernel and Kernel ratio were used to respectively obtain the distribution of cases and deaths and the distribution of incidence and mortality. Scan statistics were used in order to identify spatial and space-time clusters of cases and deaths. RESULTS The visceral leishmaniasis cases in humans, during the studied period, were observed to occur in the western portion of Sao Paulo, and their territorial extension mainly followed the eastbound course of the Marechal Rondon highway. The incidences were characterized as two sequences of concentric ellipses of decreasing intensities. The first and more intense one was found to have its epicenter in the municipality of Castilho (where the Marechal Rondon highway crosses the border of the state of Mato Grosso do Sul) and the second one in Bauru. Mortality was found to have a similar behavior to incidence. The spatial and space-time clusters of cases were observed to coincide with the two areas of highest incidence. Both the space-time clusters identified, even without coinciding in time, were started three years after the human cases were detected and had the same duration, that is, six years. CONCLUSIONS The expansion of visceral leishmaniasis in Sao Paulo has been taking place in an eastbound direction, focusing on the role of highways, especially Marechal Rondon, in this process. The space-time analysis detected the disease occurred in cycles, in different spaces and time periods. 
These findings, if considered, may contribute to the adoption of actions that aim to prevent the disease from spreading throughout the whole territory of São Paulo or at least to reduce its expansion speed. PMID:27533364
XMM-Newton X-ray and HST weak gravitational lensing study of the extremely X-ray luminous galaxy cluster Cl J120958.9+495352 (z = 0.902)

NASA Astrophysics Data System (ADS)

Thölken, Sophia; Schrabback, Tim; Reiprich, Thomas H.; Lovisari, Lorenzo; Allen, Steven W.; Hoekstra, Henk; Applegate, Douglas; Buddendiek, Axel; Hicks, Amalia

2018-03-01

Context. Observations of relaxed, massive, and distant clusters can provide important tests of standard cosmological models, for example by using the gas mass fraction. To perform this test, the dynamical state of the cluster and its gas properties have to be investigated. X-ray analyses provide one of the best opportunities to access this information and to determine important properties such as temperature profiles, gas mass, and the total X-ray hydrostatic mass. For the last of these, weak gravitational lensing analyses are complementary independent probes that are essential in order to test whether X-ray masses could be biased. Aims: We study the very luminous, high redshift (z = 0.902) galaxy cluster Cl J120958.9+495352 using XMM-Newton data. We measure global cluster properties and study the temperature profile and the cooling time to investigate the dynamical status with respect to the presence of a cool core. We use Hubble Space Telescope (HST) weak lensing data to estimate its total mass and determine the gas mass fraction. Methods: We perform a spectral analysis using an XMM-Newton observation of 15 ks cleaned exposure time. As the treatment of the background is crucial, we use two different approaches to account for the background emission to verify our results. We account for point spread function effects and deproject our results to estimate the gas mass fraction of the cluster. We measure weak lensing galaxy shapes from mosaic HST imaging and select background galaxies photometrically in combination with imaging data from the William Herschel Telescope. Results: The X-ray luminosity of Cl J120958.9+495352 in the 0.1-2.4 keV band estimated from our XMM-Newton data is $L_X = (13.4^{+1.2}_{-1.0}) \times 10^{44}$ erg/s and thus it is one of the most X-ray luminous clusters known at similarly high redshift. We find clear indications for the presence of a cool core from the temperature profile and the central cooling time, which is very rare at such high redshifts. 
Based on the weak lensing analysis, we estimate a cluster mass of $M_{500}/10^{14}\,M_\odot = 4.4^{+2.2}_{-2.0}$ (stat.) $\pm 0.6$ (sys.) and a gas mass fraction of $f_{\mathrm{gas},2500} = 0.11^{+0.06}_{-0.03}$, in good agreement with previous findings for high redshift and local clusters.
Clustering of worry appraisals among college students.

PubMed

Schwab, Nicholas G; Cullum, Jerry C; Harton, Helen C

2016-01-01

The present study investigated the potential clustering of worry appraisals within college social networks. Participants living in campus residence buildings responded to online surveys across the course of several months. Worry appraisals were measured 10 weeks into the fall semester and again approximately 6 months later. Analysis of sociometric data suggests that the majority of participants' social interactions occurred within their respective residence building floors, indicating that proximity strongly influenced the development of social network ties and sources of social influence. Further, significant clustering of worry appraisals occurred across time, and more importantly, within residence building floors. The present findings complement previous work suggesting that several physical and psychological states appear to spread and cluster within social networks. Implications for the study of emotional appraisals and future research are discussed.
Clustering and assembly dynamics of a one-dimensional microphase former.

PubMed

Hu, Yi; Charbonneau, Patrick

2018-05-23

Both ordered and disordered microphases ubiquitously form in suspensions of particles that interact through competing short-range attraction and long-range repulsion (SALR). While ordered microphases are more appealing materials targets, understanding the rich structural and dynamical properties of their disordered counterparts is essential to controlling their mesoscale assembly. Here, we study the disordered regime of a one-dimensional (1D) SALR model, whose simplicity enables detailed analysis by transfer matrices and Monte Carlo simulations. We first characterize the signature of the clustering process on macroscopic observables, and then assess the equilibration dynamics of various simulation algorithms. We notably find that cluster moves markedly accelerate the mixing time, but that event chains are of limited help in the clustering regime. These insights will inspire further study of three-dimensional microphase formers.
Anthocyanins in the bracts of Curcuma species and relationship of the species based on anthocyanin composition.

PubMed

Koshioka, Masaji; Umegaki, Naoko; Boontiang, Kriangsuk; Pornchuti, Witayaporn; Thammasiri, Kanchit; Yamaguchi, Satoshi; Tatsuzawa, Fumi; Nakayama, Masayoshi; Tateishi, Akira; Kubota, Satoshi

2015-03-01

Five anthocyanins, delphinidin 3-O-rutinoside, cyanidin 3-O-rutinoside, petunidin 3-O-rutinoside, malvidin 3-O-glucoside and malvidin 3-O-rutinoside, were identified. Three anthocyanins, delphinidin 3-O-glucoside, cyanidin 3-O-glucoside and pelargonidin 3-O-rutinoside, were putatively identified based on C18 HPLC retention time, absorption spectrum, including λmax, and comparisons with those of corresponding standard anthocyanins, as the compounds responsible for the pink to purple-red pigmentation of the bracts of Curcuma alismatifolia and five related species. Cluster analysis based on four major anthocyanins formed two clusters. One consisted of only one species, C. alismatifolia, and the other consisted of five. Each cluster further formed sub-clusters depending on either species or habitats.
Theoretical Analysis of Optical Absorption and Emission in Mixed Noble Metal Nanoclusters.

PubMed

Day, Paul N; Pachter, Ruth; Nguyen, Kiet A

2018-04-26

In this work, we studied theoretically two hybrid gold-silver clusters, which were reported to have dual-band emission, using density functional theory (DFT) and linear and quadratic response time-dependent DFT (TDDFT). Hybrid functionals were found to successfully predict absorption and emission, although explanation of the NIR emission from the larger cluster (cluster 1) requires significant vibrational excitation in the final state. For the smaller cluster (cluster 2), the Δ H(0-0) value calculated for the T1 → S0 transition, using the PBE0 functional, is in good agreement with the measured NIR emission, and the calculated T2 → S0 value is in fair agreement with the measured visible emission. The calculated T1 → S0 phosphorescence Δ H(0-0) for cluster 1 is close to the measured visible emission energy. In order for the calculated phosphorescence for cluster 1 to agree with the intense NIR emission reported experimentally, the vibrational energy of the final state (S0) is required to be about 0.7 eV greater than the zero-point vibrational energy.
Spatial-temporal epidemiology of human Salmonella Enteritidis infections with major phage types (PTs 1, 4, 5b, 8, 13, and 13a) in Ontario, Canada, 2008-2009.

PubMed

Varga, Csaba; Pearl, David L; McEwen, Scott A; Sargeant, Jan M; Pollari, Frank; Guerin, Michele T

2015-12-17

In Ontario and Canada, the incidence of human Salmonella enterica serotype Enteritidis (S. Enteritidis) infections has increased steadily during the last decade. Our study evaluated the spatial and temporal epidemiology of the major phage types (PTs) of S. Enteritidis infections to help public health practitioners design effective prevention and control programs. Data on S. Enteritidis infections between January 1, 2008 and December 31, 2009 were obtained from Ontario's disease surveillance system. Salmonella Enteritidis infections with major phage types were classified by their annual health region-level incidence rates (IRs), monthly IRs, clinical symptoms, and exposure settings. A scan statistic was employed to detect retrospective phage type-specific spatial, temporal, and space-time clusters of S. Enteritidis infections. Space-time cluster cases' exposure settings were evaluated to identify common exposures. A total of 1,336 cases were available for analysis. The six most frequently reported S. Enteritidis PTs were 8 (n = 398), 13a (n = 218), 13 (n = 198), 1 (n = 132), 5b (n = 83), and 4 (n = 76). Reported rates of S. Enteritidis infections with major phage types varied by health region and month. International travel and unknown exposure settings were the most frequently reported settings for PT 5b, 4, and 1 cases, whereas unknown exposure setting, private home, food premise, and international travel were the most frequently reported settings for PT 8, 13, and 13a cases. Diarrhea, abdominal pain, and fever were the most commonly reported clinical symptoms. A number of phage type-specific spatial, temporal, and space-time clusters were identified. Space-time clusters of PTs 1, 4, and 5b occurred mainly during the winter and spring months in the North West, North East, Eastern, Central East, and Central West regions. Space-time clusters of PTs 13 and 13a occurred at different times of the year in the Toronto region. 
Space-time clusters of PT 8 occurred at different times of the year in the North West and South West regions. The study demonstrated phage type-specific differences in exposure settings and spatial-temporal clustering of S. Enteritidis infections, which might guide public health surveillance of disease outbreaks. Our study methodology could be applied to other foodborne disease surveillance data to detect retrospective high disease rate clusters, which could aid public health authorities in developing effective prevention and control programs.
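Several records in this listing rely on a space-time scan statistic. The sketch below illustrates its core, assuming the Poisson likelihood-ratio form LLR = c ln(c/e) + (C - c) ln((C - c)/(C - e)) for a window with c observed and e expected cases out of C in total, counted only when c > e. It is deliberately simplified and should not be read as the method used in the study: each candidate cylinder is a single region over a contiguous run of weeks, whereas scan software typically also grows circular windows over neighboring regions and assesses significance by Monte Carlo replication. The region names and counts are made up.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed, e expected cases."""
    if c <= e:
        return 0.0  # only windows with an excess of cases are of interest
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def best_cylinder(cases, population, max_weeks=3):
    """Scan single-region cylinders and return (LLR, (region, start, end)).

    cases[region] is a list of weekly counts; population[region] is the
    number of persons at risk. The interval is weeks start..end-1.
    """
    n_weeks = len(next(iter(cases.values())))
    total = sum(sum(v) for v in cases.values())
    total_person_weeks = sum(population.values()) * n_weeks
    best = (0.0, None)
    for region, counts in cases.items():
        for start in range(n_weeks):
            for end in range(start + 1, min(start + max_weeks, n_weeks) + 1):
                c = sum(counts[start:end])
                # Expected cases are proportional to person-time in the cylinder.
                e = total * population[region] * (end - start) / total_person_weeks
                llr = poisson_llr(c, e, total)
                if llr > best[0]:
                    best = (llr, (region, start, end))
    return best

# Hypothetical weekly case counts for three health regions.
cases = {
    "A": [1, 0, 1, 1, 0, 1],
    "B": [0, 1, 6, 7, 1, 0],
    "C": [1, 1, 0, 1, 1, 0],
}
population = {"A": 1000, "B": 1000, "C": 1000}
llr, cylinder = best_cylinder(cases, population)
print(cylinder, round(llr, 2))
```

In this toy data the most likely cluster is region B during the two weeks with 6 and 7 cases. A full analysis would compare that maximum against the maxima from many randomized data sets to obtain a p-value.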
Endohedral gallide cluster superconductors and superconductivity in ReGa5.

PubMed

Xie, Weiwei; Luo, Huixia; Phelan, Brendan F; Klimczuk, Tomasz; Cevallos, Francois Alexandre; Cava, Robert Joseph

2015-12-22

We present transition metal-embedded (T@Gan) endohedral Ga-clusters as a favorable structural motif for superconductivity and develop empirical, molecule-based, electron counting rules that govern the hierarchical architectures that the clusters assume in binary phases. Among the binary T@Gan endohedral cluster systems, Mo8Ga41, Mo6Ga31, Rh2Ga9, and Ir2Ga9 are all previously known superconductors. The well-known exotic superconductor PuCoGa5 and related phases are also members of this endohedral gallide cluster family. We show that electron-deficient compounds like Mo8Ga41 prefer architectures with vertex-sharing gallium clusters, whereas electron-rich compounds, like PdGa5, prefer edge-sharing cluster architectures. The superconducting transition temperatures are highest for the electron-poor, corner-sharing architectures. Based on this analysis, the previously unknown endohedral cluster compound ReGa5 is postulated to exist at an intermediate electron count and a mix of corner sharing and edge sharing cluster architectures. The empirical prediction is shown to be correct and leads to the discovery of superconductivity in ReGa5. The Fermi levels for endohedral gallide cluster compounds are located in deep pseudogaps in the electronic densities of states, an important factor in determining their chemical stability, while at the same time limiting their superconducting transition temperatures.
Surrogate Reservoir Model

NASA Astrophysics Data System (ADS)

Mohaghegh, Shahab

2010-05-01

Surrogate Reservoir Model (SRM) is a new solution for fast-track, comprehensive reservoir analysis (solving both direct and inverse problems) using existing reservoir simulation models. SRM is defined as a replica of the full field reservoir simulation model that runs and provides accurate results in real time (one simulation run takes only a fraction of a second). SRM mimics the capabilities of a full field model with high accuracy. Reservoir simulation is the industry standard for reservoir management. It is used in all phases of field development in the oil and gas industry. The routine of simulation studies calls for integration of static and dynamic measurements into the reservoir model. Full field reservoir simulation models have become the major source of information for analysis, prediction and decision making. Large prolific fields usually go through several versions (updates) of their model. Each new version usually is a major improvement over the previous version. The updated model incorporates the latest available information along with adjustments that usually are the result of single-well or multi-well history matching. As the number of reservoir layers (thickness of the formations) increases, the number of cells representing the model approaches several million. As the reservoir models grow in size, so does the time that is required for each run. Schemes such as grid computing and parallel processing help to a certain degree but do not provide the required speed for tasks such as field development strategies using comprehensive reservoir analysis, solving the inverse problem for injection/production optimization, quantifying uncertainties associated with the geological model, and real-time optimization and decision making. These types of analyses require hundreds or thousands of runs. 
Furthermore, with the new push for smart fields in the oil/gas industry that is a natural growth of smart completion and smart wells, the need for real time reservoir modeling becomes more pronounced. SRM is developed using the state of the art in neural computing and fuzzy pattern recognition to address the ever growing need in the oil and gas industry to perform accurate, but high speed simulation and modeling. Unlike conventional geo-statistical approaches (response surfaces, proxy models …) that require hundreds of simulation runs for development, SRM is developed only with a few (from 10 to 30 runs) simulation runs. SRM can be developed regularly (as new versions of the full field model become available) off-line and can be put online for real-time processing to guide important decisions. SRM has proven its value in the field. An SRM was developed for a giant oil field in the Middle East. The model included about one million grid blocks with more than 165 horizontal wells and took ten hours for a single run on 12 parallel CPUs. Using only 10 simulation runs, an SRM was developed that was able to accurately mimic the behavior of the reservoir simulation model. Performing a comprehensive reservoir analysis that included making millions of SRM runs, wells in the field were divided into five clusters. It was predicted that wells in cluster one & two are best candidates for rate relaxation with minimal, long term water production while wells in clusters four and five are susceptive to high water cuts. Two and a half years and 20 wells later, rate relaxation results from the field proved that all the predictions made by the SRM analysis were correct. While incremental oil production increased in all wells (wells in clusters 1 produced the most followed by wells in cluster 2, 3 …) the percent change in average monthly water cut for wells in each cluster clearly demonstrated the analytic power of SRM. 
As it was correctly predicted, wells in clusters 1 and 2 actually experience a reduction in water cut while a substantial increase in water cut was observed in wells classified into clusters 4 and 5. Performing these analyses would have been impossible using the original full field simulation model.
`Inter-Arrival Time' Inspired Algorithm and its Application in Clustering and Molecular Phylogeny

NASA Astrophysics Data System (ADS)

Kolekar, Pandurang S.; Kale, Mohan M.; Kulkarni-Kale, Urmila

2010-10-01

Bioinformatics, being a multidisciplinary field, applies methods from allied areas of science to data mining using computational approaches. Clustering and molecular phylogeny are among the key areas of bioinformatics and help in the study of the classification and evolution of organisms. Molecular phylogeny algorithms can be divided into distance-based and character-based methods. However, most of these methods depend on pre-alignment of sequences and become computationally intensive as the size of the data increases, and hence demand alternative, efficient approaches. The `inter-arrival time distribution' (IATD) is a popular concept in the theory of stochastic system modeling, but its potential in molecular data analysis has not been fully explored. The present study reports the application of IATD in bioinformatics for clustering and molecular phylogeny. The proposed method provides IATDs of nucleotides in genomic sequences. A distance function based on statistical parameters of the IATDs is proposed, and the distance matrix thus obtained is used for clustering and molecular phylogeny. The method is applied to a dataset of 3' non-coding region (NCR) sequences of Dengue virus type 3 (DENV-3), subtype III, reported in 2008. The phylogram thus obtained revealed the geographical distribution of DENV-3 isolates. Sri Lankan DENV-3 isolates were further observed to be clustered in two sub-clades corresponding to pre- and post-Dengue hemorrhagic fever emergence groups. These results are consistent with those reported earlier, which were obtained using pre-aligned sequence data as input. These findings encourage applications of the IATD-based method in molecular phylogenetic analysis in particular and data mining in general.
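The alignment-free idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which statistical parameters or which distance function were used, so the choice of mean and standard deviation per nucleotide, the Euclidean distance, and all function names below are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

NUCLEOTIDES = "ACGT"


def inter_arrival_times(seq, base):
    """Gaps between consecutive occurrences of `base` in `seq`."""
    positions = [i for i, ch in enumerate(seq) if ch == base]
    return [b - a for a, b in zip(positions, positions[1:])]


def iatd_features(seq):
    """Mean and standard deviation of the inter-arrival times of each base.

    Assumption: a base with fewer than two occurrences contributes zeros.
    """
    features = []
    for base in NUCLEOTIDES:
        gaps = inter_arrival_times(seq.upper(), base)
        if gaps:
            features.extend([mean(gaps), pstdev(gaps)])
        else:
            features.extend([0.0, 0.0])
    return features


def iatd_distance(seq_a, seq_b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    fa, fb = iatd_features(seq_a), iatd_features(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


def distance_matrix(seqs):
    """Symmetric distance matrix for a dict mapping name -> sequence."""
    names = list(seqs)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = iatd_distance(seqs[a], seqs[b])
    return matrix
```

Because no alignment is needed, each sequence is summarized independently, and the resulting matrix can be passed to any distance-based clustering or tree-building routine.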
The implementation of two stages clustering (k-means clustering and adaptive neuro fuzzy inference system) for prediction of medicine need based on medical data

NASA Astrophysics Data System (ADS)

Husein, A. M.; Harahap, M.; Aisyah, S.; Purba, W.; Muhazir, A.

2018-03-01

Medication planning aims to determine the types and amounts of medicine required, according to need and to patterns of disease, and to avoid stock-outs. In practice, planning still relies on the ability and experience of managers: it takes a long time, requires skill, lacks definite disease data, needs good record keeping and reporting, and depends on the budget. As a result, planning goes poorly and leads to frequent shortages and surpluses of medicines. In this research, we propose the Adaptive Neuro Fuzzy Inference System (ANFIS) method to predict medication needs in 2016 and 2017 based on medical data from 2015 and 2016 from two hospital data sources. The analysis framework uses two approaches. The first applies ANFIS directly to a data source; the second also uses ANFIS, but after clustering the data with the K-Means algorithm. For both approaches, the Root Mean Square Error (RMSE) is calculated for training and testing. In testing, the proposed method gave better prediction rates than existing systems on both quantitative and qualitative evaluation. Applying K-Means before ANFIS also affected the duration of the training process and gave significantly better classification accuracy than ANFIS without clustering.
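Both approaches in this abstract are compared by their RMSE. As a reminder of what that metric computes, here is a minimal Python sketch; the function name and the idea of passing plain lists of actual and predicted quantities are illustrative assumptions, and no ANFIS or K-Means step is reproduced.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))
```

A lower RMSE on the test set indicates predictions closer to the observed medicine demand, so the two pipelines can be ranked by calling `rmse` once per pipeline on the same held-out data.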
Spatio-Temporal Dynamics of Asymptomatic Malaria: Bridging the Gap Between Annual Malaria Resurgences in a Sahelian Environment.

PubMed

Coulibaly, Drissa; Travassos, Mark A; Tolo, Youssouf; Laurens, Matthew B; Kone, Abdoulaye K; Traore, Karim; Sissoko, Mody; Niangaly, Amadou; Diarra, Issa; Daou, Modibo; Guindo, Boureima; Rebaudet, Stanislas; Kouriba, Bourema; Dessay, Nadine; Piarroux, Renaud; Plowe, Christopher V; Doumbo, Ogobara K; Thera, Mahamadou A; Gaudart, Jean

2017-12-01

In areas of seasonal malaria transmission, the incidence rate of malaria infection is presumed to be near zero at the end of the dry season. Asymptomatic individuals may constitute a major parasite reservoir during this time. We conducted a longitudinal analysis of the spatio-temporal distribution of clinical malaria and asymptomatic parasitemia over time in a Malian town to highlight these malaria transmission dynamics. For a cohort of 300 rural children followed over 2009-2014, periodicity and phase shift between malaria and rainfall were determined by spectral analysis. Spatial risk clusters of clinical episodes or carriage were identified. A nested-case-control study was conducted to assess the parasite carriage factors. Malaria infection persisted over the entire year with seasonal peaks. High transmission periods began 2-3 months after the rains began. A cluster with a low risk of clinical malaria in the town center persisted in high and low transmission periods. Throughout 2009-2014, cluster locations did not vary from year to year. Asymptomatic and gametocyte carriage were persistent, even during low transmission periods. For high transmission periods, the ratio of asymptomatic to clinical cases was approximately 0.5, but was five times higher during low transmission periods. Clinical episodes at previous high transmission periods were a protective factor for asymptomatic carriage, but carrying parasites without symptoms at a previous high transmission period was a risk factor for asymptomatic carriage. Stable malaria transmission was associated with sustained asymptomatic carriage during dry seasons. Control strategies should target persistent low-level parasitemia clusters to interrupt transmission.
A Fast Density-Based Clustering Algorithm for Real-Time Internet of Things Stream

PubMed Central

Ying Wah, Teh

2014-01-01

Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class for clustering data streams. They can detect clusters of arbitrary shape and handle outliers, and they do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a suitable choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough to be applicable in real-time IoT applications. Experimental results show that the proposed approach obtains high-quality results with low computation time on real and synthetic datasets. PMID:25110753

A fast density-based clustering algorithm for real-time Internet of Things stream.

PubMed

Amini, Amineh; Saboohi, Hadi; Wah, Teh Ying; Herawan, Tutut

2014-01-01

Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class for clustering data streams. They can detect clusters of arbitrary shape and handle outliers, and they do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a suitable choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough to be applicable in real-time IoT applications. Experimental results show that the proposed approach obtains high-quality results with low computation time on real and synthetic datasets.
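The properties this abstract attributes to density-based clustering (arbitrary cluster shapes, outlier handling, no preset cluster count) can be seen in a minimal batch DBSCAN sketch in Python. This is the classic DBSCAN procedure, not the streaming algorithm proposed in the paper, whose details the abstract does not give; the parameter names `eps` and `min_pts` follow common DBSCAN usage.

```python
from math import dist

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within `eps` of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be reclaimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
        cluster += 1
    return labels
```

The brute-force neighbour search makes this quadratic in the number of points, which is why streaming variants typically summarize the data (for example in grids or micro-clusters) before applying a density rule.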
Accounting for Non-Gaussian Sources of Spatial Correlation in Parametric Functional Magnetic Resonance Imaging Paradigms II: A Method to Obtain First-Level Analysis Residuals with Uniform and Gaussian Spatial Autocorrelation Function and Independent and Identically Distributed Time-Series.

PubMed

Gopinath, Kaundinya; Krishnamurthy, Venkatagiri; Lacey, Simon; Sathian, K

2018-02-01

In a recent study Eklund et al. have shown that cluster-wise family-wise error (FWE) rate-corrected inferences made in parametric statistical method-based functional magnetic resonance imaging (fMRI) studies over the past couple of decades may have been invalid, particularly for cluster defining thresholds less stringent than p < 0.001; principally because the spatial autocorrelation functions (sACFs) of fMRI data had been modeled incorrectly to follow a Gaussian form, whereas empirical data suggest otherwise. Hence, the residuals from general linear model (GLM)-based fMRI activation estimates in these studies may not have possessed a homogenously Gaussian sACF. Here we propose a method based on the assumption that heterogeneity and non-Gaussianity of the sACF of the first-level GLM analysis residuals, as well as temporal autocorrelations in the first-level voxel residual time-series, are caused by unmodeled MRI signal from neuronal and physiological processes as well as motion and other artifacts, which can be approximated by appropriate decompositions of the first-level residuals with principal component analysis (PCA), and removed. We show that application of this method yields GLM residuals with significantly reduced spatial correlation, nearly Gaussian sACF and uniform spatial smoothness across the brain, thereby allowing valid cluster-based FWE-corrected inferences based on assumption of Gaussian spatial noise. We further show that application of this method renders the voxel time-series of first-level GLM residuals independent, and identically distributed across time (which is a necessary condition for appropriate voxel-level GLM inference), without having to fit ad hoc stochastic colored noise models. Furthermore, the detection power of individual subject brain activation analysis is enhanced. This method will be especially useful for case studies, which rely on first-level GLM analysis inferences.
Time-course microarray analysis for identifying candidate genes involved in obesity-associated pathological changes in the mouse colon.

PubMed

Bae, Yun Jung; Kim, Sung-Eun; Hong, Seong Yeon; Park, Taesun; Lee, Sang Gyu; Choi, Myung-Sook; Sung, Mi-Kyung

2016-01-01

Obesity is known to increase the risk of colorectal cancer. However, mechanisms underlying the pathogenesis of obesity-induced colorectal cancer are not completely understood. The purposes of this study were to identify differentially expressed genes in the colon of mice with diet-induced obesity and to select candidate genes as early markers of obesity-associated abnormal cell growth in the colon. C57BL/6N mice were fed a normal diet (11% fat energy) or a high-fat diet (40% fat energy) and were euthanized at different time points. Genome-wide expression profiles of the colon were determined at 2, 4, 8, and 12 weeks. Cluster analysis was performed using expression data of genes showing a log2 fold change of ≥1 or ≤-1 (twofold change), based on time-dependent expression patterns, followed by virtual network analysis. High-fat diet-fed mice showed a significant increase in body weight and total visceral fat weight over 12 weeks. Time-course microarray analysis showed that 50, 47, 36, and 411 genes were differentially expressed at 2, 4, 8, and 12 weeks, respectively. Ten cluster profiles representing distinguishable patterns of genes differentially expressed over time were determined. Cluster 4, which consisted of genes showing the most significant alterations in expression in response to high-fat diet over 12 weeks, included Apoa4 (apolipoprotein A-IV), Ppap2b (phosphatidic acid phosphatase type 2B), Cel (carboxyl ester lipase), and Clps (colipase, pancreatic), which interacted strongly with surrounding genes associated with colorectal cancer or obesity. Our data indicate that Apoa4, Ppap2b, Cel, and Clps are candidate early marker genes associated with obesity-related pathological changes in the colon. Genome-wide analyses performed in the present study provide new insights on selecting novel genes that may be associated with the development of diseases of the colon.
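The selection rule in this abstract (a log2 fold change of ≥1 or ≤-1, i.e. at least a twofold change in either direction) can be written as a short Python sketch. The input layout (a dict of control and treated expression values) and the gene labels in the usage example are hypothetical; the study's actual normalization and statistics are not described in the abstract and are not reproduced here.

```python
from math import log2


def log2_fold_change(treated, control):
    """log2 of the treated/control expression ratio (both must be positive)."""
    return log2(treated / control)


def differentially_expressed(expression, threshold=1.0):
    """Keep genes whose |log2 fold change| is at least `threshold`.

    `expression` maps a gene name to a (control, treated) pair of values.
    """
    selected = {}
    for gene, (control, treated) in expression.items():
        lfc = log2_fold_change(treated, control)
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected
```

For example, with the made-up values `{"geneA": (100, 200), "geneB": (100, 150), "geneC": (100, 50)}`, the doubled `geneA` and the halved `geneC` pass the filter, while the 1.5-fold change of `geneB` does not.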
Functional analysis of the upstream regulatory region of chicken miR-17-92 cluster.

PubMed

Cheng, Min; Zhang, Wen-jian; Xing, Tian-yu; Yan, Xiao-hong; Li, Yu-mao; Li, Hui; Wang, Ning

2016-08-01

The miR-17-92 cluster plays important roles in cell proliferation, differentiation, apoptosis, animal development and tumorigenesis. The transcriptional regulation of the miR-17-92 cluster has been extensively studied in mammals, but not in birds. To date, the genomic structure of the avian miR-17-92 cluster has not been fully determined. The promoter location and sequence of the miR-17-92 cluster have not been determined, due to the existence of a genomic gap sequence upstream of the miR-17-92 cluster in all the birds whose genomes have been sequenced. In this study, genome walking was used to close the genomic gap upstream of the chicken miR-17-92 cluster. In addition, bioinformatics analysis, reporter gene assay and truncation mutagenesis were used to investigate the functional role of the genomic gap sequence. Genome walking analysis showed that the gap region was 1704 bp long, and its GC content was 80.11%. Bioinformatics analysis showed that in the gap region, there was a 200 bp conserved sequence among the tested 10 species (Gallus gallus, Homo sapiens, Pan troglodytes, Bos taurus, Sus scrofa, Rattus norvegicus, Mus musculus, Possum, Danio rerio, Rana nigromaculata), which is the core promoter region of the mammalian miR-17-92 host gene (MIR17HG). A promoter luciferase reporter gene vector of the gap region was constructed and a reporter assay was performed. The result showed that the promoter activity of pGL3-cMIR17HG (-4228/-2506) was 417 times that of the negative control (empty pGL3 basic vector), suggesting that the chicken miR-17-92 cluster promoter exists in the gap region. To gain further insight into the promoter structure, two different truncations of the cloned gap sequence were generated by PCR. One had a truncation of 448 bp at the 5'-end and the other had a truncation of 894 bp at the 3'-end. Further reporter analysis showed that, compared with the promoter activity of pGL3-cMIR17HG (-4228/-2506), the reporter activities of the 5'-end truncation and the 3'-end truncation were reduced by 19.82% and 60.14%, respectively. These data demonstrated that the important promoter region of the chicken miR-17-92 cluster is located in the -3400/-2506 bp region. Our results lay the foundation for revealing the transcriptional regulatory mechanisms of the chicken miR-17-92 cluster.
Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

PubMed

Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

2015-11-05

Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (P<0.001), independent of age, height, weight, and running speed. When these two groups were compared to PFP runners, one cluster exhibited greater while the other exhibited reduced peak knee abduction angles (P<0.05). The variability observed in running patterns across this sample could be the result of different gait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.
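The clustering step described in this abstract can be sketched in a few lines of Python. The sketch covers only agglomerative clustering on already-reduced feature vectors (the PCA step is omitted), and it assumes average linkage with Euclidean distance, since the abstract does not state which linkage or metric was used; the function names are illustrative.

```python
from itertools import combinations
from math import dist


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(dist(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]
```

In a gait study of this kind, each point would be one runner's vector of principal component scores, and the number of clusters would be chosen from the merge history rather than fixed in advance as it is here.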
Genetic variability in selected date palm (Phoenix dactylifera L.) cultivars of United Arab Emirates using ISSR and DAMD markers.

PubMed

Purayil, Fayas T; Robert, Gabriel A; Gothandam, Kodiveri M; Kurup, Shyam S; Subramaniam, Sreeramanan; Cheruth, Abdul Jaleel

2018-02-01

Nine date palm (Phoenix dactylifera L.) cultivars from the UAE, which differ in their flowering times, were selected to determine the polymorphism and genetic relationship between these cultivars. Hereditary differences and interrelationships were assessed utilizing inter-simple sequence repeat (ISSR) and directed amplification of minisatellite DNA region (DAMD) primers. Analysis with eight DAMD and five ISSR markers produced a total of 113 amplicons, including 99 polymorphic and 14 monomorphic alleles, with a polymorphic percentage of 85.45. The average polymorphic information content for the two marker systems was similar (DAMD, 0.445 and ISSR, 0.459). UPGMA-based clustering of DAMD and ISSR revealed that the mid-season cultivars Mkh (Khlas) and MB (Barhee) grouped together to form a subcluster in both marker systems. The genetic similarity analysis followed by clustering of the cumulative data from DAMD and ISSR resulted in two major clusters, with two early-season cultivars (ENg and Ekn), two mid-season cultivars (MKh and MB) and one late-season cultivar (Lkhs) in cluster 1; cluster 2 includes two late-season cultivars, one early-season cultivar and one mid-season cultivar. The cluster analysis of both DAMD and ISSR markers revealed that the patterns of variation between some of the tested cultivars were similar in both DNA marker systems. Hence, the present study demonstrates the applicability of the DAMD and ISSR marker systems in detecting the genetic diversity of date palm cultivars flowering in different seasons. This may facilitate the conservation and improvement of date palm cultivars in the future.
A Cluster Analytic Approach to Identifying Predictors and Moderators of Psychosocial Treatment for Bipolar Depression: Results from STEP-BD

PubMed Central

Deckersbach, Thilo; Peters, Amy T.; Sylvia, Louisa G.; Gold, Alexandra K.; da Silva Magalhaes, Pedro Vieira; Henry, David B.; Frank, Ellen; Otto, Michael W.; Berk, Michael; Dougherty, Darin D.; Nierenberg, Andrew A.; Miklowitz, David J.

2016-01-01

Background We sought to address how predictors and moderators of psychotherapy for bipolar depression – identified individually in prior analyses – can inform the development of a metric for prospectively classifying treatment outcome in intensive psychotherapy (IP) versus collaborative care (CC) adjunctive to pharmacotherapy in the Systematic Treatment Enhancement Program (STEP-BD) study. Methods We conducted post-hoc analyses on 135 STEP-BD participants using cluster analysis to identify subsets of participants with similar clinical profiles and investigated this combined metric as a moderator and predictor of response to IP. We used agglomerative hierarchical cluster analyses and k-means clustering to determine the content of the clinical profiles. Logistic regression and Cox proportional hazard models were used to evaluate whether the resulting clusters predicted or moderated likelihood of recovery or time until recovery. Results The cluster analysis yielded a two-cluster solution: 1) “less-recurrent/severe” and 2) “chronic/recurrent.” Rates of recovery in IP were similar for less-recurrent/severe and chronic/recurrent participants. Less-recurrent/severe patients were more likely than chronic/recurrent patients to achieve recovery in CC (p = .040, OR = 4.56). IP yielded a faster recovery for chronic/recurrent participants, whereas CC led to recovery sooner in the less-recurrent/severe cluster (p = .034, OR = 2.62). Limitations Cluster analyses require list-wise deletion of cases with missing data so we were unable to conduct analyses on all STEP-BD participants. Conclusions A well-powered, parametric approach can distinguish patients based on illness history and provide clinicians with symptom profiles of patients that confer differential prognosis in CC vs. IP. PMID:27289316
Ion mobility spectrometry-mass spectrometry examination of the structures, stabilities, and extents of hydration of dimethylamine-sulfuric acid clusters.

PubMed

Thomas, Jikku M; He, Siqin; Larriba-Andaluz, Carlos; DePalma, Joseph W; Johnston, Murray V; Hogan, Christopher J

2016-08-17

We applied an atmospheric pressure differential mobility analyzer (DMA) coupled to a time-of-flight mass spectrometer to examine the stability, mass-mobility relationship, and extent of hydration of dimethylamine-sulfuric acid cluster ions, which are of relevance to nucleation in ambient air. Cluster ions were generated by electrospray ionization and were of the form: [H((CH3)2NH)x(H2SO4)y](+) and [(HSO4)((CH3)2NH)x(H2SO4)y](-), where 4 ≤ x ≤ 8, and 5 ≤ y ≤ 12. Under dry conditions, we find that positively charged cluster ions dissociated via loss of both multiple dimethylamine and sulfuric acid molecules after mobility analysis but prior to mass analysis, and few parent ions were detected in the mass spectrometer. Dissociation also occurred for negative ions, but to a lesser extent than for positive ions for the same mass spectrometer inlet conditions. Under humidified conditions (relative humidities up to 30% in the DMA), positively charged cluster ion dissociation in the mass spectrometer inlet was mitigated and occurred primarily by H2SO4 loss from ions containing excess acid molecules. DMA measurements were used to infer collision cross sections (CCSs) for all identifiable cluster ions. Stokes-Millikan equation and diffuse/inelastic gas molecule scattering predicted CCSs overestimate measured CCSs by more than 15%, while elastic-specular collision model predictions are in good agreement with measurements. Finally, cluster ion hydration was examined by monitoring changes in CCSs with increasing relative humidity. All examined cluster ions showed a modest amount of water molecule adsorption, with percentage increases in CCS smaller than 10%. The extent of hydration correlates directly with cluster ion acidity for positive ions.
Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

PubMed

Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

2017-05-01

The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.
Identification of stress responsive genes by studying specific relationships between mRNA and protein abundance.

PubMed

Morimoto, Shimpei; Yahara, Koji

2018-03-01

Protein expression is regulated by the production and degradation of mRNAs and proteins but the specifics of their relationship are controversial. Although technological advances have enabled genome-wide and time-series surveys of mRNA and protein abundance, recent studies have shown paradoxical results, with most statistical analyses being limited to linear correlation, or analysis of variance applied separately to mRNA and protein datasets. Here, using recently analyzed genome-wide time-series data, we have developed a statistical analysis framework for identifying which types of genes or biological gene groups have significant correlation between mRNA and protein abundance after accounting for potential time delays. Our framework stratifies all genes in terms of the extent of time delay, conducts gene clustering in each stratum, and performs a non-parametric statistical test of the correlation between mRNA and protein abundance in a gene cluster. Consequently, we revealed stronger correlations than previously reported between mRNA and protein abundance in two metabolic pathways. Moreover, we identified a pair of stress responsive genes (ADC17 and KIN1) that showed a highly similar time series of mRNA and protein abundance. Furthermore, we confirmed the robustness of the analysis framework by applying it to another genome-wide time-series dataset and identifying a cytoskeleton-related gene cluster (keratin 18, keratin 17, and mitotic spindle positioning) that shows similar correlation. The significant correlation and highly similar changes of mRNA and protein abundance suggest a concerted role of these genes in cellular stress response, which we consider provides an answer to the question of the specific relationships between mRNA and protein in a cell. In addition, our framework for studying the relationship between mRNAs and proteins in a cell will provide a basis for studying specific relationships between mRNA and protein abundance after accounting for potential time delays.
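The notion of correlating mRNA and protein abundance "after accounting for potential time delays" can be illustrated with a small Python sketch. It is a simplification under stated assumptions: Spearman rank correlation stands in for the unspecified non-parametric test, the delay is a whole number of sampling steps, a single gene is treated at a time (the paper works on gene clusters within delay strata), and all function names are hypothetical. Constant series are not handled (they would cause a division by zero).

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_lag(mrna, protein, max_lag):
    """(lag, rho) with the highest Spearman rho, protein trailing mRNA by lag."""
    best = None
    for lag in range(max_lag + 1):
        x = mrna[: len(mrna) - lag]
        y = protein[lag:]
        if len(x) < 3:
            break  # too few overlapping time points to correlate
        rho = spearman(x, y)
        if best is None or rho > best[1]:
            best = (lag, rho)
    return best
```

Scanning lags in this way shows why a zero-lag correlation can understate the mRNA-protein relationship: a protein series that simply trails its transcript by a few time points correlates weakly at lag 0 and strongly at the true delay.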
A low carbon economy and society.

PubMed

Urry, John

2013-03-13

This paper examines various aspects of moving from high carbon economies and societies to a cluster of low carbon systems. First, some historical material is considered from the Second World War and the 1970s, periods with some lessons for the contemporary 'powering down' of whole societies. Second, analysis is provided of some green shoots of a powering down of existing systems identifiable in the contemporary developed world. Third, analysis is provided of the array of systems, social practices and innovations that would have to develop in order to effect powering down on a sufficient scale and within an appropriate time period. Most examples are drawn from transport and mobility. Finally, the paper demonstrates just why developing new systems is so hard, especially as this must involve a transformed cluster of systems. The forces that make a new cluster unlikely are exceptionally powerful and make this a very difficult but not impossible outcome.
On the Accuracy and Parallelism of GPGPU-Powered Incremental Clustering Algorithms.

PubMed

Chen, Chunlei; He, Li; Zhang, Huixiang; Zheng, Hao; Wang, Lei

2017-01-01

Incremental clustering algorithms play a vital role in various applications such as massive data analysis and real-time data processing. Typical application scenarios of incremental clustering place high demands on the computing power of the hardware platform. Parallel computing is a common solution to meet this demand. Moreover, the General Purpose Graphic Processing Unit (GPGPU) is a promising parallel computing device. Nevertheless, incremental clustering algorithms face a dilemma between clustering accuracy and parallelism when they are powered by GPGPU. We formally analyzed the cause of this dilemma. First, we formalized concepts relevant to incremental clustering, such as evolving granularity. Second, we formally proved two theorems. The first theorem proves the relation between clustering accuracy and evolving granularity. Additionally, this theorem analyzes the upper and lower bounds of different-to-same mis-affiliation. Fewer occurrences of such mis-affiliation mean higher accuracy. The second theorem reveals the relation between parallelism and evolving granularity. Smaller work-depth means superior parallelism. Through the proofs, we conclude that the accuracy of an incremental clustering algorithm is negatively related to evolving granularity while parallelism is positively related to the granularity. Thus the contradictory relations cause the dilemma. Finally, we validated the relations through a demo algorithm. Experimental results verified the theoretical conclusions.
IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

PubMed

Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

2015-07-14

In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.
Low Back Pain Subgroups using Fear-Avoidance Model Measures: Results of a Cluster Analysis

PubMed Central

Beneciuk, Jason M.; Robinson, Michael E.; George, Steven Z.

2012-01-01

Objectives: The purpose of this secondary analysis was to test the hypothesis that an empirically derived psychological subgrouping scheme based on multiple Fear-Avoidance Model (FAM) constructs would provide additional capabilities for clinical outcomes in comparison to a single FAM construct. Methods: Patients (n = 108) with acute or sub-acute low back pain (LBP) enrolled in a clinical trial comparing behavioral physical therapy interventions to classification-based physical therapy completed baseline questionnaires for pain catastrophizing (PCS), fear-avoidance beliefs (FABQ-PA, FABQ-W), and patient-specific fear (FDAQ). Clinical outcomes were pain intensity and disability measured at baseline, 4 weeks, and 6 months. A hierarchical agglomerative cluster analysis was used to create distinct cluster profiles among FAM measures, and discriminant analysis was used to interpret clusters. Changes in clinical outcomes were investigated with repeated measures ANOVA, and differences in results based on cluster membership were compared to FABQ-PA subgrouping used in the original trial. Results: Three distinct FAM subgroups (Low Risk, High Specific Fear, and High Fear & Catastrophizing) emerged from cluster analysis. Subgroups differed on baseline pain and disability (p’s<.01) with the High Fear & Catastrophizing subgroup associated with greater pain than the Low Risk subgroup (p<.01) and the greatest disability (p’s<.05). Subgroup × time interactions were detected for both pain and disability (p’s<.05) with the High Fear & Catastrophizing subgroup reporting greater changes in pain and disability than other subgroups (p’s<.05). In contrast, FABQ-PA subgroups used in the original trial were not associated with interactions for clinical outcomes. Discussion: These data suggest that subgrouping based on multiple FAM measures may provide additional information on clinical outcomes in comparison to determining subgroup status by FABQ-PA alone.
Subgrouping methods for patients with LBP should include multiple psychological factors to further explore if patients can be matched with appropriate interventions. PMID:22510537
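The hierarchical agglomerative step named in the Methods can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say which linkage rule or distance was used, so average linkage and Euclidean distance are assumed here, and the two-questionnaire scores are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Average linkage and Euclidean distance are assumptions of this sketch.
    """
    clusters = [[i] for i in range(len(points))]  # start with one cluster per patient

    def avg_link(a, b):
        # Mean distance over all cross-cluster pairs of members.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: avg_link(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical standardized scores on two questionnaires, one tuple per patient.
scores = [(-1.0, -0.9), (-0.8, -1.1), (0.1, 1.0), (0.0, 1.2), (1.5, 1.4), (1.6, 1.6)]
groups = agglomerate(scores, 3)
```

With these toy scores the six patients fall into three pairs. In practice the number of clusters would be chosen from the merge history rather than fixed in advance.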
Hemodynamic Response to Interictal Epileptiform Discharges Addressed by Personalized EEG-fNIRS Recordings

PubMed Central

Pellegrino, Giovanni; Machado, Alexis; von Ellenrieder, Nicolas; Watanabe, Satsuki; Hall, Jeffery A.; Lina, Jean-Marc; Kobayashi, Eliane; Grova, Christophe

2016-01-01

Objective: We aimed to study the hemodynamic response (HR) to Interictal Epileptic Discharges (IEDs) using patient-specific and prolonged simultaneous ElectroEncephaloGraphy (EEG) and functional Near InfraRed Spectroscopy (fNIRS) recordings. Methods: The epileptic generator was localized using Magnetoencephalography source imaging. fNIRS montage was tailored for each patient, using an algorithm to optimize the sensitivity to the epileptic generator. Optodes were glued using collodion to achieve prolonged acquisition with high quality signal. fNIRS data analysis was handled with no a priori constraint on HR time course, averaging fNIRS signals to similar IEDs. Cluster-permutation analysis was performed on 3D reconstructed fNIRS data to identify significant spatio-temporal HR clusters. Standard (GLM with fixed HRF) and cluster-permutation EEG-fMRI analyses were performed for comparison purposes. Results: fNIRS detected HR to IEDs for 8/9 patients. It mainly consisted of oxy-hemoglobin increases (seven patients), followed by oxy-hemoglobin decreases (six patients). HR was lateralized in six patients and lasted from 8.5 to 30 s. Standard EEG-fMRI analysis detected an HR in 4/9 patients (4/9 without enough IEDs, 1/9 unreliable result). The cluster-permutation EEG-fMRI analysis restricted to the region investigated by fNIRS showed additional strong and non-canonical BOLD responses starting earlier than the IEDs and lasting up to 30 s. Conclusions: (i) EEG-fNIRS is suitable to detect the HR to IEDs and can outperform EEG-fMRI because of prolonged recordings and greater chance to detect IEDs; (ii) cluster-permutation analysis unveils additional HR features underestimated when imposing a canonical HR function; (iii) the HR is often bilateral and lasts up to 30 s. PMID:27047325
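The idea behind a cluster-permutation analysis can be sketched in one dimension. The study worked on 3D reconstructed spatio-temporal data, so this is a deliberate simplification. The one-sided cluster-mass statistic, the threshold of 2.0, the exhaustive sign-flip null, and the toy responses are all assumptions of the sketch, not details taken from the abstract.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def t_values(data):
    """One-sample t statistic at each time point (rows = recordings, columns = time)."""
    n = len(data)
    return [mean(col) / (stdev(col) / sqrt(n)) for col in zip(*data)]


def max_cluster_mass(t, threshold):
    """Largest sum of t over a contiguous run of time points with t > threshold."""
    best = run = 0.0
    for v in t:
        run = run + v if v > threshold else 0.0
        best = max(best, run)
    return best


def cluster_permutation_p(data, threshold=2.0):
    """Compare the observed cluster mass with its null under every sign flip of the rows."""
    observed = max_cluster_mass(t_values(data), threshold)
    null = []
    for signs in product((1, -1), repeat=len(data)):
        flipped = [[s * v for v in row] for s, row in zip(signs, data)]
        null.append(max_cluster_mass(t_values(flipped), threshold))
    return observed, sum(m >= observed for m in null) / len(null)


# Hypothetical averaged responses: six recordings, five time points,
# with a positive deflection at the three middle time points.
responses = [
    [0.1, 1.0, 1.2, 0.9, -0.2],
    [-0.3, 0.8, 1.1, 1.0, 0.1],
    [0.2, 1.1, 0.9, 1.2, -0.1],
    [-0.1, 0.9, 1.3, 0.8, 0.3],
    [0.3, 1.2, 1.0, 1.1, -0.3],
    [-0.2, 0.7, 1.4, 0.7, 0.2],
]
observed_mass, p_value = cluster_permutation_p(responses)
```

Because the statistic is the largest cluster mass over the whole time course, a single comparison against the null controls for testing many time points at once. This is why the approach needs no assumed response shape.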
Effect of Stagger on the Vibroacoustic Loads from Clustered Rockets

NASA Technical Reports Server (NTRS)

Rojo, Raymundo; Tinney, Charles E.; Ruf, Joseph H.

2016-01-01

The effect of stagger startup on the vibro-acoustic loads that form during the end-effects-regime of clustered rockets is studied using both full-scale (hot-gas) and laboratory scale (cold gas) data. Both configurations comprise three nozzles with thrust-optimized parabolic contours that undergo free shock separated flow and restricted shock separated flow as well as an end-effects regime prior to flowing full. Acoustic pressure waveforms recorded at the base of the nozzle clusters are analyzed using various statistical metrics as well as time-frequency analysis. The findings reveal a significant reduction in end-effects-regime loads when engine ignition is staggered. However, regardless of stagger, both the skewness and kurtosis of the acoustic pressure time derivative elevate to the same levels during the end-effects-regime event thereby demonstrating the intermittence and impulsiveness of the acoustic waveforms that form during engine startup.
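The two statistics mentioned above can be computed from a sampled pressure record as in the sketch below. The forward-difference derivative, the population (biased) moment estimators, and the sawtooth test signal are assumptions made for illustration; the report does not specify its estimators.

```python
from statistics import fmean


def time_derivative(p, dt):
    """Forward-difference estimate of dp/dt for a uniformly sampled signal."""
    return [(b - a) / dt for a, b in zip(p, p[1:])]


def skewness_kurtosis(x):
    """Return (skewness, kurtosis) from central moments; a Gaussian gives about (0, 3)."""
    m = fmean(x)
    m2 = fmean((v - m) ** 2 for v in x)
    m3 = fmean((v - m) ** 3 for v in x)
    m4 = fmean((v - m) ** 4 for v in x)
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical sawtooth-like waveform: steep rises followed by gradual decays.
pressure = [0.0, 4.0, 3.0, 2.0, 1.0, 0.0, 4.0, 3.0, 2.0, 1.0, 0.0]
skew, kurt = skewness_kurtosis(time_derivative(pressure, dt=1.0))
```

For this toy waveform the derivative has skewness 1.5 and kurtosis 3.25. Both exceed the Gaussian values of 0 and 3, which is the signature of steep, intermittent pressure rises.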
ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

NASA Technical Reports Server (NTRS)

Wharton, S. W.; Turner, B. J.

1981-01-01

An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.
Non-targeted analyses of animal plasma: betaine and choline represent the nutritional and metabolic status.

PubMed

Katayama, K; Sato, T; Arai, T; Amao, H; Ohta, Y; Ozawa, T; Kenyon, P R; Hickson, R E; Tazaki, H

2013-02-01

Simple liquid chromatography-mass spectrometry (LC-MS) was applied to non-targeted metabolic analyses to discover new metabolic markers in animal plasma. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were used to analyse LC-MS multivariate data. PCA clearly generated two separate clusters for artificially induced diabetic mice and healthy control mice. PLS-DA of time-course changes in plasma metabolites of chicks after feeding generated three clusters (pre- and immediately after feeding, 0.5-3 h after feeding and 4 h after feeding). Two separate clusters were also generated for plasma metabolites of pregnant Angus heifers with differing live-weight change profiles (gaining or losing). The accompanying PLS-DA loading plot detailed the metabolites that contribute the most to the cluster separation. In each case, the same highly hydrophilic metabolite was strongly correlated to the group separation. The metabolite was identified as betaine by LC-MS/MS. This result indicates that betaine and its metabolic precursor, choline, may be useful biomarkers to evaluate the nutritional and metabolic status of animals. © 2011 Blackwell Verlag GmbH.
Distribution-based fuzzy clustering of electrical resistivity tomography images for interface detection

NASA Astrophysics Data System (ADS)

Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.

2014-04-01

A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.
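The fuzzy c-means updates underlying this approach can be sketched for scalar data as below. Several choices here are assumptions: the fuzzifier m = 2, the hand-picked starting centroids (the paper derives its guidance from kernel density estimation, which is not reproduced), the toy log-resistivity values, and the reading of "reciprocal of cluster membership" as one over the largest membership.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each cluster; the values sum to 1."""
    d = [abs(x - c) for c in centers]
    if 0.0 in d:  # x sits exactly on a centroid: give it full membership there
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** p for dj in d) for di in d]


def fuzzy_c_means(data, centers, m=2.0, iters=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    for _ in range(iters):
        u = [memberships(x, centers, m) for x in data]
        centers = [
            sum(u[i][k] ** m * x for i, x in enumerate(data))
            / sum(u[i][k] ** m for i in range(len(data)))
            for k in range(len(centers))
        ]
    return centers, [memberships(x, centers, m) for x in data]


# Hypothetical log10 resistivities: a conductive cover layer and resistive bedrock.
log_rho = [1.0, 1.1, 1.2, 2.9, 3.0, 3.1]
centers, u = fuzzy_c_means(log_rho, centers=[1.5, 2.6])

# Assumed uncertainty measure: 1 means certain, 2 means fully ambiguous (two clusters).
uncertainty = [1.0 / max(row) for row in u]
boundary = memberships(2.05, centers)  # a value halfway between the two groups
```

A value near the midpoint of the two centroids receives memberships close to 0.5 each, so its assumed uncertainty approaches 2. Cells like this would mark the interface between cover and bedrock.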
The Impact of Clinical, Demographic and Risk Factors on Rates of HIV Transmission: A Population-based Phylogenetic Analysis in British Columbia, Canada

PubMed Central

Poon, Art F. Y.; Joy, Jeffrey B.; Woods, Conan K.; Shurgold, Susan; Colley, Guillaume; Brumme, Chanson J.; Hogg, Robert S.; Montaner, Julio S. G.; Harrigan, P. Richard

2015-01-01

Background. The diversification of human immunodeficiency virus (HIV) is shaped by its transmission history. We therefore used a population-based, province-wide HIV drug resistance database in British Columbia (BC), Canada, to evaluate the impact of clinical, demographic, and behavioral factors on rates of HIV transmission. Methods. We reconstructed molecular phylogenies from 27 296 anonymized bulk HIV pol sequences representing 7747 individuals in BC, about half the estimated HIV prevalence in BC. Infections were grouped into clusters based on phylogenetic distances, as a proxy for variation in transmission rates. Rates of cluster expansion were reconstructed from estimated dates of HIV seroconversion. Results. Our criteria grouped 4431 individuals into 744 clusters largely separated with respect to risk factors, including large established clusters predominated by injection drug users and more recently emerging clusters comprising men who have sex with men. The mean log10 viral load of an individual's phylogenetic neighborhood (composed of 5 other individuals with shortest phylogenetic distances) increased their odds of appearing in a cluster by >2-fold per log10 viruses per milliliter. Conclusions. Hotspots of ongoing HIV transmission can be characterized in near real time by the secondary analysis of HIV resistance genotypes, providing an important potential resource for targeting public health initiatives for HIV prevention. PMID:25312037
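One common way to group infections by phylogenetic distance is to link any two individuals whose distance falls under a cutoff and take the connected groups as clusters. The sketch below illustrates that idea only; the cutoff value, the identifiers, the distances, and the choice of single-linkage grouping are hypothetical, since the abstract does not give the study's actual criteria.

```python
def cluster_by_distance(ids, pairwise, cutoff):
    """Group ids into connected components of the 'distance <= cutoff' graph.

    Returns only groups with at least two members; unlinked individuals are dropped.
    """
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps later lookups short
            i = parent[i]
        return i

    for (a, b), d in pairwise.items():
        if d <= cutoff:
            parent[find(a)] = find(b)  # union the two groups

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Hypothetical individuals and pairwise tree distances (substitutions per site).
ids = ["A", "B", "C", "D", "E"]
pairwise = {
    ("A", "B"): 0.010,
    ("B", "C"): 0.015,
    ("A", "C"): 0.030,
    ("D", "E"): 0.050,
    ("C", "D"): 0.080,
}
clusters = cluster_by_distance(ids, pairwise, cutoff=0.02)
```

Here A, B, and C form one cluster even though A and C are further apart than the cutoff, because B links them. D and E remain unclustered.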
