Science.gov

Sample records for agglomerative cluster analysis

  1. Implementing Agglomerative Hierarchic Clustering Algorithms for Use in Document Retrieval.

    ERIC Educational Resources Information Center

    Voorhees, Ellen M.

    1986-01-01

    Describes a computerized information retrieval system that uses three agglomerative hierarchic clustering algorithms--single link, complete link, and group average link--and explains their implementations. It is noted that these implementations have been used to cluster a collection of 12,000 documents. (LRW)
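
    The three linkage criteria named in this record differ only in how the distance between two clusters is derived from the pairwise distances of their members. A minimal sketch, assuming Euclidean distance on small numeric vectors rather than whatever document-similarity measure the report uses; the names `cluster_distance` and `agglomerate` are illustrative, and the naive search shown here is not claimed to be the implementation the report describes:

```python
from itertools import combinations
from math import dist


def cluster_distance(a, b, points, method):
    """Distance between clusters a and b, given as lists of point indices."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # group average over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, method="single"):
    """Naive agglomeration: repeatedly merge the two closest clusters.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]], points, method),
        )
        d = cluster_distance(clusters[i], clusters[j], points, method)
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical toy data: four points on a line.
toy_points = [(0, 0), (1, 0), (5, 0), (7, 0)]
for name in ("single", "complete", "average"):
    print(name, agglomerate(toy_points, name)[-1])
```

    On this toy data the three criteria agree on the first two merges and differ only in the distance reported for the final merge, which is where the choice of linkage becomes visible.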

  2. Agglomerative clustering-based approach for two-dimensional phase unwrapping.

    PubMed

    Herráez, Miguel Arevalillo; Boticario, Jesús G; Lalor, Michael J; Burton, David R

    2005-03-01

    We describe a novel algorithm for two-dimensional phase unwrapping. The technique combines the principles of agglomerative clustering and use of heuristics to construct a discontinuous quality-guided path. Unlike other quality-guided algorithms, which establish the path at the start of the unwrapping process, our technique constructs the path as the unwrapping process evolves. This makes the technique less prone to error propagation, although it presents higher execution times than other existing algorithms. The algorithm reacts satisfactorily to random noise and breaks in the phase distribution. A variation of the algorithm is also presented that considerably reduces the execution time without affecting the results significantly.

  3. Hierarchical Regional Disparities and Potential Sector Identification Using Modified Agglomerative Clustering

    NASA Astrophysics Data System (ADS)

    Munandar, T. A.; Azhari; Mushdholifah, A.; Arsyad, L.

    2017-03-01

    Disparities in regional development are commonly identified using the Klassen Typology and the Location Quotient. Both methods typically use data on the gross regional domestic product (GRDP) sectors of a particular region. The Klassen approach can identify regional disparities by classifying the GRDP sector data into four classes, namely Quadrants I, II, III, and IV. Each quadrant indicates a certain level of regional disparity based on the GRDP sector value of the region. Meanwhile, the Location Quotient (LQ) is usually used to identify potential sectors in a particular region, that is, to determine which sectors have potential and which do not. LQ classifies each sector into three classes, namely the basic sector, the non-basic sector with a competitive advantage, and the non-basic sector which can only meet its own necessities. Neither the Klassen Typology nor LQ can clearly visualize the relationship between the development achievements of each region and sector. This research aimed to develop a new approach to the identification of disparities in regional development in the form of hierarchical clustering. The method of Hierarchical Agglomerative Clustering (HAC) was employed as the basis of the hierarchical clustering model for identifying disparities in regional development. Modifications were made to HAC using the Klassen Typology and LQ. HAC modified using the Klassen Typology was called MHACK, while HAC modified using LQ was called MACLoQ. The two algorithms can be used to identify regional disparities (MHACK) and potential sectors (MACLoQ), respectively, in the form of hierarchical clusters. Based on the MHACK in 31 regencies in Central Java Province, it is identified that 3 regencies (Demak, Jepara, and Magelang City) fall into the category of developed and rapidly-growing regions, while the other 28 regencies fall into the category of developed but depressed regions. Results of the MACLo
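
    The Location Quotient mentioned in this record is conventionally defined as a sector's share of regional output divided by the same sector's share of output in a reference area. A minimal sketch of that standard definition follows; the abstract does not give the formula or any thresholds, and the sector names and figures below are hypothetical:

```python
def location_quotient(region, reference):
    """Location Quotient per sector.

    region, reference: dicts mapping sector name -> output (e.g. GRDP value).
    LQ = (regional sector share) / (reference-area sector share).
    """
    region_total = sum(region.values())
    reference_total = sum(reference.values())
    return {
        sector: (region[sector] / region_total) / (reference[sector] / reference_total)
        for sector in region
    }


# Hypothetical figures, not taken from the study.
regency = {"agriculture": 30, "industry": 70}
province = {"agriculture": 200, "industry": 800}
print(location_quotient(regency, province))
```

    An LQ above 1 means the sector is more concentrated in the region than in the reference area; how the study maps LQ values onto its three classes is not stated in the abstract.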

  4. Classifying airborne radiometry data with Agglomerative Hierarchical Clustering: A tool for geological mapping in context of rainforest (French Guiana)

    NASA Astrophysics Data System (ADS)

    Martelet, G.; Truffert, C.; Tourlière, B.; Ledru, P.; Perrin, J.

    2006-09-01

    In highly weathered environments, it is crucial that geological maps provide information concerning both the regolith and the bedrock, for societal needs such as land-use, mineral or water resources management. Often, geologists face the challenge of upgrading existing maps, as relevant information concerning weathering processes and pedogenesis is currently missing. In rugged areas in particular, where access to the field is difficult, ground observations are sparsely available and therefore need to be complemented using methods based on remotely sensed data. For this purpose, we discuss the use of Agglomerative Hierarchical Clustering (AHC) on eU, K and eTh airborne gamma-ray spectrometry grids. The AHC process primarily allows the geophysical maps to be segmented into zones having coherent U, K and Th contents. The analysis of these contents is discussed in terms of geochemical signature for lithological attribution of classes, as is the use of a dendrogram, which gives indications of the hierarchical relations between classes. Unsupervised classification maps resulting from AHC can be considered as spatial models of the distribution of the radioelement content in surface and sub-surface formations. The source of gamma rays emanating from the ground is primarily related to the geochemistry of the bedrock and secondarily to modifications of the radioelement distribution by weathering and other secondary mechanisms, such as mobilisation by wind or water. The interpretation of the obtained predictive classified maps, their U, K, Th contents, and the dendrogram, in light of available geological knowledge, allows signatures related to regolith and solid geology to be separated. Consequently, classification maps can be integrated within a GIS environment and used by the geologist as a support for mapping bedrock lithologies and their alteration. We illustrate the AHC classification method in the region of Cayenne using high-resolution airborne radiometric data

  5. Combining analytical hierarchy process and agglomerative hierarchical clustering in search of expert consensus in green corridors development management.

    PubMed

    Shapira, Aviad; Shoshany, Maxim; Nir-Goldenberg, Sigal

    2013-07-01

    Environmental management and planning are instrumental in resolving conflicts arising between societal needs for economic development on the one hand and for open green landscapes on the other hand. Allocating green corridors between fragmented core green areas may provide a partial solution to these conflicts. Decisions regarding green corridor development require the assessment of alternative allocations based on multiple criteria evaluations. Analytical Hierarchy Process provides a methodology both for a structured and consistent extraction of such evaluations and for the search for consensus among experts regarding weights assigned to the different criteria. Implementing this methodology using 15 Israeli experts (landscape architects, regional planners, and geographers) revealed inherent differences in expert opinions in this field beyond professional divisions. The use of Agglomerative Hierarchical Clustering allowed the identification of clusters representing common decisions regarding criterion weights. Aggregating the evaluations of these clusters revealed an important dichotomy between a pragmatist approach that emphasizes the weight of statutory criteria and an ecological approach that emphasizes the role of the natural conditions in allocating green landscape corridors.

  6. Interactive Maximum Reliability Cluster Analysis.

    ERIC Educational Resources Information Center

    Mays, Robert

    1978-01-01

    A FORTRAN program for clustering variables using the alpha coefficient of reliability is described. For batch operation, a rule for stopping the agglomerative procedure is available. The conversational version of the program allows the user to intervene in the process in order to test the final solution for sensitivity to changes. (Author/JKS)
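
    The reliability criterion in this record can be illustrated with the usual coefficient alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). This is a sketch in Python under the assumption that the program's "alpha coefficient" is the standard coefficient alpha; it is not a port of the FORTRAN program, and the function name and scores are hypothetical:

```python
from statistics import pvariance


def coefficient_alpha(items):
    """Coefficient alpha for a cluster of variables.

    items: list of variables, each a list of scores for the same respondents.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two variables")
    # Total score per respondent across all variables in the cluster.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(pvariance(v) for v in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


# Hypothetical scores: two variables that rank four respondents identically.
print(coefficient_alpha([[1, 2, 3, 4], [1, 2, 3, 4]]))
```

    An agglomerative procedure built on this criterion would, at each step, merge the candidate clusters of variables whose union gives the highest alpha; the stopping rule the record mentions is not specified in the abstract.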

  7. Agglomerative percolation on the Bethe lattice and the triangular cactus

    NASA Astrophysics Data System (ADS)

    Chae, Huiseung; Yook, Soon-Hyung; Kim, Yup

    2013-08-01

    Agglomerative percolation (AP) on the Bethe lattice and the triangular cactus is studied to establish the exact mean-field theory for AP. Using the self-consistent simulation method based on the exact self-consistent equations, the order parameter P∞ and the average cluster size S are measured. From the measured P∞ and S, the critical exponents βk and γk for k = 2 and 3 are evaluated. Here, βk and γk are the critical exponents for P∞ and S when the growth of clusters spontaneously breaks the Zk symmetry of the k-partite graph. The obtained values are β2 = 1.79(3), γ2 = 0.88(1), β3 = 1.35(5) and γ3 = 0.94(2). By comparing these exponents with those for ordinary percolation (β∞ = 1 and γ∞ = 1), we also find β∞ < β3 < β2 and γ∞ > γ3 > γ2. These results quantitatively verify the conjecture that the AP model belongs to a new universality class if the Zk symmetry is broken spontaneously, and the new universality class depends on k.

  8. Use of multiple cluster analysis methods to explore the validity of a community outcomes concept map.

    PubMed

    Orsi, Rebecca

    2017-02-01

    Concept mapping is now a commonly-used technique for articulating and evaluating programmatic outcomes. However, research regarding validity of knowledge and outcomes produced with concept mapping is sparse. The current study describes quantitative validity analyses using a concept mapping dataset. We sought to increase the validity of concept mapping evaluation results by running multiple cluster analysis methods and then using several metrics to choose from among solutions. We present four different clustering methods based on analyses using the R statistical software package: partitioning around medoids (PAM), fuzzy analysis (FANNY), agglomerative nesting (AGNES) and divisive analysis (DIANA). We then used the Dunn and Davies-Bouldin indices to assist in choosing a valid cluster solution for a concept mapping outcomes evaluation. We conclude that the validity of the outcomes map is high, based on the analyses described. Finally, we discuss areas for further concept mapping methods research.
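
    The two validity indices named in this record can be computed directly from a finished partition. A minimal sketch, assuming Euclidean distance and centroid-based scatter for the Davies-Bouldin index; the study ran its analyses in R, so this Python version and its function names are illustrative only:

```python
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of a cluster of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def dunn_index(clusters):
    """Smallest between-cluster distance over largest within-cluster diameter.

    Higher is better. Needs at least one cluster with two distinct points.
    """
    separation = min(
        dist(p, q)
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
        for p in a
        for q in b
    )
    diameter = max(dist(p, q) for c in clusters for p in c for q in c)
    return separation / diameter


def davies_bouldin_index(clusters):
    """Mean, over clusters, of the worst scatter-to-separation ratio. Lower is better."""
    centers = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centers)]
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(len(clusters))
            if j != i
        )
        for i in range(len(clusters))
    ]
    return sum(worst) / len(worst)


# Hypothetical, well-separated one-dimensional partition.
partition = [[(0.0,), (1.0,)], [(10.0,), (11.0,)]]
print(dunn_index(partition), davies_bouldin_index(partition))
```

    Comparing candidate solutions then amounts to preferring the partition with the higher Dunn index and the lower Davies-Bouldin index.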

  9. Mining a Web Citation Database for Author Co-Citation Analysis.

    ERIC Educational Resources Information Center

    He, Yulan; Hui, Siu Cheung

    2002-01-01

    Proposes a mining process to automate author co-citation analysis based on the Web Citation Database, a data warehouse for storing citation indices of Web publications. Describes the use of agglomerative hierarchical clustering for author clustering and multidimensional scaling for displaying author cluster maps, and explains PubSearch, a…

  10. Cluster analysis of long time-series medical datasets

    NASA Astrophysics Data System (ADS)

    Hirano, Shoji; Tsumoto, Shusaku

    2004-04-01

    This paper presents a comparative study of the characteristics of clustering methods for inhomogeneous time-series medical datasets. Using various combinations of comparison methods and grouping methods, we performed clustering experiments on the hepatitis data set and evaluated the validity of the results. The results suggested that (1) the complete-linkage (CL) criterion in agglomerative hierarchical clustering (AHC) outperformed the average-linkage (AL) criterion in terms of the interpretability of the dendrogram and clustering results, (2) the combination of dynamic time warping (DTW) and CL-AHC consistently produced interpretable results, (3) the combination of DTW and rough clustering (RC) could be used to find the core sequences of the clusters, and (4) multiscale matching may suffer from the treatment of 'no-match' pairs; however, the problem may be avoided by using RC as a subsequent grouping method.
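
    Dynamic time warping, the comparison method paired with complete linkage in this record, aligns two sequences of possibly different length before summing their pointwise differences. A minimal sketch of the textbook dynamic-programming recurrence, assuming scalar samples and absolute difference as the local cost; the study's exact cost function and any windowing constraints are not given in the abstract:

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i samples of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical series: the second repeats one sample of the first.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

    The resulting pairwise DTW distances can be fed to an agglomerative procedure in place of Euclidean distances, which is what makes the method usable on series of unequal length.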

  11. Method for preventing plugging in the pyrolysis of agglomerative coals

    DOEpatents

    Green, Norman W.

    1979-01-23

    To prevent plugging in a pyrolysis operation where an agglomerative coal in a nondeleteriously reactive carrier gas is injected as a turbulent jet from an opening into an elongate pyrolysis reactor, the coal is comminuted to a size where the particles under operating conditions will detackify prior to contact with internal reactor surfaces while a secondary flow of fluid is introduced along the peripheral inner surface of the reactor to prevent backflow of the coal particles. The pyrolysis operation is depicted by two equations which enable preselection of conditions which insure prevention of reactor plugging.

  12. Cluster Analysis and Web-Based 3-D Visualization of Large-scale Geophysical Data

    NASA Astrophysics Data System (ADS)

    Kadlec, B. J.; Yuen, D. A.; Bollig, E. F.; Dzwinel, W.; da Silva, C. R.

    2004-05-01

    We present a problem-solving environment WEB-IS (Web-based Data Interrogative System), which we have developed for remote analysis and visualization of geophysical data [Garbow et al., 2003]. WEB-IS employs agglomerative clustering methods intended for feature extraction and studying the predictions of large magnitude earthquake events. Data-mining is accomplished using a mutual nearest neighbor (MNN) algorithm for extracting event clusters of different density and shapes based on a hierarchical proximity measure. Clustering schemes used in molecular dynamics [Da Silva et al., 2002] are also considered for increasing computational efficiency using a linked cell algorithm for creating a Verlet neighbor list (VNL) and extracting different cluster structures by applying a canonical backtracking search on the VNL. Space and time correlations between the events are visualized dynamically in 3-D through a filter by showing clusters at different timescales according to defined units of time ranging from days to years. This WEB-IS functionality was tested both on synthetic [Eneva and Ben-Zion, 1997] and actual earthquake catalogs of Japanese earthquakes and can be applied to the soft-computing data mining methods used in hydrology and geoinformatics. Da Silva, C.R.S., Justo, J.F., Fazzio, A., Phys Rev B, vol. 65, 2002. Eneva, M., Ben-Zion, Y., J. Geophys. Res., 102, 17785-17795, 1997. Garbow, Z.A., Yuen, D.A., Erlebacher, G., Bollig, E.F., Kadlec, B.J., Vis. Geosci., 2003.

  13. Overview on techniques in cluster analysis.

    PubMed

    Frades, Itziar; Matthiesen, Rune

    2010-01-01

    Clustering is the unsupervised, semisupervised, and supervised classification of patterns into groups. The clustering problem has been addressed in many contexts and disciplines. Cluster analysis encompasses different methods and algorithms for grouping objects of similar kinds into respective categories. In this chapter, we describe a number of methods and algorithms for cluster analysis in a stepwise framework. The steps of a typical clustering analysis process include sequentially pattern representation, the choice of the similarity measure, the choice of the clustering algorithm, the assessment of the output, and the representation of the clusters.

  14. The SMART CLUSTER METHOD - adaptive earthquake cluster analysis and declustering

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas; Daniell, James; Wenzel, Friedemann

    2016-04-01

    Earthquake declustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity, with usual applications comprising probabilistic seismic hazard assessments (PSHAs) and earthquake prediction methods. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation. Various methods have been developed by other researchers to address this issue. These range in complexity from rather simple statistical window methods to complex epidemic models. This study introduces the smart cluster method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal identification. Hereby, an adaptive search algorithm for data point clusters is adopted. It uses the earthquake density in the spatio-temporal neighbourhood of each event to adjust the search properties. The identified clusters are subsequently analysed to determine directional anisotropy, focussing on a strong correlation along the rupture plane, and the search space is adjusted with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010/2011 Darfield-Christchurch events, an adaptive classification procedure is applied to disassemble subsequent ruptures which may have been grouped into an individual cluster using near-field searches, support vector machines and temporal splitting. The steering parameters of the search behaviour are linked to local earthquake properties like magnitude of completeness, earthquake density and Gutenberg-Richter parameters. The method is capable of identifying and classifying earthquake clusters in space and time. It is tested and validated using earthquake data from California and New Zealand. As a result of the cluster identification process, each event in

  15. Multi-viewpoint clustering analysis

    NASA Technical Reports Server (NTRS)

    Mehrotra, Mala; Wild, Chris

    1993-01-01

    In this paper, we address the feasibility of partitioning rule-based systems into a number of meaningful units to enhance the comprehensibility, maintainability and reliability of expert systems software. Preliminary results have shown that no single structuring principle or abstraction hierarchy is sufficient to understand complex knowledge bases. We therefore propose the Multi View Point - Clustering Analysis (MVP-CA) methodology to provide multiple views of the same expert system. We present the results of using this approach to partition a deployed knowledge-based system that navigates the Space Shuttle's entry. We also discuss the impact of this approach on verification and validation of knowledge-based systems.

  16. Agglomerative Epigenetic Aberrations are a Common Event in Human Breast Cancer

    PubMed Central

    Novak, Petr; Jensen, Taylor; Oshiro, Marc M; Watts, George S; Kim, Christina J; Futscher, Bernard W

    2009-01-01

    Changes in DNA methylation patterns are a common characteristic of cancer cells. Recent studies suggest that DNA methylation affects not only discrete genes, but it can also affect large chromosomal regions, potentially leading to long range epigenetic silencing. It is unclear whether such long-range epigenetic events are relatively rare or frequent occurrences in cancer. Here we use a high-resolution promoter tiling array approach to analyze DNA methylation in breast cancer specimens and normal breast tissue to address this question. We identified 3506 cancer specific differentially methylated regions (DMR) in human breast cancer with 2033 being hypermethylation events and 1473 hypomethylation events. Most of these DMRs are recurrent in breast cancer; 90% of the identified DMRs occurred in at least 33% of the samples. Interestingly, we found a non-random spatial distribution of aberrantly methylated regions across the genome that showed a tendency to concentrate in relatively small genomic regions. Such agglomerates of hyper- and hypomethylated DMRs spanned up to several hundred kilobases and were frequently found at gene family clusters. The hypermethylation events usually occurred in the proximity of the transcription start site in CpG island promoters while hypomethylation events were frequently found in regions of segmental duplication. One example of a newly discovered agglomerate of hypermethylated DMRs associated with gene silencing in breast cancer that we examined in greater detail involved the protocadherin gene family clusters on chromosome 5 (PCDHA, PCDHB, and PCDHG). Taken together, our results suggest that agglomerative epigenetic aberrations are frequent events in human breast cancer. PMID:18922938

  17. DICON: interactive visual analysis of multidimensional clusters.

    PubMed

    Cao, Nan; Gotz, David; Sun, Jimeng; Qu, Huamin

    2011-12-01

    Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis.

  18. Modified distance in average linkage based on M-estimator and MADn criteria in hierarchical cluster analysis

    NASA Astrophysics Data System (ADS)

    Muda, Nora; Othman, Abdul Rahman

    2015-10-01

    The process of grouping a set of objects into classes of similar objects is called clustering. It divides a large group of observations into smaller groups so that the observations within each group are relatively similar and the observations in different groups are relatively dissimilar. In this study, an agglomerative method in hierarchical cluster analysis is chosen and clusters are constructed using an average linkage technique. Average linkage requires a distance between clusters, which is calculated as the average distance between all pairs of points, one from each group. This average distance is not robust when an outlier is present, so it needs to be modified to overcome the outlier problem. To that end, an outlier detection criterion based on MADn is used and the average distance is recalculated without the outlier. Next, the distance in average linkage is calculated based on a modified one-step M-estimator (MOM). The resulting clusters are presented in a dendrogram. To evaluate the goodness of the modified distance in average linkage clustering, a bootstrap analysis is conducted on the dendrogram and the bootstrap value (BP) is assessed for each branch that forms a group, to ensure the reliability of the branches constructed. This study found that the average linkage technique with modified distance is significantly superior to the usual average linkage technique if there is an outlier. The two techniques are said to be similar if there is no outlier.
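
    The idea of recalculating the average-linkage distance without outliers can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it screens the pairwise distances themselves, takes MADn as 1.4826 times the median absolute deviation, and uses a cutoff of 2.24, a value often quoted alongside the modified one-step M-estimator. The abstract does not say which quantity is screened or which cutoff is used, and the function names and data are hypothetical:

```python
from statistics import median


def madn(values):
    """Normalised median absolute deviation (MAD scaled by 1.4826)."""
    med = median(values)
    return 1.4826 * median(abs(v - med) for v in values)


def robust_average_linkage(cluster_a, cluster_b, threshold=2.24):
    """Average pairwise distance between two 1-D clusters, ignoring flagged outliers.

    Assumption: a pairwise distance d is an outlier when
    |d - median| / MADn exceeds the threshold.
    """
    distances = [abs(x - y) for x in cluster_a for y in cluster_b]
    med = median(distances)
    scale = madn(distances)
    if scale == 0:
        kept = distances  # no spread to judge against; fall back to the plain average
    else:
        kept = [d for d in distances if abs(d - med) / scale <= threshold]
    return sum(kept) / len(kept)


# Hypothetical clusters; the value 100 in the second cluster is an outlier.
a, b = [0, 1, 2], [10, 11, 100]
plain = sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))
print(plain, robust_average_linkage(a, b))
```

    In this toy case the single outlying point inflates the plain average distance roughly fivefold, while the screened average stays close to the distance between the bulk of the two groups.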

  19. Deterministic algorithm with agglomerative heuristic for location problems

    NASA Astrophysics Data System (ADS)

    Kazakovtsev, L.; Stupina, A.

    2015-10-01

    The authors consider the clustering problem solved with the k-means method and the p-median problem with various distance metrics. The p-median problem, and the k-means problem as its special case, are the most popular models of location theory. They are used for solving clustering problems and many practically important logistics problems, such as the optimal location of factories or warehouses, oil or gas wells, offshore oil drilling, and steam generators in heavy oil fields. The authors propose a new deterministic heuristic algorithm based on ideas from Information Bottleneck Clustering and genetic algorithms with a greedy heuristic. In this paper, results of running the new algorithm on various data sets are given in comparison with known deterministic and stochastic methods. The new algorithm is shown to be significantly faster than the Information Bottleneck Clustering method while achieving comparable precision.

  20. Cluster analysis of multiple planetary flow regimes

    NASA Technical Reports Server (NTRS)

    Mo, Kingtse; Ghil, Michael

    1987-01-01

    A modified cluster analysis method was developed to identify spatial patterns of planetary flow regimes, and to study transitions between them. This method was applied first to a simple deterministic model and second to Northern Hemisphere (NH) 500 mb data. The dynamical model is governed by the fully-nonlinear, equivalent-barotropic vorticity equation on the sphere. Clusters of points in the model's phase space are associated with either a few persistent or many transient events. Two stationary clusters have patterns similar to unstable stationary model solutions, zonal, or blocked. Transient clusters of wave trains serve as way stations between the stationary ones. For the NH data, cluster analysis was performed in the subspace of the first seven empirical orthogonal functions (EOFs). Stationary clusters are found in the low-frequency band of more than 10 days, and transient clusters in the bandpass frequency window between 2.5 and 6 days. In the low-frequency band three pairs of clusters determine, respectively, EOFs 1, 2, and 3. They exhibit well-known regional features, such as blocking, the Pacific/North American (PNA) pattern and wave trains. Both model and low-pass data show strong bimodality. Clusters in the bandpass window show wave-train patterns in the two jet exit regions. They are related, as in the model, to transitions between stationary clusters.

  1. Data Clustering

    NASA Astrophysics Data System (ADS)

    Wagstaff, Kiri L.

    2012-03-01

    particular application involves considerations of the kind of data being analyzed, algorithm runtime efficiency, and how much prior knowledge is available about the problem domain, which can dictate the nature of clusters sought. Fundamentally, the clustering method and its representations of clusters carries with it a definition of what a cluster is, and it is important that this be aligned with the analysis goals for the problem at hand. In this chapter, I emphasize this point by identifying for each algorithm the cluster representation as a model, m_j , even for algorithms that are not typically thought of as creating a “model.” This chapter surveys a basic collection of clustering methods useful to any practitioner who is interested in applying clustering to a new data set. The algorithms include k-means (Section 25.2), EM (Section 25.3), agglomerative (Section 25.4), and spectral (Section 25.5) clustering, with side mentions of variants such as kernel k-means and divisive clustering. The chapter also discusses each algorithm’s strengths and limitations and provides pointers to additional in-depth reading for each subject. Section 25.6 discusses methods for incorporating domain knowledge into the clustering process. This chapter concludes with a brief survey of interesting applications of clustering methods to astronomy data (Section 25.7). The chapter begins with k-means because it is both generally accessible and so widely used that understanding it can be considered a necessary prerequisite for further work in the field. EM can be viewed as a more sophisticated version of k-means that uses a generative model for each cluster and probabilistic item assignments. Agglomerative clustering is the most basic form of hierarchical clustering and provides a basis for further exploration of algorithms in that vein. Spectral clustering permits a departure from feature-vector-based clustering and can operate on data sets instead represented as affinity, or similarity

  2. Nursing home care quality: a cluster analysis.

    PubMed

    Grøndahl, Vigdis Abrahamsen; Fagerli, Liv Berit

    2017-02-13

    Purpose: The purpose of this paper is to explore potential differences in how nursing home residents rate care quality and to explore cluster characteristics. Design/methodology/approach: A cross-sectional design was used, with one questionnaire including questions from quality from patients' perspective and Big Five personality traits, together with questions related to socio-demographic aspects and health condition. Residents (n=103) from four Norwegian nursing homes participated (74.1 per cent response rate). Hierarchical cluster analysis identified clusters with respect to care quality perceptions. χ² tests and one-way between-groups ANOVA were performed to characterise the clusters (p<0.05). Findings: Two clusters were identified; Cluster 1 residents (28.2 per cent) had the best care quality perceptions and Cluster 2 (67.0 per cent) had the worst perceptions. The clusters were statistically significant and characterised by personal-related conditions: gender, psychological well-being, preferences, admission, satisfaction with staying in the nursing home, emotional stability and agreeableness, and by external objective care conditions: healthcare personnel and registered nurses. Research limitations/implications: Residents assessed as having no cognitive impairments were included, thus excluding the largest group. By choosing questionnaire design and structured interviews, the number able to participate may increase. Practical implications: Findings may provide healthcare personnel and managers with increased knowledge on which to develop strategies to improve specific care quality perceptions. Originality/value: Cluster analysis can be an effective tool for differentiating between nursing homes residents' care quality perceptions.

  3. [Cluster analysis and its application].

    PubMed

    Půlpán, Zdenĕk

    2002-01-01

    The study exploits knowledge-oriented and context-based modifications of well-known (fuzzy) clustering algorithms. Fuzzy sets are also inherently suited to coping with linguistic domain knowledge. We aim to obtain, from rich and diverse data and knowledge, new information about the environment being explored.

  4. Cluster Analysis of Adolescent Blogs

    ERIC Educational Resources Information Center

    Liu, Eric Zhi-Feng; Lin, Chun-Hung; Chen, Feng-Yi; Peng, Ping-Chuan

    2012-01-01

    Emerging web applications and networking systems such as blogs have become popular, and they offer unique opportunities and environments for learners, especially for adolescent learners. This study attempts to explore the writing styles and genres used by adolescents in their blogs by employing content, factor, and cluster analyses. Factor…

  5. Using cluster analysis to explore survey data.

    PubMed

    Spencer, Llinos; Roberts, Gwerfyl; Irvine, Fiona; Jones, Peter; Baker, Colin

    2007-01-01

    Llinos Haf Spencer reports on the use of the cluster analysis statistical technique in nursing research and uses data from the Welsh Language Awareness in Healthcare Provision in Wales survey as an exemplar. She concludes that cluster analysis is a valuable tool to tease out patterns in data that are not initially evident in bivariate analyses and thus should be considered as a viable option for nursing research.

  6. Digital image analysis of haematopoietic clusters.

    PubMed

    Benzinou, A; Hojeij, Y; Roudot, A-C

    2005-02-01

    Counting and differentiating cell clusters is a tedious task when performed with a light microscope. Moreover, biased counts and interpretation are difficult to avoid because the limits between different types of clusters are hard to evaluate. Presented here is a computer-based application able to solve these problems. The image analysis system is entirely automatic, from the stage screening to the statistical analysis of the results of each experimental plate. Good correlations are found with measurements made by a specialised technician.

  7. Cluster Analysis of the Malaysian Hipposideros

    NASA Astrophysics Data System (ADS)

    Sazali, Siti Nurlydia; Laman, Charlie J.; Abdullah, M. T.

    2008-01-01

    A preliminary study on the morphometric variations among species in the genus Hipposideros was conducted using voucher specimens from the Universiti Malaysia Sarawak (UNIMAS) Zoological Museum and the Department of Wildlife and National Park (DWNP) Kuala Lumpur. A total of 24 individuals from six species of this genus were morphologically studied, with all related body, skull and dental measurements recorded. The cluster analysis of the statistical data shows that the genus Hipposideros is divided into two major clusters in which each species is clearly separated. Cluster analysis among Hipposideros species is useful for aiding species identification.

  8. Cluster Analysis and Clinical Asthma Phenotypes

    PubMed Central

    Shaw, Dominic E.; Berry, Michael A.; Thomas, Michael; Brightling, Christopher E.; Wardlaw, Andrew J.

    2014-01-01

    Rationale: Heterogeneity in asthma expression is multidimensional, including variability in clinical, physiologic, and pathologic parameters. Classification requires consideration of these disparate domains in a unified model. Objectives: To explore the application of a multivariate mathematical technique, k-means cluster analysis, for identifying distinct phenotypic groups. Methods: We performed k-means cluster analysis in three independent asthma populations. Clusters of a population managed in primary care (n = 184) with predominantly mild to moderate disease, were compared with a refractory asthma population managed in secondary care (n = 187). We then compared differences in asthma outcomes (exacerbation frequency and change in corticosteroid dose at 12 mo) between clusters in a third population of 68 subjects with predominantly refractory asthma, clustered at entry into a randomized trial comparing a strategy of minimizing eosinophilic inflammation (inflammation-guided strategy) with standard care. Measurements and Main Results: Two clusters (early-onset atopic and obese, noneosinophilic) were common to both asthma populations. Two clusters characterized by marked discordance between symptom expression and eosinophilic airway inflammation (early-onset symptom predominant and late-onset inflammation predominant) were specific to refractory asthma. Inflammation-guided management was superior for both discordant subgroups leading to a reduction in exacerbation frequency in the inflammation-predominant cluster (3.53 [SD, 1.18] vs. 0.38 [SD, 0.13] exacerbation/patient/yr, P = 0.002) and a dose reduction of inhaled corticosteroid in the symptom-predominant cluster (mean difference, 1,829 μg beclomethasone equivalent/d [95% confidence interval, 307–3,349 μg]; P = 0.02). Conclusions: Cluster analysis offers a novel multidimensional approach for identifying asthma phenotypes that exhibit differences in clinical response to treatment algorithms. PMID:18480428

  9. Correcting an analysis of variance for clustering.

    PubMed

    Hedges, Larry V; Rhoads, Christopher H

    2011-02-01

    A great deal of educational and social data arises from cluster sampling designs where clusters involve schools, classrooms, or communities. A mistake that is sometimes encountered in the analysis of such data is to ignore the effect of clustering and analyse the data as if it were based on a simple random sample. This typically leads to an overstatement of the precision of results and too liberal conclusions about precision and statistical significance of mean differences. This paper gives simple corrections to the test statistics that would be computed in an analysis of variance if clustering were (incorrectly) ignored. The corrections are multiplicative factors depending on the total sample size, the cluster size, and the intraclass correlation structure. For example, the corrected F statistic has Fisher's F distribution with reduced degrees of freedom. The corrected statistic reduces to the F statistic computed by ignoring clustering when the intraclass correlations are zero. It reduces to the F statistic computed using cluster means when the intraclass correlations are unity, and it is in between otherwise. A similar adjustment to the usual statistic for testing a linear contrast among group means is described.
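
    The abstract above does not reproduce the paper's correction factors, so they are not reconstructed here. As a rough illustration of why ignoring clustering overstates precision, the sketch below uses the standard Kish design effect, 1 + (n - 1) * rho, for equal-sized clusters. This is a simpler, related quantity and not the authors' corrected F statistic; the function names are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (n - 1) * rho.

    Illustrative only: this is NOT the correction from the paper above.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(total_n: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical design: 40 classrooms of 25 pupils each.
    for rho in (0.0, 0.2, 1.0):
        print(rho, round(effective_sample_size(1000, 25, rho), 1))
```

    The two limits mirror the behaviour described in the abstract: with an intraclass correlation of zero nothing changes, and with an intraclass correlation of one the sample is worth only as many observations as there are clusters.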

  10. ASteCA: Automated Stellar Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Perren, G. I.; Vázquez, R. A.; Piatti, A. E.

    2015-04-01

    We present the Automated Stellar Cluster Analysis package (ASteCA), a suite of tools designed to fully automate the standard tests applied on stellar clusters to determine their basic parameters. The set of functions included in the code make use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with their uncertainties. To validate the code we applied it to a large set of over 400 synthetic MASSCLEAN clusters with varying degrees of field star contamination as well as a smaller set of 20 observed Milky Way open clusters (Berkeley 7, Bochum 11, Czernik 26, Czernik 30, Haffner 11, Haffner 19, NGC 133, NGC 2236, NGC 2264, NGC 2324, NGC 2421, NGC 2627, NGC 6231, NGC 6383, NGC 6705, Ruprecht 1, Tombaugh 1, Trumpler 1, Trumpler 5 and Trumpler 14) studied in the literature. The results show that ASteCA is able to recover cluster parameters with an acceptable precision even for those clusters affected by substantial field star contamination. ASteCA is written in Python and is made available as an open source code which can be downloaded ready to be used from its official site.

  11. Optimal wavelength band clustering for multispectral iris recognition.

    PubMed

    Gong, Yazhuo; Zhang, David; Shi, Pengfei; Yan, Jingqi

    2012-07-01

    This work explores the possibility of clustering spectral wavelengths based on the maximum dissimilarity of iris textures. The eventual goal is to determine how many bands of spectral wavelengths will be enough for iris multispectral fusion and to find these bands that will provide higher performance of iris multispectral recognition. A multispectral acquisition system was first designed for imaging the iris at narrow spectral bands in the range of 420 to 940 nm. Next, a set of 60 human iris images that correspond to the right and left eyes of 30 different subjects were acquired for an analysis. Finally, we determined that 3 clusters were enough to represent the 10 feature bands of spectral wavelengths using the agglomerative clustering based on two-dimensional principal component analysis. The experimental results suggest (1) the number, center, and composition of clusters of spectral wavelengths and (2) the higher performance of iris multispectral recognition based on a three wavelengths-bands fusion.

  12. Cluster and constraint analysis in tetrahedron packings.

    PubMed

    Jin, Weiwei; Lu, Peng; Liu, Lufeng; Li, Shuixiang

    2015-04-01

    The disordered packings of tetrahedra often show no obvious macroscopic orientational or positional order for a wide range of packing densities, and it has been found that the local order in particle clusters is the main order form of tetrahedron packings. Therefore, a cluster analysis is carried out to investigate the local structures and properties of tetrahedron packings in this work. We obtain a cluster distribution of differently sized clusters, and peaks are observed at two special clusters, i.e., dimer and wagon wheel. We then calculate the amounts of dimers and wagon wheels, which are observed to have linear or approximate linear correlations with packing density. Following our previous work, the amount of particles participating in dimers is used as an order metric to evaluate the order degree of the hierarchical packing structure of tetrahedra, and an order map is consequently depicted. Furthermore, a constraint analysis is performed to determine the isostatic or hyperstatic region in the order map. We employ a Monte Carlo algorithm to test jamming and then suggest a new maximally random jammed packing of hard tetrahedra from the order map with a packing density of 0.6337.

  13. Identifying Peer Institutions Using Cluster Analysis

    ERIC Educational Resources Information Center

    Boronico, Jess; Choksi, Shail S.

    2012-01-01

    The New York Institute of Technology's (NYIT) School of Management (SOM) wishes to develop a list of peer institutions for the purpose of benchmarking and monitoring/improving performance against other business schools. The procedure utilizes relevant criteria for the purpose of establishing this peer group by way of a cluster analysis. The…

  14. Systematization of actinides using cluster analysis

    SciTech Connect

    Kopyrin, A.A.; Terent`eva, T.N.; Khramov, N.N.

    1994-11-01

    A representation of the actinides in multidimensional property space is proposed for systematization of these elements using cluster analysis. Literature data for their atomic properties are used. Owing to the wide variation of published ionization potentials, medians are used to estimate them. Vertical dendrograms are used for classification on the basis of distances between the actinides in atomic-property space. The properties of actinium and lawrencium are furthest removed from the main group. Thorium and mendelevium exhibit individualized properties. A cluster based on the einsteinium-fermium pair is joined by californium.

  15. Electrical Load Profile Analysis Using Clustering Techniques

    NASA Astrophysics Data System (ADS)

    Damayanti, R.; Abdullah, A. G.; Purnama, W.; Nandiyanto, A. B. D.

    2017-03-01

    Data mining is a data processing technique for extracting information from a set of stored data. Every day, electricity load consumption is recorded by the electrical company, usually at intervals of 15 or 30 minutes. This paper uses clustering, one of the data mining techniques, to analyse the electrical load profiles during 2014. Three clustering methods were compared, namely K-Means (KM), Fuzzy C-Means (FCM), and K-Harmonic Means (KHM). The results show that KHM is the most appropriate method for classifying the electrical load profiles. The optimum number of clusters is determined using the Davies-Bouldin Index. Grouping the load profiles makes it possible to analyse demand variation and to estimate energy loss for each group of load profiles with a similar pattern. Each group yields a cluster load factor and a range of cluster loss factors, which help to find the range of coefficient values for estimating energy loss without performing load flow studies.
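
    The abstract above selects the number of clusters with the Davies-Bouldin Index. The sketch below shows one common form of that index (mean distance to the centroid as the scatter term, Euclidean distance between centroids), assuming clusters are given as lists of numeric tuples. It is a minimal illustration, not the authors' code, and the function names are hypothetical.

```python
from math import dist  # Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition; lower values mean better separation.

    clusters: list of clusters, each a non-empty list of points (tuples).
    Raises ZeroDivisionError if two centroids coincide.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = [centroid(c) for c in clusters]
    # Scatter of a cluster: average distance of its points to its centroid.
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst_ratios = [
        max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst_ratios) / k


if __name__ == "__main__":
    compact = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
    mixed = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
    print(davies_bouldin(compact), davies_bouldin(mixed))
```

    In practice the index would be computed for each candidate number of clusters, and the partition with the smallest value kept.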

  16. AMOEBA clustering revisited. [cluster analysis, classification, and image display program

    NASA Technical Reports Server (NTRS)

    Bryant, Jack

    1990-01-01

    A description of the clustering, classification, and image display program AMOEBA is presented. Using a difficult high resolution aircraft-acquired MSS image, the steps the program takes in forming clusters are traced. A number of new features are described here for the first time. Usage of the program is discussed. The theoretical foundation (the underlying mathematical model) is briefly presented. The program can handle images of any size and dimensionality.

  17. Equivalent damage validation by variable cluster analysis

    NASA Astrophysics Data System (ADS)

    Drago, Carlo; Ferlito, Rachele; Zucconi, Maria

    2016-06-01

    The main aim of this work is to perform a clustering analysis of the damage surveyed in the old center of L'Aquila after the earthquake that occurred on April 6, 2009, and to validate an Indicator of Equivalent Damage (ED) that summarizes the information reported on the AeDES card regarding the level of damage and its extent on the surface of the buildings. In particular, we used a sample of 13442 masonry buildings located in an area characterized by a Macroseismic Intensity equal to 8 [1]. The aim is to ensure coherence between the clusters, and their hierarchy, identified in the detected damage data and in the elaborated ED data.

  18. Chaotic map clustering algorithm for EEG analysis

    NASA Astrophysics Data System (ADS)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.

  19. Using cluster analysis of cytokines to identify patterns of inflammation in hospitalized patients with community-acquired pneumonia: a pilot study

    PubMed Central

    Wiemken, Timothy L; Kelley, Robert R; Fernandez-Botran, Rafael; Mattingly, William A.; Arnold, Forest W.; Furmanek, Stephen P; Restrepo, Marcos I; Chalmers, James D; Peyrani, Paula; Cavallazzi, Rodrigo; Bordon, Jose; Aliberti, Stefano; Ramirez, Julio A.

    2017-01-01

    Introduction: Patients with severe community-acquired pneumonia (CAP) are believed to have an exaggerated inflammatory response to bacterial infection. Therapies aiming to modulate the inflammatory response have been largely unsuccessful, perhaps reflecting that CAP is a heterogeneous disorder that cannot be modulated by a single anti-inflammatory approach. We hypothesize that the host inflammatory response to pneumonia may be characterized by distinct cytokine patterns, which can be harnessed for personalized therapies. Methods: Here, we use hierarchical cluster analysis of cytokines to examine if patterns of inflammatory response in 13 hospitalized patients with CAP can be defined. This was a secondary data analysis of the Community-Acquired Pneumonia Inflammatory Study Group (CAPISG) database. The following cytokines were measured in plasma and sputum on the day of admission: interleukin (IL)-1β, IL-1 receptor antagonist (IL-1ra), IL-6, CXCL8 (IL-8), IL-10, IL-12p40, IL-17, interferon (IFN)γ, tumor necrosis factor (TNF)α, and CXCL10 (IP-10). Hierarchical agglomerative clustering algorithms were used to evaluate clusters of patients within plasma and sputum cytokine determinations. Results: A total of thirteen patients were included in this pilot study. Cluster analysis identified distinct inflammatory response patterns of cytokines in the plasma, sputum, and the ratio of plasma to sputum. Conclusions: Inflammatory response patterns in plasma and sputum can be identified in hospitalized patients with CAP. Characterization of the local and systemic inflammatory response may help to better discriminate patients for enrollment into clinical trials of immunomodulatory therapies. PMID:28393141

  20. Constructing storyboards based on hierarchical clustering analysis

    NASA Astrophysics Data System (ADS)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  1. Accelerating DNA analysis applications on GPU clusters

    SciTech Connect

    Tumeo, Antonino; Villa, Oreste

    2010-06-13

    DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery can provide, in a few hours, large input streams of data that need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and the reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple pattern matching algorithm often at the base of this application. High performance systems are a promising platform to accelerate this algorithm, which is computationally intensive but also inherently parallel. Nowadays, high performance systems also include heterogeneous processing elements, such as Graphic Processing Units (GPUs), to further accelerate parallel algorithms. Unfortunately, the Aho-Corasick algorithm exhibits large performance variabilities, depending on the size of the input streams, on the number of patterns to search and on the number of matches, and poses significant challenges on current high performance software and hardware implementations. An adequate mapping of the algorithm on the target architecture, coping with the limits of the underlying hardware, is required to reach the desired high throughputs. Load balancing also plays a crucial role when considering the limited bandwidth among the nodes of these systems. In this paper we present an efficient implementation of the Aho-Corasick algorithm for high performance clusters accelerated with GPUs. We discuss how we partitioned and adapted the algorithm to fit the Tesla C1060 GPU and then present an MPI based implementation for a heterogeneous high performance cluster. We compare this implementation to MPI and MPI with pthreads based implementations for a homogeneous cluster of x86 processors, discussing the stability vs. the performance and the scaling of the solutions, taking into consideration aspects such as the bandwidth among the different nodes.
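
    For readers unfamiliar with the Aho-Corasick algorithm named in the abstract above, the following is a minimal single-threaded sketch of the textbook version: a trie of patterns, failure links built breadth-first, and a single pass over the text. It makes no attempt to reflect the GPU or MPI partitioning described by the authors, and the function names are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for a set of patterns."""
    goto = [{}]   # goto[state][char] -> next state (state 0 is the root)
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on entering each state
    for pat in patterns:
        if not pat:
            continue  # empty patterns would match everywhere; skip them
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first traversal: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence, in scan order."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches


if __name__ == "__main__":
    print(find_all("ACGTACGT", ["CGT", "GTA"]))
    # -> [(1, 'CGT'), (2, 'GTA'), (5, 'CGT')]
```

    The text is scanned once regardless of how many patterns are loaded, which is why the algorithm suits matching a stream against a large pattern database; the number of reported matches still affects running time, consistent with the performance variability the abstract mentions.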

  2. Cluster analysis applied to multiparameter geophysical dataset

    NASA Astrophysics Data System (ADS)

    Di Giuseppe, M. G.; Troiano, A.; Troise, C.; De Natale, G.

    2012-04-01

    Multi-parameter acquisition is common geophysical field practice nowadays. Seismic velocity and attenuation, gravity and electromagnetic datasets are regularly acquired in a given area to obtain a complete characterization of the investigated features of the subsoil. Such richness of information is often underused, although an integrated analysis could notably improve the imaging of the investigated structures, mostly because the handling of distinct parameters and their joint inversion still presents several severe problems. Post-inversion statistical techniques represent a promising approach to these questions, providing a quick, simple and elegant way to obtain this advantageous but complex integration. We present an approach based on partitioning the analyzed multi-parameter dataset into a number of different classes, identified as localized regions of high correlation. These classes, or 'clusters', are structured in such a way that the observations pertaining to a certain group are more similar to each other than to the observations belonging to a different one, according to an optimal logical criterion. Regions of the subsoil sharing the same physical characteristics are thus identified, without a priori or empirical relationships linking the distinct measured parameters. The retrieved imaging is highly reliable in a statistical sense, specifically because of this lack of external hypotheses, which are instead indispensable in a full joint inversion, where they act, as a matter of fact, as a real constraint on the inversion process, not seldom of limited consistency. We apply our procedure to a number of experimental datasets related to several structures at very different scales present in the Campanian district (southern Italy). These structures range from the shallow evidence of the active fault zone that originated the M 7.9 Irpinia earthquake to the main features characterizing the Campi Flegrei Caldera and the Mt

  3. Cluster analysis of word frequency dynamics

    NASA Astrophysics Data System (ADS)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, it was hypothesized that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  4. Failure Mode Identification Through Clustering Analysis

    NASA Technical Reports Server (NTRS)

    Arunajadai, Srikesh G.; Stone, Robert B.; Tumer, Irem Y.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Research has shown that nearly 80% of the costs and problems are created in product development and that cost and quality are essentially designed into products in the conceptual stage. Currently, failure identification procedures (such as FMEA (Failure Modes and Effects Analysis), FMECA (Failure Modes, Effects and Criticality Analysis) and FTA (Fault Tree Analysis)) and design of experiments are being used for quality control and for the detection of potential failure modes during the detail design stage or post-product launch. Though all of these methods have their own advantages, they do not give information as to what are the predominant failures that a designer should focus on while designing a product. This work uses a functional approach to identify failure modes, which hypothesizes that similarities exist between different failure modes based on the functionality of the product/component. In this paper, a statistical clustering procedure is proposed to retrieve information on the set of predominant failures that a function experiences. The various stages of the methodology are illustrated using a hypothetical design example.

  5. Somatotyping using 3D anthropometry: a cluster analysis.

    PubMed

    Olds, Tim; Daniell, Nathan; Petkov, John; David Stewart, Arthur

    2013-01-01

    Somatotyping is the quantification of human body shape, independent of body size. Hitherto, somatotyping (including the most popular method, the Heath-Carter system) has been based on subjective visual ratings, sometimes supported by surface anthropometry. This study used data derived from three-dimensional (3D) whole-body scans as inputs for cluster analysis to objectively derive clusters of similar body shapes. Twenty-nine dimensions normalised for body size were measured on a purposive sample of 301 adults aged 17-56 years who had been scanned using a Vitus Smart laser scanner. K-means Cluster Analysis with v-fold cross-validation was used to determine shape clusters. Three male and three female clusters emerged, and were visualised using those scans closest to the cluster centroid and a caricature defined by doubling the difference between the average scan and the cluster centroid. The male clusters were decidedly endomorphic (high fatness), ectomorphic (high linearity), and endo-mesomorphic (a mixture of fatness and muscularity). The female clusters were clearly endomorphic, ectomorphic, and the ecto-mesomorphic (a mixture of linearity and muscularity). An objective shape quantification procedure combining 3D scanning and cluster analysis yielded shape clusters strikingly similar to traditional somatotyping.

  6. A hybrid monkey search algorithm for clustering analysis.

    PubMed

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm for clustering analysis, based on the search operator of the artificial bee colony algorithm. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.

  7. Instantaneous normal mode analysis of melting of finite dust clusters.

    PubMed

    Melzer, André; Schella, André; Schablinski, Jan; Block, Dietmar; Piel, Alexander

    2012-06-01

    The experimental melting transition of finite two-dimensional dust clusters in a dusty plasma is analyzed using the method of instantaneous normal modes. In the experiment, dust clusters are heated in a thermodynamic equilibrium from a solid to a liquid state using a four-axis laser manipulation system. The fluid properties of the dust cluster, such as the diffusion constant, are measured from the instantaneous normal mode analysis. Thereby, the phase transition of these finite clusters is approached from the liquid phase. From the diffusion constants, unique melting temperatures have been assigned to dust clusters of various sizes that very well reflect their dynamical stability properties.

  8. ARC: automated resource classifier for agglomerative functional classification of prokaryotic proteins using annotation texts.

    PubMed

    Gnanamani, Muthiah; Kumar, Naveen; Ramachandran, Srinivasan

    2007-08-01

    Functional classification of proteins is central to comparative genomics. The need for algorithms tuned to enable integrative interpretation of analytical data is felt globally. The availability of general, automated software with built-in flexibility will significantly aid this activity. We have prepared ARC (Automated Resource Classifier), an open source software package that meets the user requirements of flexibility. The default classification scheme based on keyword match is agglomerative and directs entries into any of the 7 basic non-overlapping functional classes: Cell wall, Cell membrane and Transporters (C), Cell division (D), Information (I), Translocation (L), Metabolism (M), Stress (R), Signal and communication (S) and 2 ancillary classes: Others (O) and Hypothetical (H). The keyword library of ARC was built serially by first drawing keywords from Bacillus subtilis and Escherichia coli K12. In subsequent steps, this library was further enriched by collecting terms from archaeal representative Archaeoglobus fulgidus, Gene Ontology, and Gene Symbols. ARC is 94.04% successful on 6,75,663 annotated proteins from 348 prokaryotes. Three examples are provided to illuminate the current perspectives on mycobacterial physiology and costs of proteins in 333 prokaryotes. ARC is available at http://arc.igib.res.in.

  9. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    ERIC Educational Resources Information Center

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  10. Using Cluster Analysis for Data Mining in Educational Technology Research

    ERIC Educational Resources Information Center

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  11. A Survey of Popular R Packages for Cluster Analysis

    ERIC Educational Resources Information Center

    Flynt, Abby; Dean, Nema

    2016-01-01

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…

  12. The Psychology of Yoga Practitioners: A Cluster Analysis.

    PubMed

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-03-30

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.

  13. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    PubMed Central

    Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large

  14. Evaluation of Hierarchical Clustering Algorithms for Document Datasets

    DTIC Science & Technology

    2002-06-03

    new class of agglomerative algorithms, in which we introduced intermediate clusters obtained by partitional clustering algorithms to constrain the space ...of the corresponding clusters. The various clustering algorithms that are described in this paper use the vector-space model [26] to represent each...document. In this model, each document d is considered to be a vector in the term-space. In particular, we employed the tf-idf term weighting model
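
    As a rough illustration of the vector-space model with tf-idf weighting mentioned in this excerpt, the following sketch builds unit-length document vectors and compares them by cosine similarity. The function names, the whitespace tokenizer, and the log(N/df) form of the inverse document frequency are assumptions; the report may use a different variant.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse, unit-length tf-idf vector (a dict) per document."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

docs = [
    "cluster analysis of documents",
    "hierarchical cluster analysis",
    "stellar evolution models",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # shares "cluster" and "analysis"
sim_unrelated = cosine(vecs[0], vecs[2])  # no shared terms
```

    Similarities of this kind are the usual input to the agglomerative merging step in document clustering.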

  15. Characterization of population exposure to organochlorines: a cluster analysis application.

    PubMed

    Guimarães, Raphael Mendonça; Asmus, Carmen Ildes Rodrigues Fróes; Burdorf, Alex

    2013-06-01

    This study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticide residues, related to time and exposure dose, were subjected to cluster analysis to separate them into subgroups. We performed hierarchical cluster analysis. To evaluate the classification accuracy, intra-group and inter-group variability were compared by ANOVA for each dimension. The aggregation strategy was Ward's method. Variables associated with exposure and routes of contamination were used to create the clusters. The information on the estimated intake doses of the compounds was used to weight the values of exposure time for each of the routes, so as to obtain proxy values of exposure intensity. The results showed three clusters: cluster 1 (n = 45), with characteristics of greatest exposure; cluster 2 (n = 103), intermediate exposure; and cluster 3 (n = 206), least exposure. The bivariate analyses performed between the groups showed a statistically significant difference. This study demonstrated the applicability of cluster analysis to categorize populations exposed to organochlorines and also points to the relevance of typological studies that may contribute to a better classification of subjects exposed to chemical agents, which is typical of environmental epidemiology studies to a wider understanding of etiological, preventive and therapeutic contamination.

  16. Investigating Subtypes of Child Development: A Comparison of Cluster Analysis and Latent Class Cluster Analysis in Typology Creation

    ERIC Educational Resources Information Center

    DiStefano, Christine; Kamphaus, R. W.

    2006-01-01

    Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…

  17. MASSCLEAN: MASSive CLuster Evolution and ANalysis package -- A new tool for stellar clusters

    NASA Astrophysics Data System (ADS)

    Popescu, Bogdan

    2010-11-01

    Stellar clusters are laboratories for stellar evolution. Their stellar content has a uniform age and chemical composition, but spans a large mass interval. The majority of stars are born in clusters and end up in the general field population. An accurate characterization of stellar clusters could be used to build better models, from stellar evolution to the evolution of an entire galaxy. Even though they are so close, many Milky Way clusters are difficult to observe because they are obscured by the dust in the disk of our Galaxy. The clusters from the Local Group and beyond are too distant, so only their integrated properties could be used most of the time. There is one way to analyze the observational data, to search for clusters, and to describe them: simulations. The MASSCLEAN (MASSive CLuster Evolution and ANalysis) package was developed to provide a better characterization of Galactic clusters, to derive selection effects of current surveys, and to provide information about the extra-galactic clusters. Simulations of known Galactic clusters are used to get better constraints on their parameters, like mass, age, extinction, chemical composition and distance. This is the traditional way to describe the Galactic clusters, fitting the data using the available models. The difference is that MASSCLEAN simulations provide a consistent set of parameters. The majority of extra-galactic clusters are known only from their integrated properties, integrated magnitudes and colors. The current models for stellar populations are available only in the infinite mass limit. But the real clusters have a finite mass, and their integrated colors show a large dispersion (stochastic fluctuations). The description of the variation of integrated colors as a function of mass and age led to the creation of the MASSCLEANcolors database, based on 70 million Monte Carlo simulations. Since the entries in the database form a consistent set of integrated colors, integrated

  18. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    PubMed

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.

  19. Visual cluster analysis and pattern recognition methods

    DOEpatents

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  20. The smart cluster method - Adaptive earthquake cluster identification and analysis in strong seismic regions

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-03-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  1. Distinct clinical phenotypes of airways disease defined by cluster analysis.

    PubMed

    Weatherall, M; Travers, J; Shirtcliffe, P M; Marsh, S E; Williams, M V; Nowitz, M R; Aldington, S; Beasley, R

    2009-10-01

    Airways disease is currently classified using diagnostic labels such as asthma, chronic bronchitis and emphysema. The current definitions of these classifications may not reflect the phenotypes of airways disease in the community, which may have differing disease processes, clinical features or responses to treatment. The aim of the present study was to use cluster analysis to explore clinical phenotypes in a community population with airways disease. A random population sample of 25-75-yr-old adults underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, nitric oxide measurements, blood tests and chest computed tomography. Cluster analysis was performed on the subgroup with current respiratory symptoms or obstructive spirometric results. Subjects with a complete dataset (n = 175) were included in the cluster analysis. Five clusters were identified with the following characteristics: cluster 1: severe and markedly variable airflow obstruction with features of atopic asthma, chronic bronchitis and emphysema; cluster 2: features of emphysema alone; cluster 3: atopic asthma with eosinophilic airways inflammation; cluster 4: mild airflow obstruction without other dominant phenotypic features; and cluster 5: chronic bronchitis in nonsmokers. Five distinct clinical phenotypes of airflow obstruction were identified. If confirmed in other populations, these findings may form the basis of a modified taxonomy for the disorders of airways obstruction.

  2. Correlation analysis of objectively defined galaxy and cluster catalogues

    NASA Astrophysics Data System (ADS)

    Stevenson, P. R. F.; Fong, R.; Shanks, T.

    1988-10-01

    The authors present further galaxy clustering results from the objective COSMOS/UKST galaxy catalogue of Stevenson et al. They first re-examine the results of SSFM for the galaxy correlation function, wgg(θ), testing the stability of the result against possible systematic effects and extending the analysis to larger angular scales. They then use the method of Turner & Gott to automatically detect groups and clusters in these catalogues. The authors next present the cluster-galaxy cross-correlation function wcg. Finally, the above correlation analyses are carried out on simulated galaxy and cluster catalogues.

  3. First CCD UBVI photometric analysis of six open cluster candidates

    NASA Astrophysics Data System (ADS)

    Piatti, A. E.; Clariá, J. J.; Ahumada, A. V.

    2011-04-01

    We have obtained CCD UBVI_KC photometry down to V ~ 22 for the open cluster candidates Haffner 3, Haffner 5, NGC 2368, Haffner 25, Hogg 3 and Hogg 4 and their surrounding fields. None of these objects have been photometrically studied so far. Our analysis shows that these stellar groups are not genuine open clusters since no clear main sequences or other meaningful features can be seen in their colour-magnitude and colour-colour diagrams. We checked for possible differential reddening across the studied fields that could be hiding the characteristics of real open clusters. However, the dust in the directions to these objects appears to be uniformly distributed. Moreover, star counts carried out within and outside the open cluster candidate fields do not support the hypothesis that these objects are real open clusters or even open cluster remnants.

  4. Obstructive Sleep Apnea: A Cluster Analysis at Time of Diagnosis

    PubMed Central

    Grillet, Yves; Richard, Philippe; Stach, Bruno; Vivodtzev, Isabelle; Timsit, Jean-Francois; Lévy, Patrick; Tamisier, Renaud; Pépin, Jean-Louis

    2016-01-01

    Background: The classification of obstructive sleep apnea is on the basis of sleep study criteria that may not adequately capture disease heterogeneity. Improved phenotyping may improve prognosis prediction and help select therapeutic strategies. Objectives: This study used cluster analysis to investigate the clinical clusters of obstructive sleep apnea. Methods: An ascending hierarchical cluster analysis was performed on baseline symptoms, physical examination, risk factor exposure and co-morbidities from 18,263 participants in the OSFP (French national registry of sleep apnea). The probability for criteria to be associated with a given cluster was assessed using odds ratios, determined by univariate logistic regression. Results: Six clusters were identified, in which patients varied considerably in age, sex, symptoms, obesity, co-morbidities and environmental risk factors. The main significant differences between clusters were minimally symptomatic versus sleepy obstructive sleep apnea patients, lean versus obese, and among obese patients different combinations of co-morbidities and environmental risk factors. Conclusions: Our cluster analysis identified six distinct clusters of obstructive sleep apnea. Our findings underscore the high degree of heterogeneity that exists within obstructive sleep apnea patients regarding clinical presentation, risk factors and consequences. This may help in both research and clinical practice for validating new prevention programs, in diagnosis and in decisions regarding therapeutic strategies. PMID:27314230

  5. Mission analysis of clusters of satellites

    NASA Astrophysics Data System (ADS)

    Frayssinhes, Eric; Lansard, Erick

    1996-09-01

    An innovative satellite system that provides high precision localisation of beacon positions consists of a cluster of satellites, i.e. a group of satellites that maintain assigned positions at relatively short distances from each other. Compared to a single satellite, the interest of such a cluster lies in its ability to synthesise antenna bases much longer than those that can be physically mounted on one satellite. Each satellite of the cluster measures the time-of-arrival of the signal transmitted by the beacon. The derived time-differences-of-arrival (TDOA) are processed to estimate the beacon position. At first, this paper summarises the investigations performed on the localisation accuracy that have yielded the optimal cluster geometry. In a previous paper [E. Frayssinhes and E. Lansard, AAS paper 95-334 (1995)], Alcatel Espace has proposed a mathematical formulation relying on a strong analogy with GPS geometrical characterisation of navigation performances. The effects of geometry are expressed by geometric dilution of precision (GDOP) parameters. Such parameters are obtained by solving the TDOA measurement equations for the beacon position using an iterated-least-squares procedure. Then, the paper focuses at the system level on the peculiar problems that arise when such a satellite cluster system is dealt with, and more particularly the launch and early operations phases, the station-keeping strategies of manoeuvres, and the relative localisation and clock synchronisation of the satellites. In particular, it is shown that even with the "civil" C/A GPS measurements, differential techniques can yield respective accuracies better than 5 m r.m.s. and 15 ns r.m.s.

  6. Visual verification and analysis of cluster detection for molecular dynamics.

    PubMed

    Grottel, Sebastian; Reina, Guido; Vrabec, Jadran; Ertl, Thomas

    2007-01-01

    A current research topic in molecular thermodynamics is the condensation of vapor to liquid and the investigation of this process at the molecular level. Condensation is found in many physical phenomena, e.g. the formation of atmospheric clouds or the processes inside steam turbines, where a detailed knowledge of the dynamics of condensation processes will help to optimize energy efficiency and avoid problems with droplets of macroscopic size. The key properties of these processes are the nucleation rate and the critical cluster size. For the calculation of these properties it is essential to make use of a meaningful definition of molecular clusters, which is currently not a completely resolved issue. In this paper a framework capable of interactively visualizing molecular datasets of such nucleation simulations is presented, with an emphasis on the detected molecular clusters. To check the quality of the results of the cluster detection, our framework introduces the concept of flow groups to highlight potential cluster evolution over time which is not detected by the employed algorithm. To confirm the findings of the visual analysis, we coupled the rendering view with a schematic view of the clusters' evolution. This allows the user to rapidly assess the quality of the molecular cluster detection algorithm and to identify locations in the simulation data in space as well as in time where the cluster detection fails. Thus, thermodynamics researchers can eliminate weaknesses in their cluster detection algorithms. Several examples for the effective and efficient usage of our tool are presented.

  7. Automated analysis of organic particles using cluster SIMS

    NASA Astrophysics Data System (ADS)

    Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike

    2004-06-01

    Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF5+ or C8-) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.

  8. Atlas-Guided Cluster Analysis of Large Tractography Datasets

    PubMed Central

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292

  9. Using cluster analysis to organize and explore regional GPS velocities

    USGS Publications Warehouse

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

    2012-01-01

    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.

  10. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  11. A Distributed Flocking Approach for Information Stream Clustering Analysis

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E

    2006-01-01

    Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require a large amount of computational resources and a long time to produce accurate results. It is very difficult to cluster dynamically changing text information streams on an individual computer. Our earlier research resulted in a dynamic reactive flock clustering algorithm that can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.

  12. Clustering rainfall pattern in Malaysia using functional data analysis

    NASA Astrophysics Data System (ADS)

    Hamdan, Muhammad Fauzee; Suhaila, Jamaludin; Jemain, Abdul Aziz

    2015-02-01

    Understanding rainfall patterns is important for planning and prediction in hydrology, meteorology, water planning and agriculture. There are two important features of rainfall: the rainfall amount and the probability of rainfall occurrence. The discrete raw rainfall data were reconstructed into rainfall amount curves using functional data analysis. Hierarchical clustering with the complete-linkage method was used to search for natural groupings of similar rainfall amount curves. The functional clustering revealed four dominant patterns of rainfall amount curves. In addition, the adaptive Neyman test showed that the clusters are significantly different from each other.
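
    The complete-linkage agglomerative step described in this abstract can be sketched in a few lines. This is a naive illustration, not the authors' procedure: the function name is hypothetical, and plain coordinate tuples stand in for the coefficients of the fitted rainfall curves, which the abstract does not specify.

```python
import math

def complete_linkage(points, k):
    """Merge clusters bottom-up until k remain; return lists of point indices.

    Complete linkage: the distance between two clusters is the largest
    pairwise distance between their members.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest complete-linkage distance.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Illustrative data only: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
result = complete_linkage(points, 3)
```

    Recording the merge distance at each step, rather than stopping at a fixed k, would give the full dendrogram from which a number of groups is usually chosen.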

  13. Multivariate Analysis of the Globular Clusters in M87

    NASA Astrophysics Data System (ADS)

    Das, Sukanta; Chattopadhayay, Tanuka; Davoust, Emmanuel

    2015-11-01

    An objective classification of 147 globular clusters (GCs) in the inner region of the giant elliptical galaxy M87 is carried out with the help of two methods of multivariate analysis. First, independent component analysis (ICA) is used to determine a set of independent variables that are linear combinations of various observed parameters (mostly Lick indices) of the GCs. Next, K-means cluster analysis (CA) is applied on the independent components (ICs), to find the optimum number of homogeneous groups having an underlying structure. The properties of the four groups of GCs thus uncovered are used to explain the formation mechanism of the host galaxy. It is suggested that M87 formed in two successive phases. First a monolithic collapse, which gave rise to an inner group of metal-rich clusters with little systematic rotation and an outer group of metal-poor clusters in eccentric orbits. In a second phase, the galaxy accreted low-mass satellites in a dissipationless fashion, from the gas of which the two other groups of GCs formed. Evidence is given for a blue stellar population in the more metal rich clusters, which we interpret by Helium enrichment. Finally, it is found that the clusters of M87 differ in some of their chemical properties (NaD, TiO1, light-element abundances) from GCs in our Galaxy and M31.

  14. IDAS: a Windows based software package for cluster analysis

    NASA Astrophysics Data System (ADS)

    Bondarenko, Igor; Treiger, Boris; Van Grieken, René; Van Espen, Pierre

    1996-03-01

    This article is an electronic publication in Spectrochimica Acta Electronica (SAE), the electronic section of Spectrochimica Acta Part B (SAB). The hardcopy text, comprising the main article and one appendix, is accompanied by two installation diskettes with the software package and data files. The main article discusses the chemometric aspects of the package and explains its purpose. The IDAS software package combines three cluster analysis methods (hierarchical, non-hierarchical and fuzzy) and runs under MS Windows. Modified algorithms for non-hierarchical and fuzzy clusterings are described. The interpretation of the clustering results is facilitated by the extensive use of different types of graph. New approaches to the graphical representation of the results of fuzzy clustering are proposed. Two data sets, the Iris data by Fisher and a data set on the chemical composition of tea, are used to demonstrate the capabilities of the software.

  15. Hierarchical clustering in minimum spanning trees.

    PubMed

    Yu, Meichen; Hillebrand, Arjan; Tewarie, Prejaas; Meier, Jil; van Dijk, Bob; Van Mieghem, Piet; Stam, Cornelis Jan

    2015-02-01

    The identification of clusters or communities in complex networks is a reappearing problem. The minimum spanning tree (MST), the tree connecting all nodes with minimum total weight, is regarded as an important transport backbone of the original weighted graph. We hypothesize that the clustering of the MST reveals insight in the hierarchical structure of weighted graphs. However, existing theories and algorithms have difficulties to define and identify clusters in trees. Here, we first define clustering in trees and then propose a tree agglomerative hierarchical clustering (TAHC) method for the detection of clusters in MSTs. We then demonstrate that the TAHC method can detect clusters in artificial trees, and also in MSTs of weighted social networks, for which the clusters are in agreement with the previously reported clusters of the original weighted networks. Our results therefore not only indicate that clusters can be found in MSTs, but also that the MSTs contain information about the underlying clusters of the original weighted network.
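
    For readers unfamiliar with the structure that the TAHC method operates on, the sketch below builds a minimum spanning tree with Kruskal's algorithm. It shows only the MST construction, not the TAHC clustering itself, whose details are not given in the abstract; the function name and the (weight, u, v) edge format are assumptions.

```python
def minimum_spanning_tree(n, edges):
    """Kruskal's algorithm on nodes 0..n-1; edges are (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:          # the edge joins two components, so it adds no cycle
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# Illustrative weighted graph on four nodes.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 2, 3), (3, 1, 3)]
tree = minimum_spanning_tree(4, edges)
total_weight = sum(w for w, _, _ in tree)
```

    A connected graph on n nodes always yields a tree with n - 1 edges, which is the reduced backbone on which clusters are then sought.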

  16. Hierarchical clustering in minimum spanning trees

    NASA Astrophysics Data System (ADS)

    Yu, Meichen; Hillebrand, Arjan; Tewarie, Prejaas; Meier, Jil; van Dijk, Bob; Van Mieghem, Piet; Stam, Cornelis Jan

    2015-02-01

    The identification of clusters or communities in complex networks is a reappearing problem. The minimum spanning tree (MST), the tree connecting all nodes with minimum total weight, is regarded as an important transport backbone of the original weighted graph. We hypothesize that the clustering of the MST reveals insight in the hierarchical structure of weighted graphs. However, existing theories and algorithms have difficulties to define and identify clusters in trees. Here, we first define clustering in trees and then propose a tree agglomerative hierarchical clustering (TAHC) method for the detection of clusters in MSTs. We then demonstrate that the TAHC method can detect clusters in artificial trees, and also in MSTs of weighted social networks, for which the clusters are in agreement with the previously reported clusters of the original weighted networks. Our results therefore not only indicate that clusters can be found in MSTs, but also that the MSTs contain information about the underlying clusters of the original weighted network.

  17. Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis

    PubMed Central

    Sinyor, Mark; Schaffer, Ayal; Streiner, David L

    2014-01-01

    Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321

  18. An Empirical Analysis of Rough Set Categorical Clustering Techniques

    PubMed Central

    2017-01-01

    Clustering a set of objects into homogeneous groups is a fundamental operation in data mining. Recently, much attention has been paid to categorical data clustering, where data objects are made up of non-numerical attributes. For categorical data clustering, rough set based approaches such as Maximum Dependency Attribute (MDA) and Maximum Significance Attribute (MSA) have outperformed their predecessor approaches such as Bi-Clustering (BC), Total Roughness (TR) and Min-Min Roughness (MMR). This paper presents the limitations and issues of the MDA and MSA techniques on special types of data sets where both techniques fail to select, or face difficulty in selecting, their best clustering attribute. This analysis therefore motivates the need for a better and more generalized rough set theory approach that can cope with the issues of MDA and MSA. Hence, an alternative technique named Maximum Indiscernible Attribute (MIA) for clustering categorical data using rough set indiscernible relations is proposed. The novelty of the proposed approach is that, unlike other rough set theory techniques, it uses the domain knowledge of the data set. It is based on the concept of the indiscernibility relation combined with a number of clusters. To show the significance of the proposed approach, the effects of the number of clusters on rough accuracy, purity and entropy are described in the form of propositions. Moreover, ten different data sets from previously utilized research cases and the UCI repository are used for experiments. The results, produced in tabular and graphical forms, show that the proposed MIA technique provides better performance in selecting the clustering attribute in terms of purity, entropy, iterations, time, accuracy and rough accuracy. PMID:28068344

  19. Multivariate analysis of fatty acid and biochemical constitutes of seaweeds to characterize their potential as bioresource for biofuel and fine chemicals.

    PubMed

    Verma, Priyanka; Kumar, Manoj; Mishra, Girish; Sahoo, Dinabandhu

    2017-02-01

    In the present study, thirty seaweeds from Indian coasts were bioprospected by analyzing their biochemical components, including pigments, fatty acids and ash content. Multivariate analysis of biochemical components and fatty acids was done using Principal Component Analysis (PCA) and Agglomerative hierarchical clustering (AHC) to reveal chemotaxonomic relationships among the various seaweeds. The overall analysis suggests that these seaweeds have multi-functional properties and can be utilized as a promising bioresource for proteins, lipids, pigments and carbohydrates for the food/feed and biofuel industry.

  20. Sun Protection Belief Clusters: Analysis of Amazon Mechanical Turk Data.

    PubMed

    Santiago-Rivas, Marimer; Schnur, Julie B; Jandorf, Lina

    2016-12-01

    This study aimed (i) to determine whether people could be differentiated on the basis of their sun protection belief profiles and individual characteristics and (ii) explore the use of a crowdsourcing web service for the assessment of sun protection beliefs. A sample of 500 adults completed an online survey of sun protection belief items using Amazon Mechanical Turk. A two-phased cluster analysis (i.e., hierarchical and non-hierarchical K-means) was utilized to determine clusters of sun protection barriers and facilitators. Results yielded three distinct clusters of sun protection barriers and three distinct clusters of sun protection facilitators. Significant associations between gender, age, sun sensitivity, and cluster membership were identified. Results also showed an association between barrier and facilitator cluster membership. The results of this study provided a potential alternative approach to developing future sun protection promotion initiatives in the population. Findings add to our knowledge regarding individuals who support, oppose, or are ambivalent toward sun protection and inform intervention research by identifying distinct subtypes that may best benefit from (or have a higher need for) skin cancer prevention efforts.

  1. Bayesian analysis of two stellar populations in Galactic globular clusters- III. Analysis of 30 clusters

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, R.; Stenning, D. C.; Sarajedini, A.; von Hippel, T.; van Dyk, D. A.; Robinson, E.; Stein, N.; Jefferys, W. H.

    2016-12-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic globular clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in carbon, nitrogen, and oxygen are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster and we also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed globular cluster formation scenarios. Additionally, we leverage our Bayesian technique to shed light on the inconsistencies between the theoretical models and the observed data.

  2. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    PubMed Central

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results: Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  3. Mokken Scale Analysis Using Hierarchical Clustering Procedures

    ERIC Educational Resources Information Center

    van Abswoude, Alexandra A. H.; Vermunt, Jeroen K.; Hemker, Bas T.; van der Ark, L. Andries

    2004-01-01

    Mokken scale analysis (MSA) can be used to assess and build unidimensional scales from an item pool that is sensitive to multiple dimensions. These scales satisfy a set of scaling conditions, one of which follows from the model of monotone homogeneity. An important drawback of the MSA program is that the sequential item selection and scale…

  4. Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering

    NASA Astrophysics Data System (ADS)

    Yang, Bo; Fu, Xiao; Sidiropoulos, Nicholas D.

    2017-01-01

    Dimensionality reduction techniques play an essential role in data analytics, signal processing and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well-suited for the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure -- which is otherwise obscured because of the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and discussion of associated convergence properties. Besides extensive simulations, real-world datasets such as Reuters document data and MNIST image data are also employed to showcase the effectiveness of the proposed approaches.

  5. Unsupervised feature relevance analysis applied to improve ECG heartbeat clustering.

    PubMed

    Rodríguez-Sotelo, J L; Peluffo-Ordoñez, D; Cuesta-Frau, D; Castellanos-Domínguez, G

    2012-10-01

    The computer-assisted analysis of biomedical records has become an essential tool in clinical settings. However, current devices provide a growing amount of data that often exceeds the processing capacity of normal computers. As this amount of information rises, new demands for more efficient data extracting methods appear. This paper addresses the task of data mining in physiological records using a feature selection scheme. An unsupervised method based on relevance analysis is described. This scheme uses a least-squares optimization of the input feature matrix in a single iteration. The output of the algorithm is a feature weighting vector. The performance of the method was assessed using a heartbeat clustering test on real ECG records. The quantitative cluster validity measures yielded a correctly classified heartbeat rate of 98.69% (specificity), 85.88% (sensitivity) and 95.04% (general clustering performance), which is even higher than the performance achieved by other similar ECG clustering studies. The number of features was reduced on average from 100 to 18, and the temporal cost was 43% lower than in previous ECG clustering schemes.

  6. Phage cluster relationships identified through single gene analysis

    PubMed Central

    2013-01-01

    Background: Phylogenetic comparison of bacteriophages requires whole genome approaches such as dotplot analysis, genome pairwise maps, and gene content analysis. Currently mycobacteriophages, a highly studied phage group, are categorized into related clusters based on the comparative analysis of whole genome sequences. With the recent explosion of phage isolation, a simple method for phage cluster prediction would facilitate analysis of crude or complex samples without whole genome isolation and sequencing. The hypothesis of this study was that mycobacteriophage-cluster prediction is possible using comparison of a single, ubiquitous, semi-conserved gene. Tape Measure Protein (TMP) was selected to test the hypothesis because it is typically the longest gene in mycobacteriophage genomes and because regions within the TMP gene are conserved. Results: A single gene, TMP, identified the known Mycobacteriophage clusters and subclusters using a Gepard dotplot comparison or a phylogenetic tree constructed from global alignment and maximum likelihood comparisons. Gepard analysis of 247 mycobacteriophage TMP sequences appropriately recovered 98.8% of the subcluster assignments that were made by whole-genome comparison. Subcluster-specific primers within TMP allow for PCR determination of the mycobacteriophage subcluster from DNA samples. Using the single-gene comparison approach for siphovirus coliphages, phage groupings by TMP comparison reflected relationships observed in a whole genome dotplot comparison and confirm the potential utility of this approach to another widely studied group of phages. Conclusions: TMP sequence comparison and PCR results support the hypothesis that a single gene can be used for distinguishing phage cluster and subcluster assignments. TMP single-gene analysis can quickly and accurately aid in mycobacteriophage classification. PMID:23777341

  7. Language Learner Motivational Types: A Cluster Analysis Study

    ERIC Educational Resources Information Center

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  8. Cluster analysis as a prediction tool for pregnancy outcomes.

    PubMed

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering the specific physiological changes during gestation and viewing pregnancy as a "critical window", classification of pregnant women in early pregnancy can be considered crucial. The paper demonstrates the use of cluster analysis, a method based on an approach from intelligent data mining. Cluster analysis is a statistical method that makes it possible to group individuals based on sets of identifying variables. The method was chosen to determine whether pregnant women can be classified in early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. A total of 222 pregnant women from two general obstetric offices were recruited. The main focus was on the characteristics of these women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women in early pregnancy to predict certain outcomes.

  9. Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes

    ERIC Educational Resources Information Center

    Pell, Tony; Hargreaves, Linda

    2011-01-01

    Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of the Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching yet its implications for practice are rarely implemented. It has been subject also to negative…

  10. Large-Scale Graph Processing Analysis using Supercomputer Cluster

    NASA Astrophysics Data System (ADS)

    Vildario, Alfrido; Fitriyani; Nugraha Nurkahfi, Galih

    2017-01-01

    Graph implementations are widely used in various sectors such as automotive, traffic, image processing and many more. They produce large-scale graphs, so processing requires long computation times and high-specification resources. This research addressed the analysis of large-scale graph implementation using a supercomputer cluster. We implemented graph processing using the Breadth-First Search (BFS) algorithm for the single destination shortest path problem. The parallel BFS implementation with Message Passing Interface (MPI) used the supercomputer cluster at the High Performance Computing Laboratory Computational Science Telkom University and the Stanford Large Network Dataset Collection. The results showed that the implementation gives an average speedup of more than 30 times and an efficiency of almost 90%.
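
    To make the terms in this record concrete, the sketch below shows a serial BFS that returns hop distances from a source node, together with the usual definitions of speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of processes). It is a minimal single-machine Python illustration, not the authors' MPI implementation; the toy graph, the timings and the process count are made-up values, since the abstract reports none of them.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return the hop count from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:  # first visit is the shortest path
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel wall-clock time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_processes):
    """Speedup per process; 1.0 would be perfect scaling."""
    return speedup(t_serial, t_parallel) / n_processes


# Hypothetical toy graph and timings, for illustration only.
GRAPH = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
print(bfs_distances(GRAPH, 0))
print(speedup(100.0, 25.0), efficiency(100.0, 25.0, 5))
```

    In a distributed setting each BFS level is typically expanded in parallel across processes, which is where the communication cost that limits efficiency comes from.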

  11. K-means cluster analysis and seismicity partitioning for Pakistan

    NASA Astrophysics Data System (ADS)

    Rehman, Khaista; Burton, Paul W.; Weatherill, Graeme A.

    2014-07-01

    Pakistan and the western Himalaya is a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974 and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard; a crucial input in seismic design codes needed to begin to effectively mitigate the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study utilises a cluster analysis approach for the purposes of identifying spatial differences in seismicity, which can be utilised to form a basis for delineating seismogenic source regions. An effort is made to examine seismicity partitioning for Pakistan with respect to earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude homogenous earthquake catalogue has been compiled using various available earthquake data. 
    The earthquake catalogue covers a time span from 1930 to 2007 and…
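
    The K-means procedure named in this record's title can be sketched as Lloyd's algorithm: assign every point to its nearest centroid, recompute each centroid as the mean of its members, and repeat until nothing moves. The Python sketch below is illustrative only. The function name, the toy coordinates and the fixed starting centroids are assumptions, and plain Euclidean distance on coordinate pairs ignores the Earth's curvature, which a real epicentre analysis would need to handle.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for centroid, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroid)  # keep an empty cluster in place
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters


# Hypothetical coordinates forming two well-separated groups.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
CENTROIDS, CLUSTERS = kmeans(POINTS, [(0, 0), (10, 10)])
print(CLUSTERS)
```

    The result depends on the starting centroids, so in practice the algorithm is usually run from several initialisations and the partition with the lowest within-cluster distance is kept.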

  12. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    PubMed

    Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A

    2015-01-01

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
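
    Read literally, the model described in this abstract can be written as below. The notation is an assumption introduced here for illustration (observation i, gene j, cluster label c(j), K clusters); the abstract does not state distributions, an intercept, or an error term for the outcome, so none are shown.

```latex
\begin{align}
  x_{ij} &= u_{i,c(j)} + \varepsilon_{ij}, \\
  y_i    &= \sum_{k=1}^{K} \beta_k \, u_{ik}
\end{align}
```

    Here x_{ij} is the expression of gene j in observation i, u_{ik} is the random effect specific to observation i and cluster k, \varepsilon_{ij} is the independent error term, y_i is the outcome, and \beta_k are the coefficients of the linear combination.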

  13. Seismicity monitoring by cluster analysis of moment tensors

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Şen, Ali Tolga; Dahm, Torsten

    2014-03-01

    We suggest a new clustering approach to classify focal mechanisms from large moment tensor catalogues, with the purpose of automatically identifying families of earthquakes with similar source geometry, recognizing the orientation of the most active faults, and detecting temporal variations of the rupture processes. The approach differs from waveform similarity methods in that clusters are detected even if they occur at large spatial distances. This approach is particularly helpful for analysing large moment tensor catalogues, as in microseismicity applications, where manual analysis and classification are not feasible. A flexible algorithm is proposed here: it can handle different metrics, norms, and focal mechanism representations. In particular, the method can handle full moment tensor or constrained source model catalogues, for which different metrics are suggested. The method can account for variable uncertainties of different moment tensor components. We verify the method with synthetic catalogues. An application to real data from mining induced seismicity illustrates possible applications of the method and demonstrates the cluster detection and event classification performance with different moment tensor catalogues. Results prove that the main earthquake source types occur on spatially separated faults, and that temporal changes in the number and characterization of focal mechanism clusters are detected. We suggest that moment tensor clustering can help assess time dependent hazard in mines.

  14. CLAG: an unsupervised non hierarchical clustering algorithm handling biological data

    PubMed Central

    2012-01-01

    Background: Searching for similarities in a set of biological data is intrinsically difficult due to possible data points that should not be clustered, or that should group within several clusters. Under these hypotheses, hierarchical agglomerative clustering is not appropriate. Moreover, if the dataset is not sufficiently well known, as is often the case, supervised classification is not appropriate either. Results: CLAG (for CLusters AGgregation) is an unsupervised non hierarchical clustering algorithm designed to cluster a large variety of biological data and to provide a clustered matrix and numerical values indicating cluster strength. CLAG clusters correlation matrices for residues in protein families, gene-expression and miRNA data related to various cancer types, sets of species described by multidimensional vectors of characters, and binary matrices. It does not require all data points to cluster, and it converges, yielding the same result at each run. Its simplicity and speed allow it to run on reasonably large datasets. Conclusions: CLAG can be used to investigate the cluster structure present in biological datasets and to identify its underlying graph. It proved to be more informative and accurate than several known clustering methods, such as hierarchical agglomerative clustering, k-means, fuzzy c-means, model-based clustering, and affinity propagation clustering, and not to suffer from the convergence problem inherent to the latter. PMID:23216858

  15. REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

    SciTech Connect

    Glascoe, L G; Glaser, R E; Chin, H S; Loosmore, G A

    2004-06-17

    The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.

  16. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    PubMed

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  17. The Productivity Analysis of Chennai Automotive Industry Cluster

    NASA Astrophysics Data System (ADS)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is using Data Envelopment Analysis of Output Oriented Banker Charnes Cooper model by taking net worth, fixed assets, employment as inputs and gross output as outputs. The non-zero represents the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets, employment and shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.

  18. Full Text Clustering and Relationship Network Analysis of Biomedical Publications

    PubMed Central

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu

    2014-01-01

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers. PMID:25250864
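
    The cosine coefficient mentioned in this record compares two documents using only their own term weights, so the full high-dimensional sparse matrix never has to be materialised. The Python sketch below illustrates that reading with sparse term-weight dictionaries. It is an interpretation of the abstract, not the authors' implementation; the function name and the sample weights are hypothetical.

```python
import math


def cosine_coefficient(doc_a, doc_b):
    """Cosine similarity of two sparse term -> weight mappings."""
    shared_terms = doc_a.keys() & doc_b.keys()  # only these contribute to the dot product
    dot = sum(doc_a[term] * doc_b[term] for term in shared_terms)
    norm_a = math.sqrt(sum(weight * weight for weight in doc_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in doc_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term weights for three short documents.
DOC_1 = {"gene": 2.0, "cluster": 1.0}
DOC_2 = {"gene": 4.0, "cluster": 2.0}
DOC_3 = {"protein": 3.0}
print(cosine_coefficient(DOC_1, DOC_2), cosine_coefficient(DOC_1, DOC_3))
```

    Documents with proportional weights score close to 1 regardless of length, while documents with no terms in common score 0, which is one reason the measure is common for text.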

  19. The Quantitative Analysis of Chennai Automotive Industry Cluster

    NASA Astrophysics Data System (ADS)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India due to the presence of an automotive industry producing over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after the Cluster Development Approach (CDA) (2008-2009). The methodology adopted is the collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), Friedman Test (FMT), and Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals that there is a high degree of relationship between the variables studied. The RA models constructed establish the strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows that there is no significant difference between the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows that there is no significant difference between industrial units in respect of costs such as Production, Infrastructure, Technology, Marketing and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This CDA model can be implemented in industries of under developed and developing countries for cost reduction and productivity

  20. Bayesian Analysis of Multiple Populations in Galactic Globular Clusters

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan

    2016-01-01

    We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way. Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.

  1. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    PubMed

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences, it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences, who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results were compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and that the similarity between the clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
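
    The contrast between hard and fuzzy assignment described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the two centers, the sample points, and the fuzzifier m = 2 are hypothetical values, and the membership rule is the standard fuzzy c-means formula.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each cluster center.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]

def hard_assignment(point, centers):
    """K-means style assignment: index of the nearest center."""
    dists = [math.dist(point, c) for c in centers]
    return dists.index(min(dists))

# Hypothetical cluster centers and cases (assumed for illustration only).
centers = [(0.0, 0.0), (4.0, 0.0)]
ambiguous = (2.0, 0.0)  # equidistant from both centers
clear = (0.5, 0.0)      # close to the first center

print(hard_assignment(ambiguous, centers), fuzzy_memberships(ambiguous, centers))
print(hard_assignment(clear, centers), fuzzy_memberships(clear, centers))
```

    The hard assignment forces the equidistant case into one cluster, while the fuzzy memberships split it evenly; larger values of m would allow more overlap between clusters.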

  2. An Interpretation of the Boshier-Collins Cluster Analysis Testing Houle's Typology.

    ERIC Educational Resources Information Center

    Furst, Edward J.

    1986-01-01

    This article speculates on an underlying order obscured by the details of the Boshier-Collins cluster analysis and the mapping of Houle's types onto it. A table illustrates an interpretation of cluster analysis on Boshier's Education Participation Scale. (CT)

  3. Analysis of data separation and recovery problems using clustered sparsity

    NASA Astrophysics Data System (ADS)

    King, Emily J.; Kutyniok, Gitta; Zhuang, Xiaosheng

    2011-09-01

    Data often have two or more fundamental components, like cartoon-like and textured elements in images; point, filament, and sheet clusters in astronomical data; and tonal and transient layers in audio signals. For many applications, separating these components is of interest. Another issue in data analysis is that of incomplete data, for example a photograph with scratches or seismic data collected with fewer than necessary sensors. There exists a unified approach to solving these problems which is minimizing the l1 norm of the analysis coefficients with respect to particular frame(s). This approach using the concept of clustered sparsity leads to similar theoretical bounds and results, which are presented here. Furthermore, necessary conditions for the frames to lead to sufficiently good solutions are also shown.

  4. Segment clustering methodology for unsupervised Holter recordings analysis

    NASA Astrophysics Data System (ADS)

    Rodríguez-Sotelo, Jose Luis; Peluffo-Ordoñez, Diego; Castellanos Dominguez, German

    2015-01-01

    Cardiac arrhythmia analysis of Holter recordings is an important issue in clinical settings; however, it implicitly involves other problems related to the large amount of unlabelled data, which entails a high computational cost. In this work, an unsupervised methodology based on a segment framework is presented, which consists of dividing the raw data into a balanced number of segments in order to identify fiducial points and to characterize and cluster the heartbeats in each segment separately. The resulting clusters are merged or split according to an assumed criterion of homogeneity. This framework offsets the high computational cost of Holter analysis, making its implementation possible for future real-time applications. The performance of the method is measured over the records from the MIT/BIH arrhythmia database and achieves high values of sensitivity and specificity, taking advantage of database labels, for a broad range of heartbeat types recommended by the AAMI.

  5. Assessing intraplate volcano compositional similarities with cluster analysis

    NASA Astrophysics Data System (ADS)

    Konter, J. G.

    2012-12-01

    The compositional variation in intraplate volcanoes is commonly assessed as a function of end-members that were recognized as extrema in a 3D space, defined by radiogenic isotope ratios. The specific isotope ratios used are the principal components in the intraplate volcano compositional data set, and by reducing the dimensionality of the data set to 3, groupings and trends in the data can be visually identified. Such groupings can then be used to compare to other geochemical or geophysical data sets (e.g. correlations with seismic models). A complementary approach in examining groupings and trends in a data set is the use of cluster analysis, which can be used to recognize groups of similar intraplate volcanic systems. Since it is not known a priori how many clusters may exist, hierarchical cluster analysis can be used to examine the relationships between individual intraplate volcanic systems. The technique compares the Euclidean distance between the data available at the different locations, and these data can have a large number of dimensions. The results can be visualized as a dendrogram, where individual locations are represented by different branches (or leaves) that join at different distances. We use Matlab to examine the data extracted from pre-compiled GEOROC database files, including location, major elements, large ion lithophile elements, high field strength elements, rare earth elements and radiogenic isotopes. These data do not vary over the same range in values and are therefore first normalized by the total range in the data set for each particular element or isotope ratio. Since multiple samples have been analyzed for most intraplate volcanic systems, we assess the results for the average, the maximum, and the minimum values for each element. In addition, we investigate the robustness of the outcome by removing one element at a time from the data set and recalculating a new dendrogram. One of the outcomes is that the resulting clusters seem to
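
    The workflow described in this record (normalize each variable by its total range, compare locations by Euclidean distance, merge hierarchically) can be sketched as below. This is an illustrative Python sketch rather than the Matlab code the abstract refers to; the sample values are invented, and average linkage is an assumption, since the abstract does not name the linkage rule.

```python
import math

def range_normalize(rows):
    """Scale each column by its total range so all variables are comparable."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against zero range
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def agglomerate(rows):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (cluster_a, cluster_b, distance); original
    rows are clusters 0..n-1 and each merge creates the next cluster id.
    """
    clusters = {i: [i] for i in range(len(rows))}
    merges = []
    next_id = len(rows)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x, a in enumerate(ids):
            for b in ids[x + 1:]:
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(math.dist(rows[i], rows[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges

# Hypothetical data: four locations, two variables on very different scales.
samples = [[0.70, 10.0], [0.71, 12.0], [0.90, 50.0], [0.91, 46.0]]
normalized = range_normalize(samples)
merges = agglomerate(normalized)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
```

    The merge history is the information a dendrogram draws: each tuple is a junction, and its distance is the height at which the two branches join. Dropping one column and rerunning gives the kind of robustness check the abstract mentions.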

  6. The relative vertex clustering value - a new criterion for the fast discovery of functional modules in protein interaction networks

    PubMed Central

    2015-01-01

    Background Cellular processes are known to be modular and are realized by groups of proteins implicated in common biological functions. Such groups of proteins are called functional modules, and many community detection methods have been devised for their discovery from protein interaction network (PIN) data. In current agglomerative clustering approaches, vertices with only a few neighbors are often classified as separate clusters, which does not make sense biologically. Also, a major limitation of agglomerative techniques is that their computational efficiency does not scale well to large PINs. Finally, PIN data obtained from large-scale experiments generally contain many false positives, which makes it hard for agglomerative clustering methods to find the correct clusters, since they are known to be sensitive to noisy data. Results We propose a local similarity premetric, the relative vertex clustering value, as a new criterion for deciding when a node can be added to a given node's cluster; it addresses the above three issues. Based on this criterion, we introduce a novel and very fast agglomerative clustering technique, FAC-PIN, for discovering functional modules and protein complexes from PIN data. Conclusions Our proposed FAC-PIN algorithm is applied to nine PIN data sets from eight different species, including the yeast PIN, and the identified functional modules are validated using Gene Ontology (GO) annotations from DAVID Bioinformatics Resources. Identified protein complexes are also validated using experimentally verified complexes. Computational results show that FAC-PIN can discover functional modules or protein complexes from PINs more accurately and more efficiently than HC-PIN and CNM, the current state-of-the-art approaches for clustering PINs in an agglomerative manner. PMID:25734691

  7. Psychosocial Costs of Racism to Whites: Exploring Patterns through Cluster Analysis

    ERIC Educational Resources Information Center

    Spanierman, Lisa B.; Poteat, V. Paul; Beer, Amanda M.; Armstrong, Patrick Ian

    2006-01-01

    Participants (230 White college students) completed the Psychosocial Costs of Racism to Whites (PCRW) Scale. Using cluster analysis, we identified 5 distinct cluster groups on the basis of PCRW subscale scores: the unempathic and unaware cluster contained the lowest empathy scores; the insensitive and afraid cluster consisted of low empathy and…

  8. ACPA: automated cluster plot analysis of genotype data.

    PubMed

    Schillert, Arne; Schwarz, Daniel F; Vens, Maren; Szymczak, Silke; König, Inke R; Ziegler, Andreas

    2009-12-15

    Genome-wide association studies have become standard in genetic epidemiology. Analyzing hundreds of thousands of markers simultaneously imposes some challenges for statisticians. One issue is the problem of multiplicity, which has been compared with the search for the needle in a haystack. To reduce the number of false-positive findings, a number of quality filters such as exclusion of single-nucleotide polymorphisms (SNPs) with a high missing fraction are employed. Another filter is exclusion of SNPs for which the calling algorithm had difficulties in assigning the genotypes. The only way to do this is the visual inspection of the cluster plots, also termed signal intensity plots, but this approach is often neglected. We developed an algorithm ACPA (automated cluster plot analysis), which performs this task automatically for autosomal SNPs. It is based on counting samples that lie too close to the cluster of a different genotype; SNPs are excluded when a certain threshold is exceeded. We evaluated ACPA using 1,000 randomly selected quality controlled SNPs from the Framingham Heart Study data that were provided for the Genetic Analysis Workshop 16. We compared the decision of ACPA with the decision made by two independent readers. We achieved a sensitivity of 88% (95% CI: 81%-93%) and a specificity of 86% (95% CI: 83%-89%). In a screening setting in which one aims at not losing any good SNP, we achieved 99% (95% CI: 98%-100%) specificity and still detected every second low-quality SNP.

  9. Cluster: Mission Overview and End-of-Life Analysis

    NASA Technical Reports Server (NTRS)

    Pallaschke, S.; Munoz, I.; Rodriquez-Canabal, J.; Sieg, D.; Yde, J. J.

    2007-01-01

    The Cluster mission is part of the scientific programme of the European Space Agency (ESA) and its purpose is the analysis of the Earth's magnetosphere. The Cluster project consists of four satellites. The selected polar orbit has a shape of 4.0 and 19.2 Re which is required for performing measurements near the cusp and the tail of the magnetosphere. When crossing these regions the satellites form a constellation which in most of the cases so far has been a regular tetrahedron. The satellite operations are carried out by the European Space Operations Centre (ESOC) at Darmstadt, Germany. The paper outlines the future orbit evolution and the envisaged operations from a Flight Dynamics point of view. In addition a brief summary of the LEOP and routine operations is included beforehand.

  10. Accident patterns for construction-related workers: a cluster analysis

    NASA Astrophysics Data System (ADS)

    Liao, Chia-Wen; Tyan, Yaw-Yauan

    2011-12-01

    The construction industry has been identified as one of the most hazardous industries. The risk to construction-related workers is far greater than that in a manufacturing-based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  11. Accident patterns for construction-related workers: a cluster analysis

    NASA Astrophysics Data System (ADS)

    Liao, Chia-Wen; Tyan, Yaw-Yauan

    2012-01-01

    The construction industry has been identified as one of the most hazardous industries. The risk to construction-related workers is far greater than that in a manufacturing-based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  12. Clustering Analysis of Fast-ion Driven Instabilities

    NASA Astrophysics Data System (ADS)

    Gresl, J.; Heidbrink, W. W.; Haskey, S.; Blackwell, B. D.

    2016-10-01

    Beam ions often drive Alfvén eigenmodes and other instabilities in DIII-D. Many of these modes have been unambiguously identified, but some frequently occurring features have been neglected. In this work, data-mining analysis techniques that successfully analyzed magnetics data from the H-1NF heliac are applied to arrays of magnetic and electron cyclotron emission (ECE) data from DIII-D. The techniques group instabilities with similar magnetic or ECE features into clusters. Once the clusters are found, a database of plasma parameters will facilitate mode identification. Work supported by the US Department of Energy under DE-FC02-04ER54698, DE-FG03-94ER54271, DE-AC02-09CH11466.

  13. Clustered Numerical Data Analysis Using Markov Lie Monoid Based Networks

    NASA Astrophysics Data System (ADS)

    Johnson, Joseph

    2016-03-01

    We have designed and built an optimal numerical standardization algorithm that links numerical values with their associated units, error level, and defining metadata, thus supporting automated data exchange and new levels of artificial intelligence (AI). The software manages all dimensional and error analysis and computational tracing. Tables of entities versus properties of these generalized numbers (called "metanumbers") support a transformation of each table into a network among the entities and another network among their properties, where the network connection matrix is based upon a proximity metric between the two items. We previously proved that every network is isomorphic to the Lie algebra that generates continuous Markov transformations. We have also shown that the eigenvectors of these Markov matrices provide an agnostic clustering of the underlying patterns. We will present this methodology and show how our new work on conversion of scientific numerical data through this process can reveal underlying information clusters ordered by the eigenvalues. We will also show how the linking of clusters from different tables can be used to form a "supernet" of all numerical information supporting new initiatives in AI.

  14. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering

    PubMed Central

    Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that offers a powerful analytical capacity for gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy to let the samples gather into sets according to the progression of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets to assess the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with David, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering and of different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a significant gene set profile for each sample. PMID:27764138

  15. Covariance analysis of differential drag-based satellite cluster flight

    NASA Astrophysics Data System (ADS)

    Ben-Yaacov, Ohad; Ivantsov, Anatoly; Gurfil, Pini

    2016-06-01

    One possibility for satellite cluster flight is to control relative distances using differential drag. The idea is to increase or decrease the drag acceleration on each satellite by changing its attitude, and use the resulting small differential acceleration as a controller. The most significant advantage of the differential drag concept is that it enables cluster flight without consuming fuel. However, any drag-based control algorithm must cope with significant aerodynamical and mechanical uncertainties. The goal of the current paper is to develop a method for examination of the differential drag-based cluster flight performance in the presence of noise and uncertainties. In particular, the differential drag control law is examined under measurement noise, drag uncertainties, and initial condition-related uncertainties. The method used for uncertainty quantification is the Linear Covariance Analysis, which enables us to propagate the augmented state and filter covariance without propagating the state itself. Validation using a Monte-Carlo simulation is provided. The results show that all uncertainties have relatively small effect on the inter-satellite distance, even in the long term, which validates the robustness of the used differential drag controller.
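
    The core step of a linear covariance analysis, propagating the covariance without propagating the state itself, can be sketched as below. This shows only the basic recursion P <- Phi P Phi^T + Q on a toy two-state model (relative position and relative velocity); the transition matrix, noise levels and time step are assumed for illustration, and the augmented state and filter covariance used in the paper are not reproduced.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def propagate_covariance(p, phi, q, steps):
    """Apply P <- Phi P Phi^T + Q `steps` times; no state vector is needed."""
    phi_t = transpose(phi)
    for _ in range(steps):
        fp = matmul(matmul(phi, p), phi_t)
        p = [[fp[i][j] + q[i][j] for j in range(len(q))] for i in range(len(q))]
    return p

# Hypothetical double-integrator model with time step dt (assumed values).
dt = 1.0
phi = [[1.0, dt], [0.0, 1.0]]    # state transition for [position, velocity]
q = [[0.0, 0.0], [0.0, 0.01]]    # process noise, e.g. uncertain drag acceleration
p0 = [[1.0, 0.0], [0.0, 0.0]]    # initial-condition uncertainty on position

p2 = propagate_covariance(p0, phi, q, 2)
print(p2)
```

    A Monte-Carlo check of such a recursion would sample many noisy trajectories and compare their sample covariance with the propagated matrix, which is the kind of validation the abstract reports.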

  16. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    PubMed

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that offers a powerful analytical capacity for gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy to let the samples gather into sets according to the progression of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets to assess the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with David, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering and of different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a significant gene set profile for each sample.

  17. Common and Cluster-Specific Simultaneous Component Analysis

    PubMed Central

    De Roover, Kim; Timmerman, Marieke E.; Mesquita, Batja; Ceulemans, Eva

    2013-01-01

    In many fields of research, so-called ‘multiblock’ data are collected, i.e., data containing multivariate observations that are nested within higher-level research units (e.g., inhabitants of different countries). Each higher-level unit (e.g., country) then corresponds to a ‘data block’. For such data, it may be interesting to investigate the extent to which the correlation structure of the variables differs between the data blocks. More specifically, when capturing the correlation structure by means of component analysis, one may want to explore which components are common across all data blocks and which components differ across the data blocks. This paper presents a common and cluster-specific simultaneous component method which clusters the data blocks according to their correlation structure and allows for common and cluster-specific components. Model estimation and model selection procedures are described and simulation results validate their performance. Also, the method is applied to data from cross-cultural values research to illustrate its empirical value. PMID:23667463

  18. The REFLEX II galaxy cluster survey: power spectrum analysis

    NASA Astrophysics Data System (ADS)

    Balaguera-Antolínez, A.; Sánchez, Ariel G.; Böhringer, H.; Collins, C.; Guzzo, L.; Phleps, S.

    2011-05-01

    We present the power spectrum of galaxy clusters measured from the new ROSAT-ESO Flux-Limited X-Ray (REFLEX II) galaxy cluster catalogue. This new sample extends the flux limit of the original REFLEX catalogue to 1.8 × 10^-12 erg s^-1 cm^-2, yielding a total of 911 clusters with ≥94 per cent completeness in redshift follow-up. The analysis of the data is improved by creating a set of 100 REFLEX II-catalogue-like mock galaxy cluster catalogues built from a suite of large-volume Λ cold dark matter (ΛCDM) N-body simulations (L-BASICC II). The measured power spectrum is in agreement with the predictions from a ΛCDM cosmological model. The measurements show the expected increase in the amplitude of the power spectrum with increasing X-ray luminosity. On large scales, we show that the shape of the measured power spectrum is compatible with a scale-independent bias and provide a model for the amplitude that allows us to connect our measurements with a cosmological model. By implementing a luminosity-dependent power-spectrum estimator, we observe that the power spectrum measured from the REFLEX II sample is weakly affected by flux-selection effects. The shape of the measured power spectrum is compatible with a featureless power spectrum on scales k > 0.01 h Mpc^-1 and hence no statistically significant signal of baryonic acoustic oscillations can be detected. We show that the measured REFLEX II power spectrum displays signatures of non-linear evolution.

  19. Cluster analysis in systems of magnetic spheres and cubes

    NASA Astrophysics Data System (ADS)

    Pyanzina, E. S.; Gudkova, A. V.; Donaldson, J. G.; Kantorovich, S. S.

    2017-06-01

    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube.
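
    Graph-theory based cluster analysis of a particle configuration can be sketched as follows: particles are vertices, bonded pairs are edges, and clusters are the connected components. The bond rule used here (centre-to-centre distance below a cutoff) and the coordinates are assumptions for illustration; the abstract does not state the bonding criterion used in the simulations.

```python
import math
from collections import defaultdict

def find_clusters(positions, bond_cutoff):
    """Group particles into clusters = connected components of the bond graph."""
    parent = list(range(len(positions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumed bond rule: two particles are bonded if closer than the cutoff.
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= bond_cutoff:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(positions)):
        groups[find(i)].append(i)
    return sorted(groups.values())

# Hypothetical snapshot: a three-particle chain, a dimer, and a free particle.
positions = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (10, 1, 0), (20, 20, 20)]
clusters = find_clusters(positions, bond_cutoff=1.1)
print(clusters)
print([len(c) for c in clusters])  # cluster-size list, the usual summary statistic
```

    Comparing the cluster-size distributions obtained this way for two systems under the same conditions is one simple way to quantify which of them aggregates more.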

  20. Multivariate cluster analysis of forest fire events in Portugal

    NASA Astrophysics Data System (ADS)

    Tonini, Marj; Pereira, Mario; Vega Orozco, Carmen; Parente, Joana

    2015-04-01

    Portugal is one of the major fire-prone European countries, mainly due to its favourable climatic, topographic and vegetation conditions. Compared to the other Mediterranean countries, the number of events registered here from 1980 to the present is the highest; likewise, with respect to burnt area, Portugal is the third most affected country. Portuguese mapped burnt areas are available from the website of the Institute for the Conservation of Nature and Forests (ICNF). This official geodatabase is the result of satellite measurements starting from the year 1990. The spatial information, delivered in shapefile format, provides a detailed description of the shape and size of the area burnt by each fire, while the date/time information related to the fire ignition is restricted to the year of occurrence. In terms of statistical formalism, wildfires can be associated with a stochastic point process, where events are analysed as a set of geographical coordinates corresponding, for example, to the centroid of each burnt area. Analysis of the spatio-temporal pattern of stochastic point processes, including cluster analysis, is a basic procedure for discovering predisposing factors as well as for prevention and forecasting purposes. These kinds of studies are primarily focused on investigating the spatial cluster behaviour of environmental data sequences and/or mapping their distribution at different times. To include both dimensions (space and time), a comprehensive spatio-temporal analysis is needed. In the present study, the authors attempt to verify whether, in the case of wildfires in Portugal, space and time act independently or whether, conversely, neighbouring events are also closer in time. We present an application of the spatio-temporal K-function to a long dataset (1990-2012) of mapped burnt areas. Moreover, the multivariate K-function allowed checking for a possible difference in distribution between small and large fires. The final objective is to elaborate a 3D

  1. Year clustering analysis for modelling olive flowering phenology.

    PubMed

    Oteros, J; García-Mozo, H; Hervás-Martínez, C; Galán, C

    2013-07-01

    It is now widely accepted that weather conditions occurring several months prior to the onset of flowering have a major influence on various aspects of olive reproductive phenology, including flowering intensity. Given the variable characteristics of the Mediterranean climate, we analyse its influence on the registered variations in olive flowering intensity in southern Spain, and relate them to previous climatic parameters using a year-clustering approach, as a first step towards an olive flowering phenology model adapted to different year categories. Phenological data from Cordoba province (Southern Spain) for a 30-year period (1982-2011) were analysed. Meteorological and phenological data were first subjected to both hierarchical and "K-means" clustering analysis, which yielded four year-categories. For this classification purpose, three different models were tested: (1) discriminant analysis; (2) decision-tree analysis; and (3) neural network analysis. Comparison of the results showed that the neural-networks model was the most effective, classifying four different year categories with clearly distinct weather features. Flowering-intensity models were constructed for each year category using the partial least squares regression method. These category-specific models proved to be more effective than general models. They are better suited to the variability of the Mediterranean climate, due to the different response of plants to the same environmental stimuli depending on the previous weather conditions in any given year. The present detailed analysis of the influence of weather patterns of different years on olive phenology will help us to understand the short-term effects of climate change on olive crop in the Mediterranean area that is highly affected by it.

  2. Analysis and Implementation of Graph Clustering for Digital News Using Star Clustering Algorithm

    NASA Astrophysics Data System (ADS)

    Ahdi, A. B.; SW, K. R.; Herdiani, A.

    2017-01-01

    Since the notion of Web 2.0 emerged and came to be used extensively by many Internet services, we have seen an unprecedented proliferation of digital news. Digital news is very rich in terms of content and links to other news and sources but lacks category information. As a result, users cannot easily identify or group the news they read. Digital news is naturally linked data, because every digital news item has a relation or connection with other digital news items or resources. The most appropriate model for linked data is the graph model, which suits this purpose because of its flexibility in describing relations and its easy-to-understand visualization. To handle the grouping issue, we use a graph clustering approach. Many graph clustering algorithms are available, such as MST Clustering, Chameleon, Markov Clustering and Star Clustering. From these options, we chose Star Clustering because the algorithm is easier to understand, more accurate and efficient, and guarantees the quality of the resulting clusters. In this research, we investigate the accuracy of the cluster results by comparing them with expert judgement. We obtained a fairly high accuracy of 80.98%, and a promising cluster quality of 62.87%.
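
    A minimal sketch of the offline star clustering idea follows: threshold a pairwise similarity matrix into a graph, then repeatedly take the highest-degree unmarked vertex as a star centre and attach its neighbours as satellites. The similarity values and the threshold are invented for illustration, and this is a generic rendering of the algorithm rather than the implementation evaluated in this record.

```python
def star_clustering(similarity, threshold):
    """Return a list of (centre, satellites) stars covering every vertex.

    `similarity` is a symmetric matrix; an edge joins i and j when
    similarity[i][j] >= threshold. Satellites may appear in several stars.
    """
    n = len(similarity)
    neighbors = {
        i: {j for j in range(n) if j != i and similarity[i][j] >= threshold}
        for i in range(n)
    }
    marked = set()
    stars = []
    # Visit vertices from highest to lowest degree; ties broken by index.
    for v in sorted(range(n), key=lambda i: (-len(neighbors[i]), i)):
        if v in marked:
            continue  # already covered as a satellite of an earlier star
        stars.append((v, sorted(neighbors[v])))
        marked.add(v)
        marked |= neighbors[v]
    return stars

# Hypothetical similarities between five news items (assumed values).
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.1],
    [0.9, 1.0, 0.3, 0.1, 0.1],
    [0.8, 0.3, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.1, 0.7, 1.0],
]
stars = star_clustering(sim, threshold=0.5)
print(stars)
```

    Because every satellite is directly linked to its centre, each member of a cluster is at least `threshold`-similar to the centre, which is the sense in which the method bounds cluster quality. Accuracy against expert labels would then be computed by comparing these groups with the expert's grouping.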

  3. Time series clustering analysis of health-promoting behavior

    NASA Astrophysics Data System (ADS)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30,638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and has been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.

  4. IPC two-color analysis of x ray galaxy clusters

    NASA Technical Reports Server (NTRS)

    White, Raymond E., III

    1990-01-01

    The mass distributions of several clusters of galaxies were determined by using X-ray surface brightness data from the Einstein Observatory Imaging Proportional Counter (IPC). Determining cluster mass distributions is important for constraining the nature of the dark matter which dominates the mass of galaxies, galaxy clusters, and the Universe. Galaxy clusters are permeated with hot gas in hydrostatic equilibrium with the gravitational potentials of the clusters. Cluster mass distributions can be determined from X-ray observations of cluster gas by using the equation of hydrostatic equilibrium and knowledge of the density and temperature structure of the gas. The X-ray surface brightness at some distance from the cluster center is the result of the volume X-ray emissivity being integrated along the line of sight through the cluster.

  5. The composite sequential clustering technique for analysis of multispectral scanner data

    NASA Technical Reports Server (NTRS)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.

  6. Highlights of the Merging Cluster Collaboration's Analysis of 26 Radio Relic Galaxy Cluster Mergers

    NASA Astrophysics Data System (ADS)

    Dawson, William; Golovich, Nathan; Wittman, David M.; Bradac, Marusa; Brüggen, Marcus; Bullock, James; Elbert, Oliver; Jee, James; Kaplinghat, Manoj; Kim, Stacy; Mahdavi, Andisheh; Merten, Julian; Ng, Karen; Annika, Peter; Rocha, Miguel E.; Sobral, David; Stroe, Andra; Van Weeren, Reinout J.; Merging Cluster Collaboration

    2016-01-01

    Merging galaxy clusters are now recognized as multifaceted probes providing unique insight into the properties of dark matter, the environmental impact of plasma shocks on galaxy evolution, and the physics of high energy particle acceleration. The Merging Cluster Collaboration has used the diffuse radio emission associated with the synchrotron radiation of relativistic particles accelerated by shocks generated during major cluster mergers (i.e. radio relics) to identify a homogeneous sample of 26 galaxy cluster mergers. We have confirmed theoretical expectations that radio relics are predominantly associated with mergers occurring near the plane of the sky and at a relatively common merger phase, making them ideal probes of self-interacting dark matter and eliminating much of the dominant uncertainty when relating the observed star formation rates to the event of the major cluster merger. We will highlight a number of the discovered common traits of this sample as well as detailed measurements of individual mergers.

  7. Principal Component Analysis and Cluster Analysis in Profile of Electrical System

    NASA Astrophysics Data System (ADS)

    Iswan; Garniwa, I.

    2017-03-01

    This paper presents an approach for profiling an electrical system that combines two algorithms, principal component analysis (PCA) and cluster analysis, and is based on relevant data on gross regional domestic product and on electric power and energy use. The profile is set up to show the condition of the region's electrical system and will be used to inform policy on the spatial development of the electrical system in the future. The paper considers 24 regions in South Sulawesi province as profile center points and uses PCA to assess the regional profile for development. Cluster analysis is used to group these regions into a few clusters according to the new variables produced by PCA. The general planning of the electrical system of South Sulawesi province can provide support for policy making on electrical system development. Future research could add several variables to the existing ones.

  8. Investigating Faculty Familiarity with Assessment Terminology by Applying Cluster Analysis to Interpret Survey Data

    ERIC Educational Resources Information Center

    Raker, Jeffrey R.; Holme, Thomas A.

    2014-01-01

    A cluster analysis was conducted with a set of survey data on chemistry faculty familiarity with 13 assessment terms. Cluster groupings suggest a high, middle, and low overall familiarity with the terminology and an independent high and low familiarity with terms related to fundamental statistics. The six resultant clusters were found to be…

  9. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    PubMed

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from a choice of four partitional cluster packages, namely the following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.

  10. The Use of Cluster Analysis in Typological Research on Community College Students

    ERIC Educational Resources Information Center

    Bahr, Peter Riley; Bielby, Rob; House, Emily

    2011-01-01

    One useful and increasingly popular method of classifying students is known commonly as cluster analysis. The variety of techniques that comprise the cluster analytic family are intended to sort observations (for example, students) within a data set into subsets (clusters) that share similar characteristics and differ in meaningful ways from other…

  11. [Clustering Analysis of Hydatid Disease in Gansu Province].

    PubMed

    Yu, Da-wei; Ding, Guo-wu; Hou, Yan-dong; Feng, Yu; Li, Fan

    2015-08-01

    The prevalence of hydatid disease in the human population and livestock, and the positive rate of echinococcal antigen in canine feces, were analyzed with a sample clustering method, using the survey on hydatid disease in 72 counties in Gansu province in the database of the National Survey on Prevalence of Echinococcosis in 2012. The prevalence of hydatid disease in humans and livestock, and the positive rate of echinococcal antigen in canine feces, were 0-1.59%, 0-15.22%, and 0-16.87% respectively. Clustering analysis revealed four types of prevalence in the 72 counties. The first type existed only in Dunhuang city, with the three indicators being 0.27%, 15.22% and 16.87%; the second in four counties, with the three indicators being 0.43%, 6.57% and 1.83%; the third in 22 counties, with the three indicators being 0.22%, 1.15% and 1035%; and the fourth in 45 counties, with the three indicators being 0.16%, 0.58% and 1.69%.

  12. The methodology of multi-viewpoint clustering analysis

    NASA Technical Reports Server (NTRS)

    Mehrotra, Mala; Wild, Chris

    1993-01-01

    One of the greatest challenges facing the software engineering community is the ability to produce large and complex computer systems, such as ground support systems for unmanned scientific missions, that are reliable and cost effective. In order to build and maintain these systems, it is important that the knowledge in the system be suitably abstracted, structured, and otherwise clustered in a manner which facilitates its understanding, manipulation, testing, and utilization. Development of complex mission-critical systems will require the ability to abstract overall concepts in the system at various levels of detail and to consider the system from different points of view. The Multi-ViewPoint Clustering Analysis (MVP-CA) methodology has been developed to provide multiple views of large, complicated systems. MVP-CA provides an ability to discover significant structures by providing an automated mechanism to structure both hierarchically (from detail to abstract) and orthogonally (from different perspectives). We propose to integrate MVP-CA into an overall software engineering life cycle to support the development and evolution of complex mission-critical systems.

  13. Using n-gram analysis to cluster heartbeat signals

    PubMed Central

    2012-01-01

    Background: Biological signals may carry specific characteristics that reflect basic dynamics of the body. In particular, heartbeat signals carry specific signatures that are related to human physiologic mechanisms. In recent years, many researchers have shown that representations which use non-linear symbolic sequences can often reveal much hidden dynamic information. This kind of symbolization has proved useful for predicting life-threatening cardiac diseases. Methods: This paper presents an improved method called the “Adaptive Interbeat Interval Analysis (AIIA) method”. The AIIA method uses the Simple K-Means algorithm for symbolization, which offers a new way to represent subtle variations between two interbeat intervals without human intervention. After symbolization, it uses the n-gram algorithm to generate different kinds of symbolic sequences. Each symbolic sequence stands for a variation phase. Finally, the symbolic sequences are categorized by classic classifiers. Results: In the experiments presented in this paper, the AIIA method achieved 91% (3-gram, 26 clusters) accuracy in classifying patients with Atrial Fibrillation (AF), patients with Congestive Heart Failure (CHF) and healthy people. It also achieved 87% (3-gram, 26 clusters) accuracy in classifying patients with apnea. Conclusions: The two experiments presented in this paper demonstrate that the AIIA method can categorize different heart diseases. Both experiments obtained the best classification results when using the Bayesian Network. For future work, the concept of the AIIA method can be extended to the categorization of other physiological signals. More features can be added to improve the accuracy. PMID:22769567
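
    The symbolization and n-gram steps described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a plain one-dimensional Lloyd-style k-means with a deterministic start stands in for the "Simple K-Means" step, the interbeat intervals are made-up toy values, and k = 3 is used instead of the 26 clusters reported. All function names are hypothetical, and the final classifier stage is omitted.

```python
from collections import Counter


def nearest_center(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means (Lloyd iterations) with a deterministic start.

    Assumes len(values) >= k. Initial centers are spread evenly over the
    sorted values, so symbol 0 is the lowest cluster and k - 1 the highest.
    """
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest_center(v, centers)].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
    return centers


def symbolic_ngrams(intervals, k, n):
    """Symbolize interval-to-interval variations, then count n-grams."""
    variations = [b - a for a, b in zip(intervals, intervals[1:])]
    centers = kmeans_1d(variations, k)
    symbols = [nearest_center(v, centers) for v in variations]
    counts = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    return symbols, counts


# Toy interbeat intervals in seconds (illustrative values, not study data).
intervals = [0.80, 0.82, 0.80, 1.20, 0.80, 0.82, 0.80, 1.20, 0.80]
symbols, counts = symbolic_ngrams(intervals, k=3, n=3)
```

    The n-gram counts could then serve as the feature vector passed to a classifier, which is the role the abstract assigns to the "classic classifiers".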

  14. A Hierarchical Bayesian Procedure for Two-Mode Cluster Analysis

    ERIC Educational Resources Information Center

    DeSarbo, Wayne S.; Fong, Duncan K. H.; Liechty, John; Saxton, M. Kim

    2004-01-01

    This manuscript introduces a new Bayesian finite mixture methodology for the joint clustering of row and column stimuli/objects associated with two-mode asymmetric proximity, dominance, or profile data. That is, common clusters are derived which partition both the row and column stimuli/objects simultaneously into the same derived set of clusters.…

  15. Cluster randomized clinical trials in orthodontics: design, analysis and reporting issues.

    PubMed

    Pandis, Nikolaos; Walsh, Tanya; Polychronopoulou, Argy; Eliades, Theodore

    2013-10-01

    Cluster randomized trials (CRTs) use as the unit of randomization clusters, which are usually defined as a collection of individuals sharing some common characteristics. Common examples of clusters include entire dental practices, hospitals, schools, school classes, villages, and towns. Additionally, several measurements (repeated measurements) taken on the same individual at different time points are also considered to be clusters. In dentistry, CRTs are applicable as patients may be treated as clusters containing several individual teeth. CRTs require certain methodological procedures during sample calculation, randomization, data analysis, and reporting, which are often ignored in dental research publications. In general, due to similarity of the observations within clusters, each individual within a cluster provides less information compared with an individual in a non-clustered trial. Therefore, clustered designs require larger sample sizes compared with non-clustered randomized designs, and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this article to highlight with relevant examples the important methodological characteristics of cluster randomized designs as they may be applied in orthodontics and to explain the problems that may arise if clustered observations are erroneously treated and analysed as independent (non-clustered).
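
    The sample-size consequence described in this abstract is often quantified with the standard design effect, 1 + (m - 1) * ICC, where m is the number of individuals per cluster and ICC is the intracluster correlation coefficient. The sketch below assumes equal cluster sizes; the function names and the numbers in the usage lines are illustrative and are not taken from the article.

```python
import math


def design_effect(cluster_size, icc):
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(n_independent, cluster_size, icc):
    """Inflate a sample size computed for independent observations."""
    return math.ceil(n_independent * design_effect(cluster_size, icc))


# Illustrative numbers only: 5 teeth per patient, ICC of 0.25.
deff = design_effect(cluster_size=5, icc=0.25)
teeth_needed = clustered_sample_size(n_independent=100, cluster_size=5, icc=0.25)
```

    With an ICC of zero the design effect is 1 and no inflation is needed, which matches the non-clustered case the abstract uses as its baseline.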

  16. Independent Component Analysis to Detect Clustered Microcalcification Breast Cancers

    PubMed Central

    Gallardo-Caballero, R.; García-Orellana, C. J.; García-Manso, A.; González-Velasco, H. M.; Macías-Macías, M.

    2012-01-01

    The presence of clustered microcalcifications is one of the earliest signs in breast cancer detection. Although there exist many studies broaching this problem, most of them are nonreproducible due to the use of proprietary image datasets. We use a known subset of the currently largest publicly available mammography database, the Digital Database for Screening Mammography (DDSM), to develop a computer-aided detection system that outperforms the current reproducible studies on the same mammogram set. This proposal is mainly based on the use of extracted image features obtained by independent component analysis, but we also study the inclusion of the patient's age as a nonimage feature which requires no human expertise. Our system achieves an average of 2.55 false positives per image at a sensitivity of 81.8% and 4.45 at a sensitivity of 91.8% in diagnosing the BCRP_CALC_1 subset of DDSM. PMID:22654626

  17. Ultrafast dynamics in atomic clusters: Analysis and control

    PubMed Central

    Bonačić-Koutecký, Vlasta; Mitrić, Roland; Werner, Ute; Wöste, Ludger; Berry, R. Stephen

    2006-01-01

    We present a study of dynamics and ultrafast observables in the frame of pump–probe negative-to-neutral-to-positive ion (NeNePo) spectroscopy illustrated by the examples of bimetallic trimers Ag2Au−/Ag2Au/Ag2Au+ and silver oxides Ag3O2−/Ag3O2/Ag3O2+ in the context of cluster reactivity. First principle multistate adiabatic dynamics allows us to determine time scales of different ultrafast processes and conditions under which these processes can be experimentally observed. Furthermore, we present a strategy for optimal pump–dump control in complex systems based on the ab initio Wigner distribution approach and apply it to tailor laser fields for selective control of the isomerization process in Na3F2. The shapes of pulses can be assigned to underlying processes, and therefore control can be used as a tool for analysis. PMID:16740664

  18. Hierarchical clustering techniques for image database organization and summarization

    NASA Astrophysics Data System (ADS)

    Vellaikal, Asha; Kuo, C.-C. Jay

    1998-10-01

    This paper investigates clustering techniques as a method of organizing image databases to support popular visual management functions such as searching, browsing and navigation. Different types of hierarchical agglomerative clustering techniques are studied as a method of organizing the feature space as well as summarizing image groups by the selection of a few appropriate representatives. Retrieval performance is evaluated using both single-level and multiple-level hierarchies, and the algorithms show an interesting relationship between the top k correct retrievals and the number of comparisons required. Some arguments are given to support the use of such cluster-based techniques for managing distributed image databases.

  19. Archetypal TRMM Radar Profiles Identified Through Cluster Analysis

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis J.

    2003-01-01

    It is widely held that identifiable 'convective regimes' exist in nature, although precise definitions of these are elusive. Examples include land/ocean distinctions, break/monsoon behavior, seasonal differences in the Amazon (SON vs DJF), etc. These regimes are often described by differences in the realized local convective spectra, and measured by various metrics of convective intensity, depth, areal coverage and rainfall amount. Objective regime identification may be valuable in several ways: regimes may serve as natural 'branch points' in satellite retrieval algorithms or data assimilation efforts; one example might be objective identification of regions that 'should' share a similar Z-R relationship. Similarly, objectively defined regimes may provide guidance on optimal siting of ground validation efforts. Objectively defined regimes could also serve as natural (rather than arbitrary geographic) domain 'controls' in studies of convective response to environmental forcing. Quantification of convective vertical structure has traditionally involved parametric study of prescribed quantities thought to be important to convective dynamics: maximum radar reflectivity, cloud top height, 30-35 dBZ echo top height, rain rate, etc. Individually, these parameters are somewhat deficient as their interpretation is often nonunique (the same metric value may signify different physics in different storm realizations). Individual metrics also fail to capture the coherence and interrelationships between vertical levels available in full 3-D radar datasets. An alternative approach is discovery of natural partitions of vertical structure in a globally representative dataset, or 'archetypal' reflectivity profiles. In this study, this is accomplished through cluster analysis of a very large sample (O(10^7)) of TRMM-PR reflectivity columns. Once achieved, the rain-conditional and unconditional 'mix' of archetypal profile types in a given location and/or season provides a description

  20. Outlier Identification in Model-Based Cluster Analysis

    PubMed Central

    Evans, Katie; Love, Tanzy; Thurston, Sally W.

    2015-01-01

    In model-based clustering based on normal-mixture models, a few outlying observations can influence the cluster structure and number. This paper develops a method to identify these; however, it does not attempt to identify clusters amidst a large field of noisy observations. We identify outliers as those observations in a cluster with minimal membership proportion or for which the cluster-specific variance with and without the observation is very different. Results from a simulation study demonstrate the ability of our method to detect true outliers without falsely identifying many non-outliers, and improved performance over other approaches under most scenarios. We use the contributed R package MCLUST for model-based clustering, but propose a modified prior for the cluster-specific variance which avoids degeneracies in estimation procedures. We also compare results from our outlier method to published results on National Hockey League data. PMID:26806993

  1. Cluster analysis application in research on pork quality determinants

    NASA Astrophysics Data System (ADS)

    Przybylski, W.; Wasiewicz, P.; Zieliński, P.; Gromadzka-Ostrowska, J.; Olczak, E.; Jaworska, D.; Niemyjski, S.; Santé-Lhoutellier, V.

    2010-09-01

    In this paper data mining methods were applied to investigate features determining high quality pork meat. The aim of the study was to analyse the determinants of pork meat quality in relation to HDL and LDL cholesterol concentration, plasma leptin, triglycerides, plasma glucose and serum. The research was carried out on 54 pigs originating from the crossbreeding of Naima sows with P76-PenArLan hybrid line boars. Meat quality parameters were evaluated in samples derived from the Longissimus (LD) muscle taken behind the last rib on the basis of the pH value, meat colour, drip loss, the RTN, intramuscular fat and glycolytic potential. The results of this study were elaborated using the R environment and show that cluster and regression analysis can be a useful tool for in-depth analysis of the determinants of the quality of pig meat in homogeneous populations of pigs. However, the question of determinants of the level of glycogen and fat in meat requires further research.

  2. A combined multidimensional scaling and hierarchical clustering view for the exploratory analysis of multidimensional data

    NASA Astrophysics Data System (ADS)

    Craig, Paul; Roa-Seïler, Néna

    2013-01-01

    This paper describes a novel information visualization technique that combines multidimensional scaling and hierarchical clustering to support the exploratory analysis of multidimensional data. The technique displays the results of multidimensional scaling using a scatter plot where the proximity of any two items' representations is approximate to their similarity according to a Euclidean distance metric. The results of hierarchical clustering are overlaid onto this view by drawing smoothed outlines around each nested cluster. The difference in similarity between successive cluster combinations is used to colour code clusters and make stronger natural clusters more prominent in the display. When a cluster or group of items is selected, multidimensional scaling and hierarchical clustering are re-applied to a filtered subset of the data, and animation is used to smooth the transition between successive filtered views. As a case study we demonstrate the technique being used to analyse survey data relating to the appropriateness of different phrases to different emotionally charged situations.

  3. Cluster analysis of intermediate deep events in the southeastern Aegean

    NASA Astrophysics Data System (ADS)

    Ruscic, Marija; Becker, Dirk; Brüstle, Andrea; Meier, Thomas

    2015-04-01

    The Hellenic subduction zone (HSZ) is the seismically most active region in Europe, where the oceanic African lithosphere is subducting beneath the continental Aegean plate. Although there are numerous studies of seismicity in the HSZ, very few focus on the eastern HSZ and the Wadati-Benioff Zone of the subducting slab in that part of the HSZ. In order to gain a better understanding of the geodynamic processes in the region, a dense local seismic network is required. From September 2005 to March 2007, the temporary seismic network EGELADOS was deployed covering the entire HSZ. It consisted of 56 onshore and 23 offshore broadband stations, with the addition of 19 stations from GEOFON, NOA and MedNet to complete the network. Here, we focus on a cluster of intermediate deep seismicity recorded by the EGELADOS network within the subducting African slab in the region of the Nisyros volcano. The cluster consists of 159 events at 80 to 190 km depth with magnitudes between 0.2 and 4.1 that were located using the nonlinear location tool NonLinLoc. A double-difference earthquake relocation using the HypoDD software is performed with both manual readings of onset times and differential traveltimes obtained by separate cross correlation of P- and S-waveforms. Single event locations are compared to relative relocations. The event hypocenters fall into a thin zone close to the top of the slab, defining its geometry with an accuracy of a few kilometers. At intermediate depth the slab is dipping towards the NW at an angle of about 30°, which means it is dipping more steeply than in the western part of the HSZ. The edge of the slab is clearly defined by an abrupt disappearance of intermediate-depth seismicity towards the NE. It is found approximately beneath the Turkish coastline. Furthermore, results of a cluster analysis based on the cross correlation of three-component waveforms are shown as a function of frequency, and the spatio-temporal migration of the seismic activity is analysed.

  4. AVES: A Computer Cluster System approach for INTEGRAL Scientific Analysis

    NASA Astrophysics Data System (ADS)

    Federici, M.; Martino, B. L.; Natalucci, L.; Umbertini, P.

    The AVES computing system, based on a "cluster" architecture, is a fully integrated, low-cost computing facility dedicated to the archiving and analysis of INTEGRAL data. AVES is a modular system that uses the SLURM software resource manager and allows almost unlimited expandability (65,536 nodes and hundreds of thousands of processors). It currently consists of 30 personal computers with quad-core CPUs, able to reach a computing power of 300 gigaflops (300 × 10^9 floating point operations per second), with 120 GB of RAM and 7.5 terabytes (TB) of storage in UFS configuration plus 6 TB for the users' area. AVES was designed and built to solve the growing problems raised by the analysis of the large amount of data accumulated by the INTEGRAL mission (currently about 9 TB), which is set to increase every year. The analysis software used is the OSA package, distributed by the ISDC in Geneva. This is a very complex package consisting of dozens of programs that cannot be converted to parallel computing. To overcome this limitation we developed a series of programs that distribute the analysis workload across the nodes, so that AVES automatically divides the analysis into N jobs sent to N cores. This solution produces a result similar to that obtained with a parallel computing configuration. In support of this we have developed tools that allow flexible use of the scientific software and quality control of on-line data storage. The AVES software package consists of about 50 specific programs. As a result, overall computing time has improved by up to a factor of 70 compared with a personal computer with a single processor.
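
    The idea of dividing one analysis into N independent jobs for N cores can be illustrated with a small sketch. It is not the AVES software and does not use any SLURM interface; the function name, the round-robin assignment and the observation labels are hypothetical and serve only to show the partitioning step.

```python
def split_into_jobs(work_units, n_jobs):
    """Distribute independent work units over n_jobs job lists, round-robin.

    Each returned list could be handed to one core as a separate serial run
    of the (non-parallel) analysis software. Empty jobs are dropped when
    there are fewer work units than jobs.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    jobs = [[] for _ in range(n_jobs)]
    for index, unit in enumerate(work_units):
        jobs[index % n_jobs].append(unit)
    return [job for job in jobs if job]


# Hypothetical observation labels; real inputs would be INTEGRAL data sets.
observations = ["obs_%02d" % i for i in range(10)]
jobs = split_into_jobs(observations, n_jobs=4)
```

    Round-robin assignment keeps job sizes within one unit of each other, which is a reasonable default when the cost of each unit is unknown in advance.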

  5. Adapting Spectral Co-clustering to Documents and Terms Using Latent Semantic Analysis

    NASA Astrophysics Data System (ADS)

    Park, Laurence A. F.; Leckie, Christopher A.; Ramamohanarao, Kotagiri; Bezdek, James C.

    Spectral co-clustering is a generic method of computing co-clusters of relational data, such as sets of documents and their terms. Latent semantic analysis is a method of document and term smoothing that can assist in the information retrieval process. In this article we examine the process behind spectral clustering for documents and terms, and compare it to Latent Semantic Analysis. We show that both spectral co-clustering and LSA follow the same process, using different normalisation schemes and metrics. By combining the properties of the two co-clustering methods, we obtain an improved co-clustering method for document-term relational data that provides an increase in the cluster quality of 33.0%.

  6. Combined clustering models for the analysis of gene expression

    SciTech Connect

    Angelova, M.; Ellman, J.

    2010-02-15

    Clustering has become one of the fundamental tools for analyzing gene expression and producing gene classifications. Clustering models enable finding patterns of similarity in order to understand gene function, gene regulation, cellular processes and sub-types of cells. The clustering results however have to be combined with sequence data or knowledge about gene functionality in order to make biologically meaningful conclusions. In this work, we explore a new model that integrates gene expression with sequence or text information.

  7. Study on Cluster Analysis Used with Laser-Induced Breakdown Spectroscopy

    NASA Astrophysics Data System (ADS)

    He, Li'ao; Wang, Qianqian; Zhao, Yu; Liu, Li; Peng, Zhong

    2016-06-01

    Supervised learning methods (e.g., PLS-DA, SVM, etc.) have been widely used with laser-induced breakdown spectroscopy (LIBS) to classify materials; however, they may yield a low correct classification rate if a test sample type is not included in the training dataset. In this paper, unsupervised cluster analysis methods (hierarchical clustering analysis, K-means clustering analysis, and the iterative self-organizing data analysis technique, ISODATA) are investigated for plastics classification based on the line intensities of LIBS emission. The results of hierarchical clustering analysis using four different similarity measuring methods (single linkage, complete linkage, unweighted pair-group average, and weighted pair-group average) are compared. In K-means clustering analysis, four methods of choosing initial centers are applied in our case and their results are compared. The classification results of hierarchical clustering analysis, K-means clustering analysis, and ISODATA are analyzed. The experimental results demonstrate that cluster analysis methods can be applied to plastics discrimination with LIBS. Supported by Beijing Natural Science Foundation of China (No. 4132063).
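
    How the choice of linkage changes a hierarchical clustering result can be seen in a minimal sketch. It assumes one-dimensional integer data in place of LIBS line intensities, uses a naive search over all cluster pairs, and covers only three of the four linkages named in the abstract (single, complete, and unweighted pair-group average); the function name and the data are illustrative.

```python
def agglomerate(points, n_clusters, linkage="single"):
    """Naive agglomerative clustering of 1-D points down to n_clusters."""

    def cluster_distance(a, b):
        pairwise = [abs(x - y) for x in a for y in b]
        if linkage == "single":      # closest pair of members
            return min(pairwise)
        if linkage == "complete":    # farthest pair of members
            return max(pairwise)
        if linkage == "average":     # unweighted pair-group average
            return sum(pairwise) / len(pairwise)
        raise ValueError("unknown linkage: %s" % linkage)

    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters under the chosen linkage.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data with steadily growing gaps (5, 6, 7, 8) between neighbors.
data = [0, 5, 11, 18, 26]
single = agglomerate(data, 2, "single")
complete = agglomerate(data, 2, "complete")
average = agglomerate(data, 2, "average")
```

    On this toy data single linkage chains four points together and leaves the last one alone, while complete and average linkage split the points into two more balanced groups, which is the kind of difference a linkage comparison is meant to expose.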

  8. Leukaemia clusters in childhood: geographical analysis in Britain.

    PubMed Central

    Knox, E G

    1994-01-01

    STUDY OBJECTIVE--To validate previously demonstrated spatial clustering of childhood leukaemias by showing relative proximities of selected map features to cluster locations, compared with control locations. If clusters are real, then they are likely to be close to a determining hazard. DESIGN--Cluster postcode loci and partially matched control postcodes were compared in terms of distances to railways, main roads, churches, surface water, woodland areas, and railside industrial installations. Further supporting comparisons between non-clustered cases and random postcode controls with those map features representable as single grid points were made. SETTING--England, Wales, and Scotland 1966-83. SUBJECTS--Grid referenced registrations of 9406 childhood leukaemias and non-Hodgkin's lymphomas, including 264 pairs (or more) separated by < 150 m, and grid references of random postcodes in equal numbers. MAIN RESULTS--The 264 clusters showed relative proximities (or the inverse) to several map features, of which the most powerful was an association with railways. The non-railway associations seemed to be statistically indirect. Some railside industrial installations, identified from a railway atlas, also showed relative proximities to leukaemia clusters, as well as to non-clustered cases, but did not "explain" the railway effect. These installations, with seemingly independent geographical associations, included oil refineries, petrochemical plants, oil storage and oil distribution depots, power stations, and steelworks. CONCLUSIONS--The previously shown childhood leukaemia clusters are confirmed to be non-random through their systematic associations with certain map features when compared with the control locations. The common patterns of close association of clustered and non-clustered cases imply a common aetiological component arising from a common environmental hazard--namely the use of fossil fuels, especially petroleum. PMID:7964336

  9. Segmenting Business Students Using Cluster Analysis Applied to Student Satisfaction Survey Results

    ERIC Educational Resources Information Center

    Gibson, Allen

    2009-01-01

    This paper demonstrates a new application of cluster analysis to segment business school students according to their degree of satisfaction with various aspects of the academic program. The resulting clusters provide additional insight into drivers of student satisfaction that are not evident from analysis of the responses of the student body as a…

  10. Cluster Analysis of the Luria-Nebraska Neuropsychological Battery with Learning Disabled Adults.

    ERIC Educational Resources Information Center

    McCue, Michael; And Others

    The study reports a cluster analysis of Luria-Nebraska Neuropsychological Battery scores of 25 learning disabled adults. The cluster analysis suggested the presence of three subgroups within this sample, one having high elevations on the Rhythm, Writing, Reading, and Arithmetic Rhythm scales, the second having an extremely high elevation on the…

  11. Cluster Analysis as a Method of Recovering Types of Intraindividual Growth Trajectories: A Monte Carlo Study.

    ERIC Educational Resources Information Center

    Dumenci, Levent; Windle, Michael

    2001-01-01

    Used Monte Carlo methods to evaluate the adequacy of cluster analysis to recover group membership based on simulated latent growth curve (LGC) models. Cluster analysis failed to recover growth subtypes adequately when the difference between growth curves was shape only. Discusses circumstances under which it was more successful. (SLD)

  12. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    ERIC Educational Resources Information Center

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  13. Investigating Regional Disparities of China's Human Development with Cluster Analysis: A Historical Perspective

    ERIC Educational Resources Information Center

    Yang, Yongheng; Hu, Angang

    2008-01-01

    This paper adopts both one-dimensional and multi-dimensional cluster analysis to analyze China's HDI data for 1982, 1995, 1999, and 2003, and to classify China's provinces into four tiers based on the three basic developmental aspects embedded in HDI. The classifications by cluster analysis depend on the observations' similarities with respect to…

  14. Multilevel Analysis Methods for Partially Nested Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Sanders, Elizabeth A.

    2011-01-01

    This paper explores multilevel modeling approaches for 2-group randomized experiments in which a treatment condition involving clusters of individuals is compared to a control condition involving only ungrouped individuals, otherwise known as partially nested cluster randomized designs (PNCRTs). Strategies for comparing groups from a PNCRT in the…

  15. Alternatives to Multilevel Modeling for the Analysis of Clustered Data

    ERIC Educational Resources Information Center

    Huang, Francis L.

    2016-01-01

    Multilevel modeling has grown in use over the years as a way to deal with the nonindependent nature of observations found in clustered data. However, other alternatives to multilevel modeling are available that can account for observations nested within clusters, including the use of Taylor series linearization for variance estimation, the design…

  16. Two worlds collide: Image analysis methods for quantifying structural variation in cluster molecular dynamics

    SciTech Connect

    Steenbergen, K. G.; Gaston, N.

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.

  17. Boundaries, links and clusters: a new paradigm in spatial analysis?

    PubMed Central

    Jacquez, Geoff M.; Kaufmann, Andy; Goovaerts, Pierre

    2008-01-01

    This paper develops and applies new techniques for the simultaneous detection of boundaries and clusters within a probabilistic framework. The new statistic “little b” (written bij) evaluates boundaries between adjacent areas with different values, as well as links between adjacent areas with similar values. Clusters of high values (hotspots) and low values (coldspots) are then constructed by joining areas abutting locations that are significantly high (e.g., an unusually high disease rate) and that are connected through a “link” such that the values in the adjoining areas are not significantly different. Two techniques are proposed and evaluated for accomplishing cluster construction: “big B” and the “ladder” approach. We compare the statistical power and empirical Type I and Type II error of these approaches to those of wombling and the local Moran test. Significance may be evaluated using distribution theory based on the product of two continuous (i.e., non-discrete) variables. We also provide a “distribution free” algorithm based on resampling of the observed values. The methods are applied to simulated data for which the locations of boundaries and clusters are known, and compared and contrasted with clusters found using the local Moran statistic and with polygon Womble boundaries. The little b approach to boundary detection is comparable to polygon wombling in terms of Type I error, Type II error and empirical statistical power. For cluster detection, both the big B and ladder approaches have lower Type I and Type II error and are more powerful than the local Moran statistic. The new methods are not constrained to find clusters of a pre-specified shape, such as circles, ellipses and donuts, and yield a more accurate description of geographic variation than alternative cluster tests that presuppose a specific cluster shape. We recommend these techniques over existing cluster and boundary detection methods that do not provide such a

  18. Boundaries, links and clusters: a new paradigm in spatial analysis?

    PubMed

    Jacquez, Geoff M; Kaufmann, Andy; Goovaerts, Pierre

    2008-12-01

    This paper develops and applies new techniques for the simultaneous detection of boundaries and clusters within a probabilistic framework. The new statistic "little b" (written b(ij)) evaluates boundaries between adjacent areas with different values, as well as links between adjacent areas with similar values. Clusters of high values (hotspots) and low values (coldspots) are then constructed by joining areas abutting locations that are significantly high (e.g., an unusually high disease rate) and that are connected through a "link" such that the values in the adjoining areas are not significantly different. Two techniques are proposed and evaluated for accomplishing cluster construction: "big B" and the "ladder" approach. We compare the statistical power and empirical Type I and Type II error of these approaches to those of wombling and the local Moran test. Significance may be evaluated using distribution theory based on the product of two continuous (i.e., non-discrete) variables. We also provide a "distribution free" algorithm based on resampling of the observed values. The methods are applied to simulated data for which the locations of boundaries and clusters are known, and compared and contrasted with clusters found using the local Moran statistic and with polygon Womble boundaries. The little b approach to boundary detection is comparable to polygon wombling in terms of Type I error, Type II error and empirical statistical power. For cluster detection, both the big B and ladder approaches have lower Type I and Type II error and are more powerful than the local Moran statistic. The new methods are not constrained to find clusters of a pre-specified shape, such as circles, ellipses and donuts, and yield a more accurate description of geographic variation than alternative cluster tests that presuppose a specific cluster shape. We recommend these techniques over existing cluster and boundary detection methods that do not provide such a comprehensive description

  19. Fully Automated Operational Modal Analysis using multi-stage clustering

    NASA Astrophysics Data System (ADS)

    Neu, Eugen; Janser, Frank; Khatibi, Akbar A.; Orifici, Adrian C.

    2017-02-01

    Interest in robust automatic modal parameter extraction techniques has increased significantly in recent years, together with the rising demand for continuous health monitoring of critical infrastructure like bridges, buildings and wind turbine blades. In this study a novel, multi-stage clustering approach for Automated Operational Modal Analysis (AOMA) is introduced. In contrast to existing approaches, the procedure works without any user-provided thresholds, is applicable within large system order ranges, can be used with very small sensor numbers and does not place any limitations on the damping ratio or the complexity of the system under investigation. The approach works with any parametric system identification algorithm that uses the system order n as sole parameter. Here a data-driven Stochastic Subspace Identification (SSI) method is used. Measurements from a wind tunnel investigation with a composite cantilever equipped with Fiber Bragg Grating Sensors (FBGSs) and piezoelectric sensors are used to assess the performance of the algorithm with a highly damped structure and low signal to noise ratio conditions. The proposed method was able to identify all physical system modes in the investigated frequency range from over 1000 individual datasets using FBGSs under challenging signal to noise ratio conditions and under better signal conditions but from only two sensors.

  20. Subsets in psoriatic arthritis formed by cluster analysis.

    PubMed

    Koó, T; Nagy, Z; Seszták, M; Ujfalussy, I; Merétey, K; Böhm, U; Forgács, S; Szilágyi, M; Czirják, L; Farkas, V

    2001-01-01

    The aim of the study was to create subgroups among psoriatic arthritis patients on the basis of dermatological features, clinical pattern of arthritis, and laboratory, immunological and radiological findings. Data on 100 patients were expressed in a standardised form and entered into hierarchical cluster analysis according to Ward's method. Seven subgroups were created. Fifty-six patients with mild psoriasis were sorted into a 'polyarticular group'. Two 'RA-like groups' were formed, differing from each other serologically and in axial involvement. In an 'oligoarticular group' (18 patients) serious skin disease and female gender predominancy were found to be characteristic. Eight patients with polyarticular arthritis were assigned to an 'erythrodermal group', in which polyarticular arthritis, mutilating, severe arthritis and a history of erythroderma were characteristic. Close to this group on the dendrogram eight women were sorted into a 'distal form'. Sausage fingers were frequent, and nail dystrophy was present in every case. In a 'pustular group' (three patients) the different type of skin involvement was considered and nail dystrophy was common. In the newly created subgroups not only the arthritic status, but also the type of the skin disease, played a determining role.
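
    The record above describes standardising the patient data and then applying hierarchical clustering with Ward's method. A minimal sketch of that general procedure follows, using only the Python standard library. The function names, the z-score standardisation, and the toy data are assumptions for illustration; the study's actual variables and software are not given in the abstract.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Convert each column to z-scores (an assumed form of standardisation)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def ward_clusters(rows, k):
    """Naive Ward agglomeration: repeatedly merge the pair of clusters whose
    merge increases the within-cluster sum of squares the least."""
    clusters = [([i], list(r)) for i, r in enumerate(rows)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Ward merge cost for two clusters with these sizes and centroids
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        n = len(ma) + len(mb)
        merged = (ma + mb, [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical toy data: five "patients", two features on different scales.
toy = [[1.0, 10.0], [1.2, 11.0], [8.0, 50.0], [8.1, 52.0], [0.9, 9.0]]
groups = ward_clusters(standardise(toy), k=2)
```

    Standardising first keeps a feature with a large numeric range from dominating the squared distances that Ward's criterion minimises. The pairwise search is cubic in the number of records, which is acceptable for a sample of about 100 but not for large data sets.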

  1. Cluster analysis of Scedosporium boydii infections in a single hospital.

    PubMed

    Bernhardt, Anne; Seibold, Michael; Rickerts, Volker; Tintelnot, Kathrin

    2015-10-01

    Scedosporiosis is a rare, but often fatal mycotic infection occurring in immunosuppressed as well as in immunocompetent patients. Over a period of 14 months, Scedosporium boydii isolates were sent to our reference laboratory from six immunocompetent patients treated at a single hospital in Germany. In analogy to the EORTC/MSG criteria, four patients were classified as proven invasive scedosporiosis cases, and two patients as probable or possible cases. Of note, in five patients scedosporiosis was diagnosed between 1 and 14 months (median 5.0 months) after cardiac surgery. Despite antimycotic treatment two patients died, and three were lost for long-term follow-up. All clinical S. boydii isolates were characterized by molecular analysis using multilocus sequence typing (MLST). An identical MLST type was found in five patients who had been treated in the surgery unit, suggesting a link between these infections. The source of S. boydii has not been identified. Within an observation period of 2 years before and after this cluster of infections no further cases of scedosporiosis were reported from this hospital.

  2. Considerations on the thermal performances of ribbed channels by means of a novel dynamic method for hierarchical clustering

    NASA Astrophysics Data System (ADS)

    Niro, A.; Fustinoni, D.; Vignati, F.; Gramazio, P.; Ciminà, S.

    2016-09-01

    The investigation of ribbed surfaces for the enhancement of heat transfer in forced convection has shown that different geometries may lead to comparable performances. Due to the lack of an underlying structure of the data, a novel method for data clustering is introduced here, to assess to what extent comparable performances can be achieved using different rib geometries. The clustering method is an agglomerative technique, based on the inclusion of each configuration in another one's bounding box, whose size depends dynamically on the Nusselt number and the pumping power. The method is applied to a large database experimentally obtained at ThermALab of Politecnico di Milano, in order to identify the Nusselt number and the friction factor for diverse-rib configurations in a large-aspect ratio channel with low-Reynolds flows. The clusters are determined, and the resulting families of configurations are used to assess the possible effects of the rib geometry on the thermal and fluid-dynamic performances. The clustering analysis results suggest interesting considerations.

  3. Topological Analysis of Emerging Bipole Clusters Producing Violent Solar Events

    NASA Astrophysics Data System (ADS)

    Mandrini, C. H.; Schmieder, B.; Démoulin, P.; Guo, Y.; Cristiani, G. D.

    2014-06-01

    During the rising phase of Solar Cycle 24 tremendous activity occurred on the Sun with rapid and compact emergence of magnetic flux leading to bursts of flares (C to M and even X-class). We investigate the violent events occurring in the cluster of two active regions (ARs), NOAA numbers 11121 and 11123, observed in November 2010 with instruments onboard the Solar Dynamics Observatory and from Earth. Within one day the total magnetic flux increased by 70 % with the emergence of new groups of bipoles in AR 11123. From all the events on 11 November, we study, in particular, the ones starting at around 07:16 UT in GOES soft X-ray data and the brightenings preceding them. A magnetic-field topological analysis indicates the presence of null points, associated separatrices, and quasi-separatrix layers (QSLs) where magnetic reconnection is prone to occur. The presence of null points is confirmed by a linear and a non-linear force-free magnetic-field model. Their locations and general characteristics are similar in both modelling approaches, which supports their robustness. However, in order to explain the full extension of the analysed event brightenings, which are not restricted to the photospheric traces of the null separatrices, we compute the locations of QSLs. Based on this more complete topological analysis, we propose a scenario to explain the origin of a low-energy event preceding a filament eruption, which is accompanied by a two-ribbon flare, and a consecutive confined flare in AR 11123. The results of our topology computation can also explain the locations of flare ribbons in two other events, one preceding and one following the ones at 07:16 UT. Finally, this study provides further examples where flare-ribbon locations can be explained when compared to QSLs and only, partially, when using separatrices.

  4. Modest validity and fair reproducibility of dietary patterns derived by cluster analysis.

    PubMed

    Funtikova, Anna N; Benítez-Arciniega, Alejandra A; Fitó, Montserrat; Schröder, Helmut

    2015-03-01

    Cluster analysis is widely used to analyze dietary patterns. We aimed to analyze the validity and reproducibility of the dietary patterns defined by cluster analysis derived from a food frequency questionnaire (FFQ). We hypothesized that the dietary patterns derived by cluster analysis have fair to modest reproducibility and validity. Dietary data were collected from 107 individuals from a population-based survey, by an FFQ at baseline (FFQ1) and after 1 year (FFQ2), and by twelve 24-hour dietary recalls (24-HDR). Repeatability and validity were measured by comparing clusters obtained by the FFQ1 and FFQ2 and by the FFQ2 and 24-HDR (reference method), respectively. Cluster analysis identified a "fruits & vegetables" and a "meat" pattern in each dietary data source. Cluster membership was concordant for 66.7% of participants in FFQ1 and FFQ2 (reproducibility), and for 67.0% in FFQ2 and 24-HDR (validity). Spearman correlation analysis showed reasonable reproducibility, especially in the "fruits & vegetables" pattern, and lower validity also especially in the "fruits & vegetables" pattern. κ statistic revealed a fair validity and reproducibility of clusters. Our findings indicate a reasonable reproducibility and fair to modest validity of dietary patterns derived by cluster analysis.
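
    The record above reports both the percentage of concordant cluster memberships and a κ statistic. As an illustration of how these two quantities relate, here is a minimal sketch of percent agreement and Cohen's κ for two label lists. The function name and the toy labels are hypothetical, and the sketch assumes the cluster labels from the two data sources have already been matched to each other (for example, "meat" in one source corresponds to "meat" in the other).

```python
from collections import Counter


def agreement_and_kappa(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two aligned label lists."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance if the two assignments were independent
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    # Kappa is undefined when chance agreement is 1; return 1.0 in that case
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return observed, kappa


# Hypothetical cluster memberships for six participants from two instruments.
ffq1 = ["fv", "fv", "meat", "meat", "fv", "meat"]
ffq2 = ["fv", "meat", "meat", "meat", "fv", "fv"]
observed, kappa = agreement_and_kappa(ffq1, ffq2)
```

    With two equally sized clusters, chance agreement alone is 50%, so an observed concordance near two thirds corresponds to a κ of about one third, which is in the range conventionally described as "fair".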

  5. Poisson approach to clustering analysis of regulatory sequences.

    PubMed

    Wang, Haiying; Zheng, Huiru; Hu, Jinglu

    2008-01-01

    The presence of similar patterns in regulatory sequences may aid users in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a log likelihood ratio statistics-based distance measure to calculate pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, the incorporation of the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in the process of regulatory sequence-based classification of genes.

  6. "Follow the leader": a centrality guided clustering and its application to social network analysis.

    PubMed

    Wu, Qin; Qi, Xingqin; Fuller, Eddie; Zhang, Cun-Quan

    2013-01-01

    Within graph theory and network analysis, the centrality of a vertex measures its relative importance within a graph. Centrality plays a key role in network analysis and has been widely studied using different methods. Inspired by the idea of vertex centrality, a novel centrality guided clustering (CGC) is proposed in this paper. Unlike traditional clustering methods, which usually choose the initial center of a cluster randomly, the CGC clustering algorithm starts from a "LEADER"--a vertex with the highest centrality score--and a new "member" is added into the same cluster as the "LEADER" when some criterion is satisfied. The CGC algorithm also supports overlapping membership. Experiments on three benchmark social network data sets are presented and the results indicate that the proposed CGC algorithm works well in social network clustering.
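
    The leader-first idea in the record above can be sketched as follows. This is not the authors' CGC algorithm: the abstract does not state which centrality measure or membership criterion is used, so the sketch assumes degree centrality and a hypothetical criterion (a vertex joins when at least `min_fraction` of its neighbours are already in the cluster). It also produces non-overlapping clusters, whereas CGC is said to support overlapping membership.

```python
def leader_clusters(adj, min_fraction=0.5):
    """Grow clusters outward from the most central unassigned vertex.

    adj maps each vertex to the set of its neighbours (undirected graph).
    Degree centrality and the min_fraction rule are illustrative assumptions.
    """
    centrality = {v: len(nbrs) for v, nbrs in adj.items()}
    unassigned = set(adj)
    clusters = []
    while unassigned:
        # The "leader" is the unassigned vertex with the highest centrality;
        # sorting first makes tie-breaking deterministic.
        leader = max(sorted(unassigned), key=lambda v: centrality[v])
        cluster = {leader}
        unassigned.discard(leader)
        grew = True
        while grew:
            grew = False
            for v in sorted(unassigned):
                nbrs = adj[v]
                if nbrs and len(nbrs & cluster) / len(nbrs) >= min_fraction:
                    cluster.add(v)
                    unassigned.discard(v)
                    grew = True
        clusters.append(cluster)
    return clusters


# Two triangles joined by a single bridge edge c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
found = leader_clusters(graph)
```

    On this toy graph the two bridge endpoints have the highest degree, so each becomes the leader of its own triangle.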

  7. Delineation of river bed-surface patches by clustering high-resolution spatial grain size data

    NASA Astrophysics Data System (ADS)

    Nelson, Peter A.; Bellugi, Dino; Dietrich, William E.

    2014-01-01

    The beds of gravel-bed rivers commonly display distinct sorting patterns, which at length scales of ~ 0.1 - 1 channel widths appear to form an organization of patches or facies. This paper explores alternatives to traditional visual facies mapping by investigating methods of patch delineation in which clustering analysis is applied to a high-resolution grid of spatial grain-size distributions (GSDs) collected during a flume experiment. Specifically, we examine four clustering techniques: 1) partitional clustering of grain-size distributions with the k-means algorithm (assigning each GSD to a type of patch based solely on its distribution characteristics), 2) spatially-constrained agglomerative clustering ("growing" patches by merging adjacent GSDs, thus generating a hierarchical structure of patchiness), 3) spectral clustering using Normalized Cuts (using the spatial distance between GSDs and the distribution characteristics to generate a matrix describing the similarity between all GSDs, and using the eigenvalues of this matrix to divide the bed into patches), and 4) fuzzy clustering with the fuzzy c-means algorithm (assigning each GSD a membership probability to every patch type). For each clustering method, we calculate metrics describing how well-separated cluster-average GSDs are and how patches are arranged in space. We use these metrics to compute optimal clustering parameters, to compare the clustering methods against each other, and to compare clustering results with patches mapped visually during the flume experiment. All clustering methods produced better-separated patch GSDs than the visually-delineated patches. Although they do not produce crisp cluster assignment, fuzzy algorithms provide useful information that can characterize the uncertainty of a location on the bed belonging to any particular type of patch, and they can be used to characterize zones of transition from one patch to another. The extent to which spatial information influences
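
    Of the four techniques listed in the record above, fuzzy c-means differs in that each observation receives a graded membership in every cluster rather than a single label. The sketch below shows only the standard membership update for fixed cluster centres, not the full iterative algorithm. The function name, the Euclidean distance, the fuzzifier value m = 2, and the example points are assumptions for illustration and are not taken from the paper.

```python
import math


def fuzzy_memberships(point, centres, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from the point to centre i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centres]
    hits = dists.count(0.0)
    if hits:
        # The point coincides with one or more centres: share membership there.
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


# A point one unit from the first centre and two units from the second.
u = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

    The memberships always sum to one, and values close to an even split mark observations that sit between clusters, which is how such output can flag zones of transition between patches.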

  8. Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.

    PubMed

    Barrett, Julie A; Siegel, Jay A; Goodpaster, John V

    2011-01-01

    This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.

  9. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    PubMed

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing its geographical origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used to develop an elements database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the samples. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, and the classification results were significantly associated with the marine distribution of the samples. CA and PCA were effective methods for elemental analysis of sea cucumber Apostichopus japonicus samples. The mineral element contents of the samples were good chemical descriptors for differentiating their geographical origins.

  10. Quality assessment of cortex cinnamomi by HPLC chemical fingerprint, principle component analysis and cluster analysis.

    PubMed

    Yang, Jie; Chen, Li-Hong; Zhang, Qin; Lai, Mao-Xiang; Wang, Qiang

    2007-06-01

    HPLC fingerprint analysis, principal component analysis (PCA), and cluster analysis were introduced for quality assessment of Cortex cinnamomi (CC). The fingerprint of CC was developed and validated by analyzing 30 samples of CC from different species and geographic locations. Seventeen chromatographic peaks were selected as characteristic peaks and their relative peak areas (RPA) were calculated for quantitative expression of the HPLC fingerprints. The correlation coefficients of similarity in chromatograms were higher than 0.95 for the same species while much lower than 0.6 for different species. In addition, two principal components (PCs) were extracted by PCA. PC1 separated Cinnamomum cassia from other species, capturing 56.75% of the variance, while PC2 contributed to their further separation, capturing 19.08% of the variance. The scores of the samples showed that the samples could be clustered reasonably into different groups corresponding to different species and different regions. The scores and loading plots together revealed different chemical properties of each group clearly. The cluster analysis confirmed the results of PCA. Therefore, HPLC fingerprinting in combination with chemometric techniques provides a very flexible and reliable method for quality assessment of traditional Chinese medicines.

  11. Critical clusters in interdependent economic sectors. A data-driven spectral clustering analysis

    NASA Astrophysics Data System (ADS)

    Oliva, Gabriele; Setola, Roberto; Panzieri, Stefano

    2016-10-01

    In this paper we develop a data-driven hierarchical clustering methodology to group the economic sectors of a country in order to highlight strongly coupled groups that are weakly coupled with other groups. Specifically, we consider an input-output representation of the coupling among the sectors and we interpret the relation among sectors as a directed graph; then we recursively apply the spectral clustering methodology over the graph, without a priori information on the number of groups that have to be obtained. In order to do this, we resort to the eigengap criterion, where a suitable number of groups is selected automatically based on the intensity and structure of the coupling among the sectors. We validate the proposed methodology considering a case study for Italy, inspecting how the coupling among clusters and sectors changes from the year 1995 to 2011, showing that over these years the Italian economic structure underwent deep changes, becoming more and more interdependent, i.e., a large part of the economy has become tightly coupled.
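
    The eigengap criterion mentioned in the record above can be illustrated with the common textbook heuristic: sort the eigenvalues of a graph Laplacian in ascending order and choose the number of groups k at the largest jump between consecutive eigenvalues. The sketch below assumes the eigenvalues have already been computed with a linear algebra library. The function name and the example spectrum are hypothetical, and the paper's recursive, directed-graph variant may differ in detail.

```python
def eigengap_k(eigenvalues, max_k=None):
    """Pick the number of clusters k at the largest gap in the sorted spectrum.

    The gap that follows the k-th smallest eigenvalue is vals[k] - vals[k - 1];
    a large gap there suggests k well-separated groups.
    """
    vals = sorted(eigenvalues)
    limit = len(vals) - 1 if max_k is None else min(max_k, len(vals) - 1)
    gaps = [vals[k] - vals[k - 1] for k in range(1, limit + 1)]
    # gaps[i] belongs to k = i + 1
    return 1 + max(range(len(gaps)), key=gaps.__getitem__)


# Hypothetical Laplacian spectrum: three near-zero eigenvalues, then a jump.
k = eigengap_k([0.0, 0.01, 0.02, 0.9, 1.1, 1.3])
```

    Because the choice depends only on the spectrum, no target number of groups has to be supplied in advance, which matches the record's point about working without a priori information.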

  12. Bivariate Mixed Effects Analysis of Clustered Data with Large Cluster Sizes.

    PubMed

    Zhang, Daowen; Sun, Jie Lena; Pieper, Karen

    2016-10-01

    Linear mixed effects models are widely used to analyze a clustered response variable. Motivated by a recent study to examine and compare the hospital length of stay (LOS) between patients undertaking percutaneous coronary intervention (PCI) and coronary artery bypass graft (CABG) from several international clinical trials, we proposed a bivariate linear mixed effects model for the joint modeling of clustered PCI and CABG LOS's where each clinical trial is considered a cluster. Due to the large number of patients in some trials, commonly used commercial statistical software for fitting (bivariate) linear mixed models failed to run since it could not allocate enough memory to invert large dimensional matrices during the optimization process. We consider ways to circumvent the computational problem in the maximum likelihood (ML) inference and restricted maximum likelihood (REML) inference. In particular, we developed an expectation-maximization (EM) algorithm for the REML inference and presented an ML implementation using existing software. The new REML EM algorithm is easy to implement and computationally stable and efficient. With this REML EM algorithm, we were able to analyze the LOS data and obtain meaningful results.

  13. Clinical Significance of Asthma Clusters by Longitudinal Analysis in Korean Asthma Cohort

    PubMed Central

    Kim, Sujeong; Yoon, Sun-young; Kwon, Hyouk-Soo; Chang, Yoon-Seok; Cho, You Sook; Jang, An-Soo; Park, Jung Won; Nahm, Dong-Ho; Yoon, Ho-Joo; Cho, Sang-Heon; Cho, Young-Joo; Choi, ByoungWhui; Moon, Hee-Bom; Kim, Tae-Bum

    2013-01-01

    Background We have previously identified four distinct groups of asthma patients in Korean cohorts using cluster analysis: (A) smoking asthma, (B) severe obstructive asthma, (C) early-onset atopic asthma, and (D) late-onset mild asthma. Methods and Results A longitudinal analysis of each cluster in a Korean adult asthma cohort was performed to investigate the clinical significance of asthma clusters over 12 months. Cluster A showed relatively high asthma control test (ACT) scores but relatively low FEV1 scores, despite a high percentage of systemic corticosteroid use. Cluster B had the lowest mean FEV1, ACT, and the quality of life questionnaire for adult Korean asthmatics (QLQAKA) scores throughout the year, even though the percentage of systemic corticosteroid use was the highest among the four clusters. Cluster C was ranked second in terms of FEV1, with the second lowest percentage of systemic corticosteroid use, and showed a marked improvement in subjective symptoms over time. Cluster D consistently showed the highest FEV1, the lowest systemic corticosteroid use, and had high ACT and QLQAKA scores. Conclusion Our asthma clusters had clinical significance with consistency among clusters over 12 months. These distinctive phenotypes may be useful in classifying asthma in real practice. PMID:24391784

  14. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms

    PubMed Central

    Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John

    2015-01-01

    Objective We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700

  15. Visual cluster analysis and pattern recognition template and methods

    SciTech Connect

    Osbourn, G.C.; Martinez, R.F.

    1993-12-31

    This invention is comprised of a method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  16. Visual cluster analysis and pattern recognition template and methods

    DOEpatents

    Osbourn, G.C.; Martinez, R.F.

    1999-05-04

    A method of clustering using a novel template to define a region of influence is disclosed. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques. 30 figs.

  17. Visual cluster analysis and pattern recognition template and methods

    DOEpatents

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    1999-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  18. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    SciTech Connect

    Lalonde, Michel; Wassenaar, Richard; Wells, R. Glenn; Birnie, David; Ruddy, Terrence D.

    2014-07-15

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its ability to predict CRT outcome, as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria, and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73; p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity.
Cluster

  19. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    ERIC Educational Resources Information Center

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  20. Abundance analysis of the outer halo globular cluster Palomar 14

    NASA Astrophysics Data System (ADS)

    Çalışkan, Ş.; Christlieb, N.; Grebel, E. K.

    2012-01-01

    We determine the elemental abundances of nine red giant stars belonging to Palomar 14 (Pal 14). Pal 14 is an outer halo globular cluster (GC) at a distance of ~70 kpc. Our abundance analysis is based on high-resolution spectra and one-dimensional stellar model atmospheres. We derived the abundances for the iron peak elements Sc, V, Cr, Mn, Co, Ni, the α-elements O, Mg, Si, Ca, Ti, the light odd element Na, and the neutron-capture elements Y, Zr, Ba, La, Ce, Nd, Eu, Dy, and Cu. Our data do not permit us to investigate light element (i.e., O to Mg) abundance variations. The neutron-capture elements show an r-process signature. We compare our measurements with the abundance ratios of inner and other outer halo GCs, halo field stars, GCs of recognized extragalactic origin, and stars in dwarf spheroidal galaxies (dSphs). The abundance pattern of Pal 14 is almost identical to those of Pal 3 and Pal 4, the next distant members of the outer halo GC population after Pal 14. The abundance pattern of Pal 14 is also similar to those of the inner halo GCs, halo field stars, and GCs of recognized extragalactic origin, but differs from what is customarily found in dSphs field stars. The abundance properties of Pal 14, as well as those of the other outer halo GCs, are thus compatible with an accretion origin from dSphs. Whether or not GC accretion played a role, it seems that the formation conditions of outer halo GCs and GCs in dSphs were similar. Based on observations collected at the European Southern Observatory, Chile (Program IDs 077.B-0769). Tables A.1 and A.2 are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/537/A83

  1. Global Myeloma Research Clusters, Output, and Citations: A Bibliometric Mapping and Clustering Analysis

    PubMed Central

    Andersen, Jens Peter; Bøgsted, Martin; Dybkær, Karen; Mellqvist, Ulf-Henrik; Morgan, Gareth J.; Goldschmidt, Hartmut; Dimopoulos, Meletios A.; Einsele, Hermann; San Miguel, Jesús; Palumbo, Antonio; Sonneveld, Pieter; Johnsen, Hans Erik

    2015-01-01

    Background International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases. Methods and Findings We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005–2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90% of publications) was from the US and EU over the study period. The output and impact in terms of citations identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013. Conclusion and Perspective Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care. PMID:25629620

  2. Differences Between Ward's and UPGMA Methods of Cluster Analysis: Implications for School Psychology.

    ERIC Educational Resources Information Center

    Hale, Robert L.; Dougherty, Donna

    1988-01-01

    Compared the efficacy of two methods of cluster analysis, the unweighted pair-groups method using arithmetic averages (UPGMA) and Ward's method, for students grouped on intelligence, achievement, and social adjustment by both clustering methods. Found UPGMA more efficacious based on output, on cophenetic correlation coefficients generated by each…
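
    The cophenetic correlation coefficient mentioned in this record measures how faithfully a dendrogram preserves the original pairwise distances. The following is a minimal sketch of that idea for UPGMA only, assuming Euclidean distances on small point sets; the function names `upgma_cophenetic` and `pearson` are illustrative and are not taken from the study.

```python
import itertools
import math


def upgma_cophenetic(points):
    """Run UPGMA (average linkage) and return (original, cophenetic) distances.

    Both lists are ordered by itertools.combinations(range(n), 2).
    """
    n = len(points)
    pairs = list(itertools.combinations(range(n), 2))
    dist = {p: math.dist(points[p[0]], points[p[1]]) for p in pairs}
    clusters = [[i] for i in range(n)]
    coph = {}
    while len(clusters) > 1:
        # Find the two clusters with the smallest average pairwise distance.
        best = None
        for a, b in itertools.combinations(range(len(clusters)), 2):
            total = sum(dist[tuple(sorted((i, j)))]
                        for i in clusters[a] for j in clusters[b])
            avg = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        height, a, b = best
        # Each cross-cluster pair first joins at this merge height,
        # which is its cophenetic distance.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[tuple(sorted((i, j)))] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [dist[p] for p in pairs], [coph[p] for p in pairs]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den


# Two well-separated pairs of points (made-up data).
original, cophenetic = upgma_cophenetic([(0, 0), (0, 1), (5, 0), (5, 1)])
print(pearson(original, cophenetic))
```

    A value close to 1 indicates that the tree distorts the input distances very little; comparing this coefficient across linkage methods is one way to judge which method fits a data set better.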

  3. Application of Cluster Analysis to the Study of Piagetian Stages of Intellectual Development.

    ERIC Educational Resources Information Center

    DeLuca, Frederick P.

    1981-01-01

    Reexamined Piagetian stages of males (N=182) and females (N=176), ages nine to eighteen, using cluster analysis, and sought information concerning occurrence of stages and influence of different tasks and gender on cluster patterns. Findings, among others, indicate that deviation from Piagetian stages was influenced by gender and type of task.…

  4. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    ERIC Educational Resources Information Center

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  5. A method of using cluster analysis to study statistical dependence in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    A technique is presented that uses both cluster analysis and a Monte Carlo significance test of clusters to discover associations between variables in multidimensional data. The method is applied to an example of a noisy function in three-dimensional space, to a sample from a mixture of three bivariate normal distributions, and to the well-known Fisher's Iris data.

  6. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    ERIC Educational Resources Information Center

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on e-learning systems. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  7. Hierarchical clustering analysis of flexible GBR 12909 dialkyl piperazine and piperidine analogs

    NASA Astrophysics Data System (ADS)

    Gilbert, Kathleen M.; Venanzi, Carol A.

    2006-04-01

    Pharmacophore modeling of large, drug-like molecules, such as the dopamine reuptake inhibitor GBR 12909, is complicated by their flexibility. A comprehensive hierarchical clustering study of two GBR 12909 analogs was performed to identify representative conformers for input to three-dimensional quantitative structure-activity relationship studies of closely-related analogs. Two data sets of more than 700 conformers each produced by random search conformational analysis of a piperazine and a piperidine GBR 12909 analog were studied. Several clustering studies were carried out based on different feature sets that include the important pharmacophore elements. The distance maps, the plot of the effective number of clusters versus actual number of clusters, and the novel derived clustering statistic, percentage change in the effective number of clusters, were shown to be useful in determining the appropriate clustering level. Six clusters were chosen for each analog, each representing a different region of the torsional angle space that determines the relative orientation of the pharmacophore elements. Conformers of each cluster that are representative of these regions were identified and compared for each analog. This study illustrates the utility of using hierarchical clustering for the classification of conformers of highly flexible molecules in terms of the three-dimensional spatial orientation of key pharmacophore elements.

  8. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    PubMed

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  9. A Bayesian Analysis of the Ages of Four Open Clusters

    NASA Astrophysics Data System (ADS)

    Jeffery, Elizabeth J.; von Hippel, Ted; van Dyk, David A.; Stenning, David C.; Robinson, Elliot; Stein, Nathan; Jefferys, William H.

    2016-09-01

    In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn-off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional “by-eye” isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 ± 0.05, 1.02 ± 0.02, 1.64 ± 0.04, and 0.860 ± 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to a higher precision than that offered by these traditional methods of isochrone fitting.

  10. Nucleotide sequence and transcriptional analysis of the type A2 neurotoxin gene cluster in Clostridium botulinum.

    PubMed

    Dineen, Sean S; Bradshaw, Marite; Karasek, Charles E; Johnson, Eric A

    2004-06-01

    The nucleotide sequences of the upstream regions of the botulinum neurotoxin type A1 (BoNT/A1) cluster of Clostridium botulinum strain NCTC 2916 and the BoNT/A2 cluster of strain Kyoto-F were determined. A novel gene, designated orfx3, was identified following the orfx2 gene in both clusters. ORF-X2 and ORF-X3 exhibit similarity to the BoNT cluster associated P-47 protein. The BoNT/A1 and BoNT/A2 clusters share a similar gene arrangement, but exhibit differences in the spacing between certain genes. Sequences with similarity to transposases were identified in these intergenic regions, suggesting that these differences arose from an ancestral insertion event. Transcriptional analysis of the BoNT/A2 cluster revealed that the genes of the cluster are primarily synthesized as three polycistronic transcripts. Two divergent polycistronic transcripts, one encoding the orfx1, orfx2, and orfx3 genes, the second encoding the p47, ntnh, and bont/a2 genes, are transcribed from conserved BoNT cluster promoters. The third polycistronic transcript, expressed at low levels, encodes the positive regulatory botR gene and the orfx genes. This is the first complete analysis of a botulinum toxin A2 cluster.

  11. Functional clustering algorithm for the analysis of dynamic network data

    NASA Astrophysics Data System (ADS)

    Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.

    2009-05-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  12. Deconstruction and analysis of multiphonic clusters in the modern flute

    NASA Astrophysics Data System (ADS)

    Barravecchio, Shauna

    The modern flute has been acoustically analyzed in great detail by many, but only from the point of view of traditional playing techniques. Very little research exists to date on more modern, "extended" technique performance. This paper explores the production of multiphonic note clusters as played on the modern flute. Several clusters as notated in James Pellerite's book on flute fingerings are recorded and analyzed for frequency content. Each one is then compared to the expected frequency content based on John Backus' 1978 paper on woodwind multiphonics. Using this information, the fingering configuration of each cluster can be deconstructed and each component pitch explained in terms of the root frequencies, overtone series, and sideband frequencies.

  13. Molecular-dynamics analysis of mobile helium cluster reactions near surfaces of plasma-exposed tungsten

    SciTech Connect

    Hu, Lin; Maroudas, Dimitrios; Hammond, Karl D.; Wirth, Brian D.

    2015-10-28

    We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (He_n, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He_4 and He_5 clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen recycling and tritium retention in fusion tokamaks.

  14. Cluster Analysis of Symptoms Among Patients with Upper Extremity Musculoskeletal Disorders

    PubMed Central

    Piligian, George; Glutting, Joseph J.; Hanlon, Alexandra; Frings-Dresen, Monique H. W.; Sluiter, Judith K.

    2010-01-01

    Introduction Some musculoskeletal disorders of the upper extremity are not readily classified. The study objective was to determine if there were symptom patterns in self-identified repetitive strain injury (RSI) patients. Methods Members (n = 700) of the Dutch RSI Patients Association filled out a detailed symptom questionnaire. Factor analysis followed by cluster analysis grouped correlated symptoms. Results Eight clusters, based largely on symptom severity and quality, were formulated. All but one cluster showed diffuse symptoms; the exception was characterized by bilateral symptoms of stiffness and aching pain in the shoulder/neck. Conclusions Case definitions which localize upper extremity musculoskeletal disorders to a specific anatomical area may be incomplete. Future clustering studies should rely on both signs and symptoms. Data could be collected from health care providers prospectively to determine the possible prognostic value of the identified clusters with respect to natural history, chronicity, and return to work. PMID:20414797

  15. Tobacco, Marijuana, and Alcohol Use in University Students: A Cluster Analysis

    PubMed Central

    Primack, Brian A.; Kim, Kevin H.; Shensa, Ariel; Sidani, Jaime E.; Barnett, Tracey E.; Switzer, Galen E.

    2012-01-01

    Objective Segmentation of populations may facilitate development of targeted substance abuse prevention programs. We aimed to partition a national sample of university students according to profiles based on substance use. Participants We used 2008–2009 data from the National College Health Assessment from the American College Health Association. Our sample consisted of 111,245 individuals from 158 institutions. Method We partitioned the sample using cluster analysis according to current substance use behaviors. We examined the association of cluster membership with individual and institutional characteristics. Results Cluster analysis yielded six distinct clusters. Three individual factors—gender, year in school, and fraternity/sorority membership—were the most strongly associated with cluster membership. Conclusions In a large sample of university students, we were able to identify six distinct patterns of substance abuse. It may be valuable to target specific populations of college-aged substance users based on individual factors. However, comprehensive intervention will require a multifaceted approach. PMID:22686360

  16. An Empirical Comparison of Variable Standardization Methods in Cluster Analysis.

    ERIC Educational Resources Information Center

    Schaffer, Catherine M.; Green, Paul E.

    1996-01-01

    The common marketing research practice of standardizing the columns of a persons-by-variables data matrix prior to clustering the entities corresponding to the rows was evaluated with 10 large-scale data sets. Results indicate that the column standardization practice may be problematic for some kinds of data that marketing researchers used for…
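
    To make the practice under evaluation concrete, here is a minimal sketch of column standardization applied to a persons-by-variables matrix before distances are computed. It assumes z-score standardization (subtract the column mean, divide by the sample standard deviation); the record does not say which standardization variants were compared, and the data and function name are made up for illustration.

```python
import math
import statistics


def standardize_columns(rows):
    """Z-score each column of a persons-by-variables matrix (list of rows)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]  # sample SD, must be > 0
    return [[(value - m) / s for value, m, s in zip(row, means, sds)]
            for row in rows]


# Hypothetical data: column 0 is on a 1-3 scale, column 1 on a 100-500 scale.
raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaled = standardize_columns(raw)

# Before standardization the second variable dominates the distance;
# afterwards both variables contribute equally.
print(math.dist(raw[0], raw[1]), math.dist(scaled[0], scaled[1]))
```

    Equalizing the variables in this way is exactly what can be problematic: if the large-variance variable carries the real group structure, rescaling it shrinks the separation between clusters.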

  17. Functional Analysis of a Mosquito Short Chain Dehydrogenase Cluster

    PubMed Central

    Mayoral, Jaime G.; Leonard, Kate T.; Defelipe, Lucas A.; Turjanski, Adrian G.; Nouzova, Marcela; Noriega, Fernando G.

    2013-01-01

    The short chain dehydrogenases (SDR) constitute one of the oldest and largest families of enzymes with over 46,000 members in sequence databases. About 25% of all known dehydrogenases belong to the SDR family. SDR enzymes have critical roles in lipid, amino acid, carbohydrate, hormone and xenobiotic metabolism as well as in redox sensor mechanisms. This family is present in archaea, bacteria, and eukaryota, emphasizing their versatility and fundamental importance for metabolic processes. We identified a cluster of eight SDRs in the mosquito Aedes aegypti (AaSDRs). Members of the cluster differ in tissue specificity and developmental expression. Heterologous expression produced recombinant proteins that had diverse substrate specificities, but distinct from the conventional insect alcohol (ethanol) dehydrogenases. They are all NADP+-dependent and they have S-enantioselectivity and preference for secondary alcohols with 8–15 carbons. Homology modeling was used to build the structure of AaSDR1 and two additional cluster members. The computational study helped explain the selectivity towards the (10S)-isomers as well as the reduced activity of AaSDR4 and AaSDR9 for longer isoprenoid substrates. Similar clusters of SDRs are present in other species of insects, suggesting similar selection mechanisms causing duplication and diversification of this family of enzymes. PMID:23238893

  18. The Therapist Personal Reaction Questionnaire: A Cluster Analysis.

    ERIC Educational Resources Information Center

    Tryon, Georgiana Shick

    1989-01-01

    Analyzed Therapist Personal Reaction Questionnaire. Participants consisted of 6 practicum trainee graduate students and 208 undergraduate clients. Found two clusters--one related to counselor feelings toward client, one related to counselor feelings toward interview. Results indicated that more attractive clients were seen more frequently and for…

  19. The Validity of the Inverse Scree Test for Cluster Analysis.

    ERIC Educational Resources Information Center

    Lathrop, Richard G.; Williams, Janice E.

    1990-01-01

    A Monte Carlo study of the validity of the Inverse Scree Test under conditions where true group membership is known was conducted. Fifty cluster analyses of each distribution involving 2 to 5 true groups of 3,000 simulated subjects were made. Implications for the data analyst are discussed. (SLD)

  20. The Complementary Use of Cluster and Factor Analysis Methods.

    ERIC Educational Resources Information Center

    Gorman, Bernard S.; Primavera, Louis H.

    1983-01-01

    Factor and cluster analyses are distinctly different multivariate procedures with different goals. However, when used in a complementary fashion, each set of methods can be used to enhance the interpretation of results found in the other set of methods. Simple examples illustrating the joint use of the methods are provided. (Author)

  1. First photometric analysis of six open cluster candidates

    NASA Astrophysics Data System (ADS)

    Piatti, A. E.; Clariá, J. J.; Ahumada, A. V.

    2011-10-01

    In this study we try to clarify the nature of six catalogued open cluster (OC) candidates using CCD UBVI_{KC} photometry down to V = 22. The objects are Haffner 3, Haffner 5, NGC 2368, Haffner 25, Hogg 3 and Hogg 4. None of them was found to be a real OC.

  2. Galaxy cluster mass estimation from stacked spectroscopic analysis

    NASA Astrophysics Data System (ADS)

    Farahi, Arya; Evrard, August E.; Rozo, Eduardo; Rykoff, Eli S.; Wechsler, Risa H.

    2016-08-01

    We use simulated galaxy surveys to study: (i) how galaxy membership in redMaPPer clusters maps to the underlying halo population, and (ii) the accuracy of a mean dynamical cluster mass, $M_\sigma(\lambda)$, derived from stacked pairwise spectroscopy of clusters with richness $\lambda$. Using ~130 000 galaxy pairs patterned after the Sloan Digital Sky Survey (SDSS) redMaPPer cluster sample study of Rozo et al., we show that the pairwise velocity probability density function of central-satellite pairs with $m_i < 19$ in the simulation matches the form seen in Rozo et al. Through joint membership matching, we deconstruct the main Gaussian velocity component into its halo contributions, finding that the top-ranked halo contributes ~60 per cent of the stacked signal. The halo mass scale inferred by applying the virial scaling of Evrard et al. to the velocity normalization matches, to within a few per cent, the log-mean halo mass derived through galaxy membership matching. We apply this approach, along with miscentring and galaxy velocity bias corrections, to estimate the log-mean matched halo mass at z = 0.2 of SDSS redMaPPer clusters. Employing the velocity bias constraints of Guo et al., we find $\langle \ln M \mid \lambda \rangle = \ln(M_{30}) + \alpha_m \ln(\lambda/30)$ with $M_{30} = (1.56 \pm 0.35) \times 10^{14}\,M_\odot$ and $\alpha_m = 1.31 \pm 0.06_{\rm stat} \pm 0.13_{\rm sys}$. Systematic uncertainty in the velocity bias of satellite galaxies overwhelmingly dominates the error budget.
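
    The mass-richness scaling quoted at the end of this abstract can be evaluated directly. The sketch below plugs in only the central values given above (normalization 1.56 × 10^14 solar masses at richness 30, slope 1.31) and ignores the quoted statistical and systematic uncertainties; the function name `log_mean_mass` is an illustrative choice.

```python
import math

# Central values quoted in the abstract; uncertainties are not propagated.
M30 = 1.56e14   # solar masses, normalization at richness lambda = 30
ALPHA_M = 1.31  # slope of the scaling with richness


def log_mean_mass(richness):
    """Return the mass (solar masses) whose logarithm is the log-mean at this richness."""
    ln_mass = math.log(M30) + ALPHA_M * math.log(richness / 30.0)
    return math.exp(ln_mass)


# At the pivot richness the relation returns the normalization itself;
# doubling the richness multiplies the mass by 2**1.31.
print(log_mean_mass(30.0), log_mean_mass(60.0))
```

    Because the relation is linear in log space, it is a power law in mass: a slope above 1 means mass grows faster than richness.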

  3. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways

    PubMed Central

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques

    2008-01-01

    The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, clique discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, enabling users to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources. PMID:18524799

  4. Choosing appropriate analysis methods for cluster randomised cross-over trials with a binary outcome.

    PubMed

    Morgan, Katy E; Forbes, Andrew B; Keogh, Ruth H; Jairath, Vipul; Kahan, Brennan C

    2017-01-30

    In cluster randomised cross-over (CRXO) trials, clusters receive multiple treatments in a randomised sequence over time. In such trials, there is usually correlation between patients in the same cluster. In addition, within a cluster, patients in the same period may be more similar to each other than to patients in other periods. We demonstrate that it is necessary to account for these correlations in the analysis to obtain correct Type I error rates. We then use simulation to compare different methods of analysing a binary outcome from a two-period CRXO design. Our simulations demonstrated that hierarchical models without random effects for period-within-cluster, which do not account for any extra within-period correlation, performed poorly with greatly inflated Type I errors in many scenarios. In scenarios where extra within-period correlation was present, a hierarchical model with random effects for cluster and period-within-cluster only had correct Type I errors when there were large numbers of clusters; with small numbers of clusters, the error rate was inflated. We also found that generalised estimating equations did not give correct error rates in any scenarios considered. An unweighted cluster-level summary regression performed best overall, maintaining an error rate close to 5% for all scenarios, although it lost power when extra within-period correlation was present, especially for small numbers of clusters. Results from our simulation study show that it is important to model both levels of clustering in CRXO trials, and that any extra within-period correlation should be accounted for. Copyright © 2016 John Wiley & Sons, Ltd.

  5. Groundwater source contamination mechanisms: Physicochemical profile clustering, risk factor analysis and multivariate modelling

    NASA Astrophysics Data System (ADS)

    Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.

    2014-04-01

    An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.

  6. Groundwater source contamination mechanisms: physicochemical profile clustering, risk factor analysis and multivariate modelling.

    PubMed

    Hynds, Paul; Misstear, Bruce D; Gill, Laurence W; Murphy, Heather M

    2014-04-01

    An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p=0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p<0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.

  7. Does published orthodontic research account for clustering effects during statistical data analysis?

    PubMed

    Koletsi, Despina; Pandis, Nikolaos; Polychronopoulou, Argy; Eliades, Theodore

    2012-06-01

    In orthodontics, multiple site observations within patients or multiple observations collected at consecutive time points are often encountered. Clustered designs require larger sample sizes compared to individual randomized trials and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this study to assess to what degree clustering effects are considered during design and data analysis in the three major orthodontic journals. The contents of the most recent 24 issues of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), Angle Orthodontist (AO), and European Journal of Orthodontics (EJO) from December 2010 backwards were hand searched. Articles with clustering effects were identified, along with whether the authors accounted for those effects. Additionally, information was collected on: involvement of a statistician, single or multicenter study, number of authors in the publication, geographical area, and statistical significance. Of the 1584 articles, 1062 remained after exclusions and were assessed for clustering effects, of which 250 (23.5 per cent) were considered to have clustering effects in the design (kappa = 0.92, 95 per cent CI: 0.67-0.99 for inter-rater agreement). Of the studies with clustering effects, 63 (25.20 per cent) had indicated accounting for clustering effects. There was evidence that the studies published in the AO have higher odds of accounting for clustering effects [AO versus AJODO: odds ratio (OR) = 2.17, 95 per cent confidence interval (CI): 1.06-4.43, P = 0.03; EJO versus AJODO: OR = 1.90, 95 per cent CI: 0.84-4.24, non-significant; and EJO versus AO: OR = 1.15, 95 per cent CI: 0.57-2.33, non-significant]. The results of this study indicate that only about a quarter of the studies with clustering effects account for this in statistical data analysis.

  8. The different clinical faces of obstructive sleep apnoea: a cluster analysis.

    PubMed

    Ye, Lichuan; Pien, Grace W; Ratcliffe, Sarah J; Björnsdottir, Erla; Arnardottir, Erna Sif; Pack, Allan I; Benediktsdottir, Bryndis; Gislason, Thorarinn

    2014-12-01

    Although commonly observed in clinical practice, the heterogeneity of obstructive sleep apnoea (OSA) clinical presentation has not been formally characterised. This study was the first to apply cluster analysis to identify subtypes of patients with OSA who experience distinct combinations of symptoms and comorbidities. An analysis of baseline data from the Icelandic Sleep Apnoea Cohort (822 patients with newly diagnosed moderate-to-severe OSA) was performed. Three distinct clusters were identified. They were classified as the "disturbed sleep group" (cluster 1), "minimally symptomatic group" (cluster 2) and "excessive daytime sleepiness group" (cluster 3), consisting of 32.7%, 24.7% and 42.6% of the entire cohort, respectively. The probabilities of having comorbid hypertension and cardiovascular disease were highest in cluster 2 but lowest in cluster 3. The clusters did not differ significantly in terms of sex, body mass index or apnoea-hypopnoea index. Patients with OSA have different patterns of clinical presentation, which need to be communicated to both the lay public and the professional community with the goal of facilitating care-seeking and early identification of OSA. Identifying distinct clinical profiles of OSA creates a foundation for offering more personalised therapies in the future.

  9. Coal analysis by diffuse reflectance near-infrared spectroscopy: Hierarchical cluster and linear discriminant analysis.

    PubMed

    Bona, M T; Andrés, J M

    2007-06-15

    An extensive study was carried out on coal samples from several origins to establish a relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding near-infrared spectral data. This research was developed by applying both quantitative (partial least squares regression, PLS) and qualitative multivariate analysis techniques (hierarchical cluster analysis, HCA; linear discriminant analysis, LDA) to determine a methodology able to estimate property values for a new coal sample. For that, it was necessary to define homogeneous clusters, whose calibration equations could be obtained with accuracy and precision levels comparable to those provided by commercial online analysers, and to study the discrimination level between these groups of samples using only the instrumental variables. These two steps were performed in three different situations depending on the variables used for the pattern recognition: property values, spectral data (principal component analysis, PCA) or a combination of both. The results indicated that the last situation offered the best results in both steps, with the added benefit of outlier detection and removal.

  10. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    SciTech Connect

    Harmon, S; Wendelberger, B; Jeraj, R

    2014-06-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [$^{18}$F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithm or number of clusters (JI$_{\rm mean}$ = 0.3429 ± 0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI$_{\rm range}$: 0.2301–1). Conclusion: Using commonly-used clustering algorithms
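
    The record above scores agreement between cluster assignments with a Jaccard Index but does not say which variant was used. As a rough illustration only, the sketch below uses one common pair-counting definition: the number of item pairs grouped together in both clusterings, divided by the number of pairs grouped together in at least one. The function name `pair_jaccard` and the toy labels are assumptions made for this example, not taken from the study.

```python
from itertools import combinations


def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index between two cluster assignments.

    labels_a and labels_b give one cluster label per item, in the same
    item order. Label values themselves do not need to match.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    both = only_a = only_b = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1        # pair is co-clustered in both assignments
        elif same_a:
            only_a += 1      # co-clustered in A only
        elif same_b:
            only_b += 1      # co-clustered in B only

    denom = both + only_a + only_b
    # If no pair is co-clustered anywhere, the two assignments are
    # identical (all singletons), so treat that as perfect agreement.
    return both / denom if denom else 1.0


if __name__ == "__main__":
    # Hypothetical assignments for four patients from two datasets.
    print(pair_jaccard([0, 0, 1, 1], [1, 1, 0, 0]))
    print(pair_jaccard([0, 0, 1, 1], [0, 0, 0, 1]))
```

    Because only co-membership of pairs is compared, the score is unaffected by how the clusters happen to be numbered in each dataset.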

  11. The DANCE Project: Dynamical Analysis of Nearby Clusters

    NASA Astrophysics Data System (ADS)

    Bouy, H.; Bertin, E.; Cuillandre, J. C.; Moraux, E.; Bouvier, J.; Arevalo Sánchez, M.; Barrado Y Navascués, D.

    We present the results of the DANCE project, a ground-based survey meant to prepare and complement Gaia i) down to the planetary mass regime; ii) in regions of high extinction. The DANCE project takes advantage of archival wide-field surveys to derive precise astrometry, and in particular proper motions, for millions of stars in young nearby associations. We present the first preliminary results obtained for the Pleiades cluster, as well as our immediate objectives for other associations.

  12. Insights into quasar UV spectra using unsupervised clustering analysis

    NASA Astrophysics Data System (ADS)

    Tammour, A.; Gallagher, S. C.; Daley, M.; Richards, G. T.

    2016-06-01

    Machine learning techniques can provide powerful tools to detect patterns in multidimensional parameter space. We use K-means - a simple yet powerful unsupervised clustering algorithm which picks out structure in unlabelled data - to study a sample of quasar UV spectra from the Quasar Catalog of the 10th Data Release of the Sloan Digital Sky Survey (SDSS-DR10) of Paris et al. Detecting patterns in large data sets helps us gain insights into the physical conditions and processes giving rise to the observed properties of quasars. We use K-means to find clusters in the parameter space of the equivalent width (EW), the blue- and red-half-width at half-maximum (HWHM) of the Mg II 2800 Å line, the C IV 1549 Å line, and the C III] 1908 Å blend in samples of broad absorption line (BAL) and non-BAL quasars at redshift 1.6-2.1. Using this method, we successfully recover correlations well-known in the UV regime such as the anti-correlation between the EW and blueshift of the C IV emission line and the shape of the ionizing spectral energy distribution (SED) probed by the strength of He II and the Si III]/C III] ratio. We find this to be particularly evident when the properties of C III] are used to find the clusters, while those of Mg II proved to be less strongly correlated with the properties of the other lines in the spectra such as the width of C IV or the Si III]/C III] ratio. We conclude that unsupervised clustering methods (such as K-means) are powerful methods for finding 'natural' binning boundaries in multidimensional data sets and discuss caveats and future work.
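
    For readers unfamiliar with K-means, the following is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its members). It is not the authors' pipeline: the function name `kmeans`, the random initialisation, and the toy two-feature points are all assumptions for illustration, and a real analysis would normally standardise features such as EW and HWHM first.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's-algorithm K-means on a list of equal-length tuples.

    Returns (labels, centroids). Initial centroids are k points drawn
    at random from the data, controlled by `seed`.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None

    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid (squared distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels

    return labels, centroids


if __name__ == "__main__":
    # Hypothetical two-feature measurements forming two obvious groups.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(data, k=2)
    print(labels, centroids)
```

    The result depends on the initial centroids in general, which is one reason K-means runs are usually repeated with several seeds.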

  13. Hierarchical cluster analysis of matrix effects on 110 pesticide residues in 28 tea matrixes.

    PubMed

    Li, Yan; Pang, Guo-Fang; Fan, Chun-Lin; Chen, Xi

    2013-01-01

    Matrix effects on 110 pesticides in 28 tea matrixes of different varieties and origins by LC/MS/MS were studied, and most of the pesticides exhibited soft and medium signal suppression. To better understand the influence of the tea varieties and the physicochemical characteristics of pesticides on the matrix effects, the multivariate analysis tool called hierarchical cluster analysis was applied. Tea matrixes were grouped into three clusters: unfermented, fermented, and post-fermented teas. Any type of tea can be chosen from each cluster as a corresponding representative matrix within that cluster to make matrix-matched solutions, which could simplify analysis while guaranteeing its accuracy. Matrix effects on most pesticides were similar despite the physicochemical diversities of the pesticides.

  14. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    SciTech Connect

    Data Analysis and Visualization and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  15. Variability in body size and shape of UK offshore workers: A cluster analysis approach.

    PubMed

    Stewart, Arthur; Ledingham, Robert; Williams, Hector

    2017-01-01

    Male UK offshore workers have enlarged dimensions compared with UK norms and knowledge of specific sizes and shapes typifying their physiques will assist a range of functions related to health and ergonomics. A representative sample of the UK offshore workforce (n = 588) underwent 3D photonic scanning, from which 19 extracted dimensional measures were used in k-means cluster analysis to characterise physique groups. Of the 11 resulting clusters four somatotype groups were expressed: one cluster was muscular and lean, four had greater muscularity than adiposity, three had equal adiposity and muscularity and three had greater adiposity than muscularity. Some clusters appeared constitutionally similar to others, differing only in absolute size. These cluster centroids represent an evidence-base for future designs in apparel and other applications where body size and proportions affect functional performance. They also constitute phenotypic evidence providing insight into the 'offshore culture' which may underpin the enlarged dimensions of offshore workers.

  16. Galaxy Cluster Pressure Profiles as Determined by Sunyaev Zel’dovich Effect Observations with MUSTANG and Bolocam. II. Joint Analysis of 14 Clusters

    NASA Astrophysics Data System (ADS)

    Romero, Charles E.; Mason, Brian S.; Sayers, Jack; Mroczkowski, Tony; Sarazin, Craig; Donahue, Megan; Baldi, Alessandro; Clarke, Tracy E.; Young, Alexander H.; Sievers, Jonathan; Dicker, Simon R.; Reese, Erik D.; Czakon, Nicole; Devlin, Mark; Korngut, Phillip M.; Golwala, Sunil

    2017-04-01

    We present pressure profiles of galaxy clusters determined from high-resolution Sunyaev–Zel’dovich (SZ) effect observations of 14 clusters, which span the redshift range of 0.25 < z < 0.89. The procedure simultaneously fits spherical cluster models to MUSTANG and Bolocam data. In this analysis, we adopt the generalized NFW parameterization of pressure profiles to produce our models. Our constraints on ensemble-average pressure profile parameters, in this study $\gamma$, $C_{500}$, and $P_0$, are consistent with those in previous studies, but for individual clusters we find discrepancies with the X-ray derived pressure profiles from the ACCEPT2 database. We investigate potential sources of these discrepancies, especially cluster geometry, electron temperature of the intracluster medium, and substructure. We find that the ensemble mean profile for all clusters in our sample is described by the parameters $[\gamma, C_{500}, P_0] = [0.3^{+0.1}_{-0.1}, 1.3^{+0.1}_{-0.1}, 8.6^{+2.4}_{-2.4}]$, cool core clusters are described by $[\gamma, C_{500}, P_0] = [0.6^{+0.1}_{-0.1}, 0.9^{+0.1}_{-0.1}, 3.6^{+1.5}_{-1.5}]$, and disturbed clusters are described by $[\gamma, C_{500}, P_0] = [0.0^{+0.1}_{-0.0}, 1.5^{+0.1}_{-0.2}, 13.8^{+1.6}_{-1.6}]$. Of the 14 clusters, 4 have clear substructure in our SZ observations, while an additional 2 clusters exhibit potential substructure.
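
    For orientation, the generalized NFW pressure profile named in the record above is commonly written in the form below, which shows where the three fitted parameters enter. The outer slope $\beta$, the transition parameter $\alpha$, and the normalisations $P_{500}$ and $R_{500}$ are not mentioned in the abstract; they are included here as assumptions from the usual form of the parameterization, and the abstract does not say how they were treated in the fit.

```latex
\frac{P(r)}{P_{500}}
  = \frac{P_0}
         {\left(C_{500}\, r / R_{500}\right)^{\gamma}
          \left[\, 1 + \left(C_{500}\, r / R_{500}\right)^{\alpha} \right]^{(\beta - \gamma)/\alpha}}
```

    In this form $\gamma$ sets the inner logarithmic slope, $C_{500}$ the concentration (the scale at which the slope changes), and $P_0$ the overall normalisation, so $\gamma = 0.0$ corresponds to a flat pressure core and larger $\gamma$ to a more peaked one.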

  17. A weak-lensing analysis of the Abell 383 cluster

    NASA Astrophysics Data System (ADS)

    Huang, Z.; Radovich, M.; Grado, A.; Puddu, E.; Romano, A.; Limatola, L.; Fu, L.

    2011-05-01

    Aims: We use deep CFHT and SUBARU uBVRIz archival images of the Abell 383 cluster (z = 0.187) to estimate its mass by weak-lensing. Methods: To this end, we first use simulated images to check the accuracy provided by our Kaiser-Squires-Broadhurst (KSB) pipeline. These simulations include shear testing programme (STEP) 1 and 2 simulations, as well as more realistic simulations of the distortion of galaxy shapes by a cluster with a Navarro-Frenk-White (NFW) profile. From these simulations we estimate the effect of noise on shear measurement and derive the correction terms. The R-band image is used to derive the mass by fitting the observed tangential shear profile with an NFW mass profile. Photometric redshifts are computed from the uBVRIz catalogs. Different methods for the foreground/background galaxy selection are implemented, namely selection by magnitude, color, and photometric redshifts, and the results are compared. In particular, we developed a semi-automatic algorithm to select the foreground galaxies in the color-color diagram, based on the observed colors. Results: Using color selection or photometric redshifts improves the correction of dilution from foreground galaxies: this leads to higher signals in the inner parts of the cluster. We obtain a cluster mass $M_{\rm vir} = 7.5^{+2.7}_{-1.9} \times 10^{14}\,M_\odot$: this value is 20% higher than previous estimates and is more consistent with the mass expected from X-ray data. The R-band luminosity function of the cluster is computed and gives a total luminosity $L_{\rm tot} = (2.14 \pm 0.5) \times 10^{12}\,L_\odot$ and a mass-to-luminosity ratio $M/L \approx 300\,M_\odot/L_\odot$. Based on: data collected with the Subaru Telescope (University of Tokyo) and obtained from the SMOKA, which is operated by the Astronomy Data Center, National Astronomical Observatory of Japan; observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the Canada-France-Hawaii Telescope (CFHT), which is operated by the National Research Council (NRC) of Canada

  18. How Teachers Use and Manage Their Blogs? A Cluster Analysis of Teachers' Blogs in Taiwan

    ERIC Educational Resources Information Center

    Liu, Eric Zhi-Feng; Hou, Huei-Tse

    2013-01-01

    The development of Web 2.0 has ushered in a new set of web-based tools, including blogs. This study focused on how teachers use and manage their blogs. A sample of 165 teachers' blogs in Taiwan was analyzed by factor analysis, cluster analysis and qualitative content analysis. First, the teachers' blogs were analyzed according to six criteria…

  19. Analysis of the nutritional status of algae by Fourier transform infrared chemical imaging

    NASA Astrophysics Data System (ADS)

    Hirschmugl, Carol J.; Bayarri, Zuheir-El; Bunta, Maria; Holt, Justin B.; Giordano, Mario

    2006-09-01

    A new non-destructive method to study the nutritional status of algal cells and their environments is demonstrated. This approach allows rapid examination of whole cells with little or no pre-treatment, providing a large amount of information on the biochemical composition of cells and growth medium. The method is based on the analysis of a collection of infrared (IR) spectra for individual cells; each spectrum describes the biochemical composition of a portion of a cell; a complete set of spectra is used to reconstruct an image of the entire cell. To obtain spatially resolved information, synchrotron radiation was used as a bright IR source. We tested this method on the green flagellate Euglena gracilis; a comparison was conducted between cells grown in nutrient replete conditions (Type 1) and cells allowed to deplete their medium (Type 2). Complete sets of spectra for individual cells of both types were analyzed with agglomerative hierarchical clustering, leading to distinct clusters representative of the two types of cells. The average spectra for the clusters confirmed the similarities between the clusters and the types of cells. The clustering analysis, therefore, allows the distinction of cells of the same species, but with different nutritional histories. In order to facilitate the application of the method and reduce manipulation (washing), we analyzed the cells in the presence of residual medium. The results obtained showed that even with residual medium the outcome of the clustering analysis is reliable. Our results demonstrate the applicability of FTIR microspectroscopy for ecological and ecophysiological studies.
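
    The agglomerative hierarchical clustering used in the record above starts with every spectrum in its own cluster and repeatedly merges the two closest clusters. The abstract does not state the distance or linkage rule, so the sketch below assumes Euclidean distance and average linkage purely for illustration; `average_linkage` and the toy "spectra" are hypothetical names and values.

```python
import math


def average_linkage(points, n_clusters):
    """Naive agglomerative clustering with average linkage.

    points is a list of equal-length numeric tuples. Returns a list of
    clusters, each a list of indices into `points`. Merging stops once
    `n_clusters` clusters remain.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")

    # Start with one singleton cluster per observation.
    clusters = [[i] for i in range(len(points))]

    def linkage(ca, cb):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[a], points[b]) for a in ca for b in cb)
        return total / (len(ca) * len(cb))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Merge cluster j into cluster i (j > i, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return clusters


if __name__ == "__main__":
    # Hypothetical two-channel "spectra" from two kinds of cells.
    spectra = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
    print(average_linkage(spectra, n_clusters=2))
```

    This brute-force version recomputes every linkage at each step and is only meant to show the merge logic; it would be too slow for large spectral data sets.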

  20. An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins

    PubMed Central

    Babbitt, Patricia C.; Ferrin, Thomas E.

    2017-01-01

    Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially—MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method’s novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated

  1. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

    PubMed Central

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-01-01

    Background: Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. Results: In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering, were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. Conclusion: This study shows that SCC is
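
    The shrinkage correlation coefficient itself is not defined in the abstract, so it is not reproduced here. As background only, the sketch below shows the baseline the record compares against: the Pearson correlation between two expression profiles, turned into a dissimilarity (1 minus the correlation) of the kind a clustering method can consume. The function names and the example profiles are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Dissimilarity in [0, 2]: 0 for perfectly correlated profiles."""
    return 1.0 - pearson(x, y)


if __name__ == "__main__":
    # Hypothetical expression levels of three genes over four time points.
    gene_a = [1.0, 2.0, 3.0, 4.0]
    gene_b = [2.0, 4.0, 6.0, 8.0]   # same shape as gene_a, different scale
    gene_c = [4.0, 3.0, 2.0, 1.0]   # opposite trend
    print(correlation_distance(gene_a, gene_b))
    print(correlation_distance(gene_a, gene_c))
```

    Note that this plain version ignores replicate structure entirely, which is the limitation the record says SCC is designed to address.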

  2. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    PubMed Central

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in the hierarchical cluster analysis. Results. The analysis identified three clusters of personality profiles: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  3. On the association of young star clusters and their parental clouds: a statistical fractal analysis

    NASA Astrophysics Data System (ADS)

    Hetem, A.; Gregorio-Hetem, J.; Fernandes, B.; Santos-Silva, T.

    2014-10-01

    We present a study of 21 young star clusters aiming to characterize their association to dense clouds. The structure of the clouds was evaluated by means of the Q statistical fractal analysis, designed to compare their geometric structure with the spatial distribution of the cluster members. The sample was selected from the study by Santos-Silva and Gregorio-Hetem (2012, A&A, 547, A107) that evaluated the radial density profile of the stellar superficial distribution of the young clusters. The fractal dimension and other statistical parameters of most of the sample indicate that there is a good cloud-cluster correlation, when compared to other works based on an artificial distribution of points (Lomax et al. 2011, MNRAS, 412, 627). As presented in a previous work (Fernandes et al. 2012, A&A, 541, A95), the cluster NGC 6530 is the only object of our sample that presents anomalous statistical behaviour. The fractal analysis shows that this cluster has a centrally concentrated distribution of stars that differs from the substructures found in the density distribution of the cloud projected in the A_{V} map, suggesting that the original cloud geometry was changed by the cluster formation.

  4. Student academic performance analysis using fuzzy C-means clustering

    NASA Astrophysics Data System (ADS)

    Rosadi, R.; Akamal; Sudrajat, R.; Kharismawan, B.; Hambali, Y. A.

    2017-01-01

    Grade Point Average (GPA) is commonly used as an indicator of academic performance. Academic performance evaluation is a basic way to assess the progression of student performance. When evaluating students' academic performance, there are occasions where the student data are grouped, especially when the amount of data is large, so that the pattern of data relationships within and among groups can be revealed. Data can be grouped using a clustering method, one of which is the Fuzzy C-Means algorithm. This algorithm is then applied to a set of student data from the Faculty of Mathematics and Natural Sciences, Padjadjaran University.
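
    The grouping step described above can be sketched as follows. This is a minimal, standard-library-only illustration of the textbook Fuzzy C-Means updates, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initialisation, and the toy GPA values are all assumptions made for the example.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means on a list of equal-length tuples.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        new_u = []
        for i in range(n):
            dists = [sum((points[i][d] - centers[j][d]) ** 2
                         for d in range(dim)) ** 0.5 for j in range(c)]
            if any(d == 0.0 for d in dists):
                # Point coincides with a centre: give it full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                e = 2.0 / (m - 1.0)
                row = [1.0 / sum((dists[j] / dists[k]) ** e for k in range(c))
                       for j in range(c)]
            new_u.append(row)
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u

# Hypothetical one-dimensional GPA data, for illustration only.
gpas = [(2.0,), (2.1,), (2.2,), (3.7,), (3.8,), (3.9,)]
centers, memberships = fuzzy_c_means(gpas, c=2)
```

    Unlike hard clustering, each student keeps a graded membership in every group, so borderline cases remain visible instead of being forced into a single cluster.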

  5. Chaotic Artificial Bee Colony Used for Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Yudong; Wu, Lenan; Wang, Shuihua; Huo, Yuankai

    A new approach based on artificial bee colony (ABC) with chaotic theory was proposed to solve the partitional clustering problem. We first investigate the optimization model including both the encoding strategy and the variance ratio criterion (VRC). Second, a chaotic ABC algorithm was developed based on the Rossler attractor. Experiments on three types of artificial data of different degrees of overlapping all demonstrate the CABC is superior to both genetic algorithm (GA) and combinatorial particle swarm optimization (CPSO) in terms of robustness and computation time.
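
    The abstract does not spell out the variance ratio criterion. The sketch below assumes the usual definition (the Calinski-Harabasz index): between-cluster dispersion divided by k - 1, over within-cluster dispersion divided by n - k. The function name and the toy data are hypothetical; a larger value indicates a better-separated partition, which is what an optimiser such as the chaotic ABC would try to maximise.

```python
def variance_ratio_criterion(points, labels):
    """VRC = (B / (k - 1)) / (W / (n - k)) for a hard partition.

    B is the between-cluster sum of squares, W the within-cluster sum of
    squares, n the number of points and k the number of clusters.
    """
    n, dim = len(points), len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("VRC needs 2 <= k < n clusters")

    def mean(pts):
        return tuple(sum(p[d] for p in pts) / len(pts) for d in range(dim))

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = sum(len(pts) * sqdist(mean(pts), overall)
                  for pts in clusters.values())
    within = sum(sqdist(p, mean(pts))
                 for pts in clusters.values() for p in pts)
    if within == 0.0:
        return float("inf")  # every point sits exactly on its centroid
    return (between / (k - 1)) / (within / (n - k))

# Two tight groups far apart along x (illustrative data only).
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = variance_ratio_criterion(pts, [0, 0, 1, 1])  # splits along x
bad = variance_ratio_criterion(pts, [0, 1, 0, 1])   # splits along y
```

    On this toy data the partition that separates the two distant groups scores far higher than the one that cuts across them.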

  6. Cluster analysis based on dimensional information with applications to feature selection and classification

    NASA Technical Reports Server (NTRS)

    Eigen, D. J.; Fromm, F. R.; Northouse, R. A.

    1974-01-01

    A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.

  7. Characterization and expression analysis of the exopolysaccharide gene cluster in Lactobacillus fermentum TDS030603.

    PubMed

    Dan, Tong; Fukuda, Kenji; Sugai-Bannai, Michiko; Takakuwa, Naoya; Motoshima, Hidemasa; Urashima, Tadasu

    2009-12-01

    Part of the exopolysaccharide gene cluster of Lactobacillus fermentum TDS030603 was characterized. It consists of 11,890 base pairs, is located in the chromosomal DNA, and encodes 13 open reading frames. Out of the 13 open reading frames, six were found to be involved in exopolysaccharide synthesis; however, five were similar to transposase genes of other lactobacilli, and two were functionally unrelated. Expression analysis revealed that the exopolysaccharide synthesis-related genes were expressed during cultivation. Southern analysis using specific primers for the exopolysaccharide genes indicated that duplication of the gene cluster did not occur. The plasmid-cured strain maintained its capacity for exopolysaccharide production, confirming that the exopolysaccharide gene cluster of this strain is located in the chromosomal DNA, similarly to thermophilic lactic acid bacteria. Our results indicate that this exopolysaccharide gene cluster is likely to be functional, although extensive gene rearrangement occurs.

  8. [FTIR study of the influence of leaf senescence on magnoliaceae cluster analysis].

    PubMed

    Li, Lun; Liu, Gang; Ou, Quan-hong; Zhang, Li; Liu, Jian-hong; Sun, Shi-zhong

    2013-09-01

    Fourier transform infrared (FTIR) spectroscopy combined with hierarchical cluster analysis (HCA) was used to study the influence of leaf senescence on the clustering of Magnoliaceae. FTIR spectra of young, mature and old yellow leaves were obtained from trees of 14 species belonging to the three Magnoliaceae subtribes. Results showed that the infrared spectra of leaves from the three subtribes were similar, with only minor differences in the absorption intensity of several peaks. Hierarchical cluster analysis was performed on the second derivative infrared spectra in the range 1800-700 cm(-1). The HCA results showed that the clustering based on mature leaves is better than that based on young and old yellow leaves. Our study suggests that leaf samples should be selected with caution when using leaf spectra for classification.

  9. Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study

    PubMed Central

    Meghani, Salimah H; Knafl, George J

    2017-01-01

    AIM To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of the cluster membership. METHODS This was a 3-mo prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out of pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. RESULTS The analyses found 4 unique clusters: Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For a subset, the main underlying concern was type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%) and type of analgesic side effects (cluster 4, 21%), respectively. About one in four made trade-offs based on multiple concerns simultaneously including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis, to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only the belief, i.e., pain medications can mask changes in health or keep you from knowing what is going on in your body was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)]. CONCLUSION Most patients appear to be driven

  10. Advantages and Limitations of Cluster Analysis in Interpreting Regional GPS Velocity Fields in California and Elsewhere

    NASA Astrophysics Data System (ADS)

    Thatcher, W. R.; Savage, J. C.; Simpson, R.

    2012-12-01

    Regional Global Positioning System (GPS) velocity observations are providing increasingly precise mappings of actively deforming continental lithosphere. Cluster analysis, a venerable data analysis method, offers a simple, visual exploratory tool for the initial organization and investigation of GPS velocities (Simpson et al., 2012 GRL). Here we describe the application of cluster analysis to GPS velocities from three regions, the Mojave Desert and the San Francisco Bay regions in California, and the Aegean in the eastern Mediterranean. Our goal is to illustrate the strengths and shortcomings of the method in searching for spatially coherent patterns of deformation, including evidence for and against block-like behavior in these 3 regions. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, is subjective and usually guided by the distribution of known faults. Cluster analysis applied to GPS velocities provides a completely objective method for identifying groups of observations ranging in size from 10s to 100s of km in characteristic dimension based solely on the similarities of their velocity vectors. In the three regions we have studied, statistically significant clusters are almost invariably spatially coherent, fault bounded, and coincide with elastic, geologically identified structural blocks. Often, higher order clusters that are not statistically significant are also spatially coherent, suggesting the existence of additional blocks, or defining regions of other tectonic importance (e.g. zones of localized elastic strain accumulation near locked faults). These results can be used to both formulate tentative tectonic models with testable consequences and to suggest focused new measurements in under-sampled regions. 
Cluster analysis applied to GPS velocities has several potential limitations, aside from the

  11. Weighing the Giants - I. Weak-lensing masses for 51 massive galaxy clusters: project overview, data analysis methods and cluster images

    NASA Astrophysics Data System (ADS)

    von der Linden, Anja; Allen, Mark T.; Applegate, Douglas E.; Kelly, Patrick L.; Allen, Steven W.; Ebeling, Harald; Burchat, Patricia R.; Burke, David L.; Donovan, David; Morris, R. Glenn; Blandford, Roger; Erben, Thomas; Mantz, Adam

    2014-03-01

    This is the first in a series of papers in which we measure accurate weak-lensing masses for 51 of the most X-ray luminous galaxy clusters known at redshifts 0.15 ≲ zCl ≲ 0.7, in order to calibrate X-ray and other mass proxies for cosmological cluster experiments. The primary aim is to improve the absolute mass calibration of cluster observables, currently the dominant systematic uncertainty for cluster count experiments. Key elements of this work are the rigorous quantification of systematic uncertainties, high-quality data reduction and photometric calibration, and the `blind' nature of the analysis to avoid confirmation bias. Our target clusters are drawn from X-ray catalogues based on the ROSAT All-Sky Survey, and provide a versatile calibration sample for many aspects of cluster cosmology. We have acquired wide-field, high-quality imaging using the Subaru Telescope and Canada-France-Hawaii Telescope for all 51 clusters, in at least three bands per cluster. For a subset of 27 clusters, we have data in at least five bands, allowing accurate photometric redshift estimates of lensed galaxies. In this paper, we describe the cluster sample and observations, and detail the processing of the SuprimeCam data to yield high-quality images suitable for robust weak-lensing shape measurements and precision photometry. For each cluster, we present wide-field three-colour optical images and maps of the weak-lensing mass distribution, the optical light distribution and the X-ray emission. These provide insights into the large-scale structure in which the clusters are embedded. We measure the offsets between X-ray flux centroids and the brightest cluster galaxies in the clusters, finding these to be small in general, with a median of 20 kpc. For offsets ≲100 kpc, weak-lensing mass measurements centred on the brightest cluster galaxies agree well with values determined relative to the X-ray centroids; miscentring is therefore not a significant source of systematic

  12. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    PubMed

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking <300 minutes per week of moderate activity was significantly greater in cluster 1 than in clusters 2 and 3. Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  13. Analysis of the histone cluster in Senegalese sole (Solea senegalensis): evidence for a divergent evolution of two canonical histone clusters.

    PubMed

    Merlo, Manuel Alejandro; Iziga, Roger; Portela, Silvia; Cross, Ismael; Kosyakova, Nadezda; Liehr, Thomas; Manchado, Manuel; Rebordinos, Laureana

    2016-12-10

    The Senegalese sole (Solea senegalensis) is commercially very important, and is a priority species for aquaculture product diversification. The main histone cluster was found in two BAC clones. However, two replacement histones (H1.0 and H3.3) were found in another BAC clone. Different types of canonical H2A and H2B have been found in one species for the first time. Phylogenetic analysis demonstrated that the different H1, H2A and H2B types were all more similar to each other than to canonical histones from other species. The canonical H3 of S. senegalensis differs from subtypes H3.1 and H3.2 in humans in the residue 96, in which a serine residue is found instead of an alanine. This same polymorphism has been found only in Danio rerio. The S. senegalensis karyotype comprises 21 pairs of chromosomes, distributed in 3 metacentric pairs, 2 submetacentric pairs, 4 subtelocentric pairs and 12 acrocentric pairs. The two BAC clones that contain the clusters of canonical histones have both been mapped in the largest metacentric pair, and mFISH confirmed the co-location with the dmrt1 gene in that pair. Three chromosome markers have been identified, which in addition to those previously described, account for 18 chromosome pairs with markers in S. senegalensis.

  14. Cluster analysis of passive air sampling data based on the relative composition of persistent organic pollutants.

    PubMed

    Liu, Xiande; Wania, Frank

    2014-03-01

    The development of passive air samplers has allowed the measurement of time-integrated concentrations of persistent organic pollutants (POPs) within spatial networks on a variety of scales. Cluster analysis of POP composition may enhance the interpretation of such spatial data. Several methodological aspects of the application of cluster analysis are discussed, including the influence of a dominant pollutant, the role of PAS duplication, and comparison of regional studies. Relying on data from six regional studies in North and South America, Africa, and Asia, we illustrate here how cluster analysis can be used to extract information and gain insights into POP sources and atmospheric transport contributions. Cluster analysis allows classification of PAS samples into those with significant local source contributions and those that represent regional fingerprints. Local emissions, atmospheric transport, and seasonal cycles are identified as being among the major factors determining the variation in POP composition at many sites. By complementing cluster analysis with meteorological data such as air mass back-trajectories, terrain, as well as geographical and socio-economic aspects, a comprehensive picture of the atmospheric contamination of a region by POPs emerges.

  15. Analysis of cluster explosive synchronization in complex networks.

    PubMed

    Ji, Peng; Peron, Thomas K D M; Rodrigues, Francisco A; Kurths, Jürgen

    2014-12-01

    Correlations between intrinsic dynamics and local topology have become a new trend in the study of synchronization in complex networks. In this paper, we investigate the influence of topology on the dynamics of networks made up of second-order Kuramoto oscillators. In particular, based on mean-field calculations, we provide a detailed investigation of cluster explosive synchronization (CES) [Phys. Rev. Lett. 110, 218701 (2013)] in scale-free networks as a function of several topological properties. Moreover, we investigate the robustness of discontinuous transitions by including an additional quenched disorder, and we show that the phase coherence decreases with increasing strength of the quenched disorder. These results complement the previous findings regarding CES and also fundamentally deepen the understanding of the interplay between topology and dynamics under the constraint of correlating natural frequencies and local structure.

  16. Variable number of tandem repeats and pulsed-field gel electrophoresis cluster analysis of enterohemorrhagic Escherichia coli serovar O157 strains.

    PubMed

    Yokoyama, Eiji; Uchimura, Masako

    2007-11-01

    Ninety-five enterohemorrhagic Escherichia coli serovar O157 strains, including 30 strains isolated from 13 intrafamily outbreaks and 14 strains isolated from 3 mass outbreaks, were studied by pulsed-field gel electrophoresis (PFGE) and variable number of tandem repeats (VNTR) typing, and the resulting data were subjected to cluster analysis. Cluster analysis of the VNTR typing data revealed that 57 (60.0%) of 95 strains, including all epidemiologically linked strains, formed clusters with at least 95% similarity. Cluster analysis of the PFGE patterns revealed that 67 (70.5%) of 95 strains, including all but 1 of the epidemiologically linked strains, formed clusters with 90% similarity. The number of epidemiologically unlinked strains forming clusters was significantly less by VNTR cluster analysis than by PFGE cluster analysis. The congruence value between PFGE and VNTR cluster analysis was low and did not show an obvious correlation. With two-step cluster analysis, the number of clustered epidemiologically unlinked strains by PFGE cluster analysis that were divided by subsequent VNTR cluster analysis was significantly higher than the number by VNTR cluster analysis that were divided by subsequent PFGE cluster analysis. These results indicate that VNTR cluster analysis is more efficient than PFGE cluster analysis as an epidemiological tool to trace the transmission of enterohemorrhagic E. coli O157.

  17. Analysis of the Tribolium homeotic complex: insights into mechanisms constraining insect Hox clusters

    PubMed Central

    Ronshaugen, Matthew; Cande, Jessica; He, JianPing; Beeman, Richard W.; Levine, Michael; Brown, Susan J.; Denell, Robin E.

    2008-01-01

    The remarkable conservation of Hox clusters is an accepted but little understood principle of biology. Some organizational constraints have been identified for vertebrate Hox clusters, but most of these are thought to be recent innovations that may not apply to other organisms. Ironically, many model organisms have disrupted Hox clusters and may not be well-suited for studies of structural constraints. In contrast, the red flour beetle, Tribolium castaneum, which has a long history in Hox gene research, is thought to have a more ancestral-type Hox cluster organization. Here, we demonstrate that the Tribolium homeotic complex (HOMC) is indeed intact, with the individual Hox genes in the expected colinear arrangement and transcribed from the same strand. There is no evidence that the cluster has been invaded by non-Hox protein-coding genes, although expressed sequence tag and genome tiling data suggest that noncoding transcripts are prevalent. Finally, our analysis of several mutations affecting the Tribolium HOMC suggests that intermingling of enhancer elements with neighboring transcription units may constrain the structure of at least one region of the Tribolium cluster. This work lays a foundation for future studies of the Tribolium HOMC that may provide insights into the reasons for Hox cluster conservation. PMID:18392875

  18. A landscape-based cluster analysis using recursive search instead of a threshold parameter.

    PubMed

    Gladwin, Thomas E; Vink, Matthijs; Mars, Roger B

    2016-01-01

    Cluster-based analysis methods in neuroimaging provide control of whole-brain false positive rates without the need to conservatively correct for the number of voxels and the associated false negative results. The current method defines clusters based purely on shapes in the landscape of activation, instead of requiring the choice of a statistical threshold that may strongly affect results. Statistical significance is determined using permutation testing, combining both size and height of activation. A method is proposed for dealing with relatively small local peaks. Simulations confirm the method controls the false positive rate and correctly identifies regions of activation. The method is also illustrated using real data. • A landscape-based method to define clusters in neuroimaging data avoids the need to pre-specify a threshold to define clusters. • The implementation of the method works as expected, based on simulated and real data. • The recursive method used for defining clusters, the method used for combining clusters, and the definition of the "value" of a cluster may be of interest for future variations.

  19. Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis.

    PubMed

    Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai

    2015-08-01

    This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction.

  20. Cluster analysis for the probability of DSB site induced by electron tracks

    NASA Astrophysics Data System (ADS)

    Yoshii, Y.; Sasaki, K.; Matsuya, Y.; Date, H.

    2015-05-01

    To clarify the influence of ionizing radiation on exposed biological cells, the densely populated pattern of ionization in the cell nucleus is important because it governs the extent of DNA damage, which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort the events into groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that clustered events of this size have a high probability of causing lesions in DNA within the chromatin fiber site.
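
    The DBSCAN rule described above (a minimum number of neighbouring events within a given radius) can be sketched in a few lines. This is a plain, brute-force version of the standard algorithm, not the authors' code; the function name, the parameter values, and the toy event coordinates are assumptions made for the example.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    A point is a core point when at least min_pts points (itself included)
    lie within distance eps of it; clusters grow outward from core points.
    """
    NOISE = -1
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force radius query; fine for small illustrative inputs.
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps * eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point: keep expanding
    return labels

# Hypothetical event positions (arbitrary length units), illustration only.
events = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),
          (20, 20, 20), (21, 20, 20), (20, 21, 20),
          (50, 50, 50)]
print(dbscan(events, eps=2.0, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, -1]
```

    The isolated event is labelled as noise, which is the property that makes a density-based method a natural fit for separating dense ionization sites from sparse ones.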

  1. The distinction of 'psychosomatogenic family types' based on parents' self reported questionnaire information: a cluster analysis.

    PubMed

    Rousseau, Sofie; Grietens, Hans; Vanderfaeillie, Johan; Ceulemans, Eva; Hoppenbrouwers, Karel; Desoete, Annemie; Van Leeuwen, Karla

    2014-06-01

    The theory of 'psychosomatogenic family types' is often used in treatment of somatizing adolescents. This study investigated the validity of distinguishing 'psychosomatogenic family types' based on parents' self-reported family features. The study included a Flemish general population sample of 12-year olds (n = 1428). We performed cluster analysis on 3 variables concerning parents' self-reported problems in family functioning. The distinguished clusters were examined for differences in marital problems, parental emotional problems, professional help for family members, demographics, and adolescents' somatization. Results showed the existence of 5 family types: 'chaotic family functioning,' 'average amount of family functioning problems,' 'few family functioning problems,' 'high amount of support and communication problems,' and 'high amount of sense of security problems' clusters. Membership of the 'chaotic family functioning' and 'average amount of family functioning problems' cluster was significantly associated with higher levels of somatization, compared with 'few family functioning problems' cluster membership. Among additional variables, only marital and parental emotional problems distinguished somatization relevant from non relevant clusters: parents in 'average amount of family functioning problems' and 'chaotic family functioning' clusters reported higher problems. The data showed that 'apparently perfect' or 'enmeshed' patterns of family functioning may not be assessed by means of parent report as adopted in this study. In addition, not only adolescents from 'extreme' types of family functioning may suffer from somatization. Further, professionals should be careful assuming that families in which parents report average to high amounts of family functioning problems also show different demographic characteristics.

  2. Structural Parameters of M81 Globular Clusters: Analysis of their Intensity Profile

    NASA Astrophysics Data System (ADS)

    Santiago-Cortés, M.; Mayya, Y. D.; Rosa-González, D.

    2014-09-01

    We present here an analysis of the surface brightness profiles on the Hubble Space Telescope (HST) F435W and F814W images for 110 Globular Clusters (GCs) in M81. The structural parameters for each of these clusters were obtained by fitting a King model to the observed profiles. The profiles are well-fitted by the King model in the majority of the GCs. We used these structural parameters to classify the GCs based on their halo and core properties. Based on the physical extent of the halo, measured as the isophotal radius at μ_I = 24 mag/arcsec^2 , we divided the clusters into two groups — compact and classical. By analyzing the core properties, we found 7 cuspy clusters, with properties similar to the cuspy clusters found in the Milky Way. In addition, we found 2 clusters that have a blue excess in the core, similar to the brightest GC in M81. We show that all clusters at galactocentric distance less than 4 kpc are tidally limited in M81.
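
    The abstract does not state which King model was fitted. Assuming the widely used empirical King (1962) surface-brightness profile, the fitted function would take the form below, where the central scale k, the core radius r_c, and the tidal radius r_t are the structural parameters, and c is the usual concentration parameter. The symbols are the conventional ones and are not taken from the paper.

```latex
I(r) = k \left[ \frac{1}{\sqrt{1 + (r/r_c)^2}}
              - \frac{1}{\sqrt{1 + (r_t/r_c)^2}} \right]^2 ,
\qquad 0 \le r \le r_t ,
\qquad c = \log_{10}\!\left( \frac{r_t}{r_c} \right)
```

    Under this form the profile falls to zero at r = r_t, which is why a fitted tidal radius can be used to judge whether a cluster is tidally limited.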

  3. Standardized Effect Size Measures for Mediation Analysis in Cluster-Randomized Trials

    ERIC Educational Resources Information Center

    Stapleton, Laura M.; Pituch, Keenan A.; Dion, Eric

    2015-01-01

    This article presents 3 standardized effect size measures to use when sharing results of an analysis of mediation of treatment effects for cluster-randomized trials. The authors discuss 3 examples of mediation analysis (upper-level mediation, cross-level mediation, and cross-level mediation with a contextual effect) with demonstration of the…

  4. Task Analysis for Health Occupations. Cluster: Nursing. Occupation: Home Health Aide. Education for Employment Task Lists.

    ERIC Educational Resources Information Center

    Lake County Area Vocational Center, Grayslake, IL.

    This document contains a task analysis for health occupations (home health aide) in the nursing cluster. For each task listed, occupation, duty area, performance standard, steps, knowledge, attitudes, safety, equipment/supplies, source of analysis, and Illinois state goals for learning are listed. For the duty area of "providing therapeutic…

  5. Task Analysis for Health Occupations. Cluster: Dental Assisting. Occupation: Dental Assistant. Education for Employment Task Lists.

    ERIC Educational Resources Information Center

    Lathrop, Janice

    This document contains a task analysis for health occupations (dental assistant) in the dental assisting cluster. For each task listed, occupation, duty area, performance standard, steps, knowledge, attitudes, safety, equipment/supplies, source of analysis, and Illinois state goals for learning are listed. For the duty area of "providing…

  6. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.

  7. 3D Plasma Clusters: Analysis of dynamical evolution and individual particle interaction

    SciTech Connect

    Antonova, T.; Thomas, H. M.; Morfill, G. E.; Annaratone, B. M.

    2008-09-07

    3D plasma clusters (up to 100 particles) have been built inside a small (32 mm^3) plasma volume under gravity. It has been estimated that the external confinement has a negligible influence on the processes inside the clusters. Under these conditions, analysis of the dynamical evolution and of individual particle interactions has shown that the binary interaction among particles exhibits an attractive part in addition to the repelling Coulomb force. A tendency of the systems to approach the state of minimum energy by rearranging particles inside has been detected. The measured vibrations of a 63-particle cluster are in close agreement with the vibrations of a drop with surface tension. This indicates that even a 63-particle cluster already exhibits properties normally associated with the cooperative regime.

  8. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters

    PubMed Central

    Cimermancic, Peter; Medema, Marnix H.; Claesen, Jan; Kurita, Kenji; Wieland Brown, Laura C.; Mavrommatis, Konstantinos; Pati, Amrita; Godfrey, Paul A.; Koehrsen, Michael; Clardy, Jon; Birren, Bruce W.; Takano, Eriko; Sali, Andrej; Linington, Roger G.; Fischbach, Michael A.

    2014-01-01

    Summary: Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the predicted BGCs revealed large gene cluster families, the vast majority uncharacterized. We experimentally characterized the most prominent family, consisting of two subfamilies of hundreds of BGCs distributed throughout the Proteobacteria; their products are aryl polyenes, lipids with an aryl head group conjugated to a polyene tail. We identified a distant relationship to a third subfamily of aryl polyene BGCs, and together the three subfamilies represent the largest known family of biosynthetic gene clusters, with more than 1,000 members. Although these clusters are widely divergent in sequence, their small molecule products are remarkably conserved, indicating for the first time the important roles these compounds play in Gram-negative cell biology. PMID:25036635

  9. Functional Interference Clusters in Cancer Patients With Bone Metastases: A Secondary Analysis of RTOG 9714

    SciTech Connect

    Chow, Edward; James, Jennifer; Barsevick, Andrea; Hartsell, William; Ratcliffe, Sarah; Scarantino, Charles; Ivker, Robert; Roach, Mack; Suh, John; Petersen, Ivy; Konski, Andre; Demas, William; Bruner, Deborah

    2010-04-15

    Purpose: To explore the relationships (clusters) among the functional interference items in the Brief Pain Inventory (BPI) in patients with bone metastases. Methods: Patients enrolled in the Radiation Therapy Oncology Group (RTOG) 9714 bone metastases study were eligible. Patients were assessed at baseline and 4, 8, and 12 weeks after randomization for the palliative radiotherapy with the BPI, which consists of seven functional items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. Principal component analysis with varimax rotation was used to determine the clusters between the functional items at baseline and the follow-up. Cronbach's alpha was used to determine the consistency and reliability of each cluster at baseline and follow-up. Results: There were 448 male and 461 female patients, with a median age of 67 years. There were two functional interference clusters at baseline, which accounted for 71% of the total variance. The first cluster (physical interference) included normal work and walking ability, which accounted for 58% of the total variance. The second cluster (psychosocial interference) included relations with others and sleep, which accounted for 13% of the total variance. The Cronbach's alpha statistics were 0.83 and 0.80, respectively. The functional clusters changed at week 12 in responders but persisted through week 12 in nonresponders. Conclusion: Palliative radiotherapy is effective in reducing bone pain. Functional interference component clusters exist in patients treated for bone metastases. These clusters changed over time in this study, possibly attributable to treatment. Further research is needed to examine these effects.
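
    The reliability statistic reported in the abstract above can be illustrated with a minimal sketch. It computes Cronbach's alpha from the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total scores), where k is the number of items in a cluster. The function name `cronbach_alpha` and the example scores are hypothetical and are not taken from the RTOG 9714 data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` is a list of score lists, one list per questionnaire item,
    each holding one score per respondent (same respondent order).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item needs one score per respondent")

    # Total score of each respondent across all items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / total_var)


# Made-up scores for two items and five respondents (illustration only).
normal_work = [2, 5, 7, 8, 4]
walking_ability = [3, 5, 6, 9, 4]
alpha = cronbach_alpha([normal_work, walking_ability])
```

    Values close to 1 indicate that the items in a cluster vary together; two identical items give exactly 1, and two uncorrelated items give a value near 0.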

  10. Preliminary Cluster Analysis For Several Representatives Of Genus Kerivoula (Chiroptera: Vespertilionidae) in Borneo

    NASA Astrophysics Data System (ADS)

    Hasan, Noor Haliza; Abdullah, M. T.

    2008-01-01

    The aim of the study is to use cluster analysis on morphometric parameters within the genus Kerivoula to produce a dendrogram and to determine the suitability of this method to describe the relationship among species within this genus. A total of 15 adult male individuals from genus Kerivoula taken from sampling trips around Borneo and specimens kept at the zoological museum of Universiti Malaysia Sarawak were examined. A total of 27 characters using dental, skull and external body measurements were recorded. Clustering analysis illustrated the grouping and morphometric relationships between the species of this genus. It clearly separated the species from one another despite overlapping measurements among some species within the genus. Cluster analysis provides an alternative approach for making a preliminary identification of a species.

  11. Dual mode use requirements analysis for the institutional cluster.

    SciTech Connect

    Leland, Robert W.

    2003-09-01

    This paper analyzes what additional costs would be incurred in supporting dual-mode, i.e. both classified and unclassified use of the Institutional Computing (IC) hardware. The following five options are considered: periods processing in which a fraction of the system alternates in time between classified and unclassified modes, static split in which the system is constructed as a set of smaller clusters which remain in one mode or the other, re-configurable split in which the system is constructed in a split fashion but a mechanism is provided to reconfigure it very infrequently, red/black switching in which a mechanism is provided to switch sections of the system between modes frequently, and complementary operation in which parts of the system are operated entirely in one mode at one geographical site and entirely in the other mode at the other geographical site and other systems are repartitioned to balance work load. These options are evaluated against eleven criteria such as disk storage costs, distance computing costs, reductions in capability and capacity as a result of various factors etc. The evaluation is both qualitative and quantitative, and is captured in various summary tables.

  12. Analysis of cardiac tissue by gold cluster ion bombardment

    NASA Astrophysics Data System (ADS)

    Aranyosiova, M.; Chorvatova, A.; Chorvat, D.; Biro, Cs.; Velic, D.

    2006-07-01

    Specific molecules in cardiac tissue of spontaneously hypertensive rats are studied by using time-of-flight secondary ion mass spectrometry (TOF-SIMS). The investigation determines phospholipids, cholesterol, fatty acids and their fragments in the cardiac tissue, with special focus on cardiolipin. Cardiolipin is a unique phospholipid typical of the cardiomyocyte mitochondrial membrane, and its decrease is involved in pathologic conditions. In the positive polarity, the fragments of phosphatidylcholine are observed in the mass region of 700-850 u. Peaks over mass 1400 u correspond to intact and cationized molecules of cardiolipin. In animal tissue, cardiolipin contains almost exclusively 18-carbon fatty acids, mostly linoleic acid. Linoleic acid at 279 u, other fatty acids, and phosphatidylglycerol fragments, as precursors of cardiolipin synthesis, are identified in the negative polarity. These data demonstrate that the SIMS technique with an Au_3^+ cluster primary ion beam is a good tool for detection of higher mass biomolecules, providing approximately 10 times higher yield in comparison with Au^+.

  13. Differentiating Procrastinators from Each Other: A Cluster Analysis.

    PubMed

    Rozental, Alexander; Forsell, Erik; Svensson, Andreas; Forsström, David; Andersson, Gerhard; Carlbring, Per

    2015-01-01

    Procrastination refers to the tendency to postpone the initiation and completion of a given course of action. Approximately one-fifth of the adult population and half of the student population perceive themselves as being severe and chronic procrastinators. Albeit not a psychiatric diagnosis, procrastination has been shown to be associated with increased stress and anxiety, exacerbation of illness, and poorer performance in school and work. However, despite being severely debilitating, little is known about the population of procrastinators in terms of possible subgroups, and previous research has mainly investigated procrastination among university students. The current study examined data from a screening process recruiting participants to a randomized controlled trial of Internet-based cognitive behavior therapy for procrastination (Rozental et al., in press). In total, 710 treatment-seeking individuals completed self-report measures of procrastination, depression, anxiety, and quality of life. The results suggest that there might exist five separate subgroups, or clusters, of procrastinators: "Mild procrastinators" (24.93%), "Average procrastinators" (27.89%), "Well-adjusted procrastinators" (13.94%), "Severe procrastinators" (21.69%), and "Primarily depressed" (11.55%). Hence, there seem to be marked differences among procrastinators in terms of levels of severity, as well as a possible subgroup for which procrastinatory problems are primarily related to depression. Tailoring the treatment interventions to the specific procrastination profile of the individual could thus become important, as well as screening for comorbid psychiatric diagnoses in order to target difficulties associated with, for instance, depression.

  14. Clustering of ant communities and indicator species analysis using self-organizing maps.

    PubMed

    Park, Sang-Hyun; Hosoishi, Shingo; Ogata, Kazuo; Kuboki, Yuzuru

    2014-09-01

    To understand the complex relationships that exist between ant assemblages and their habitats, we performed a self-organizing map (SOM) analysis to clarify the interactions among ant diversity, spatial distribution, and land use types in Fukuoka City, Japan. A total of 52 species from 12 study sites with nine land use types were collected from 1998 to 2012. A SOM was used to classify the collected data into three clusters based on the similarities between the ant communities. Consequently, each cluster reflected both the species composition and habitat characteristics in the study area. A detrended correspondence analysis (DCA) corroborated these findings, but removal of unique and duplicate species from the dataset in order to avoid sampling errors had a marked effect on the results; specifically, the clusters produced by DCA before and after the exclusion of specific data points were very different, while the clusters produced by the SOM were consistent. In addition, while the indicator value associated with SOMs clearly illustrated the importance of individual species in each cluster, the DCA scatterplot generated for species was not clear. The results suggested that SOM analysis was better suited for understanding the relationships between ant communities and species and habitat characteristics.

  15. Tracking undergraduate student achievement in a first-year physiology course using a cluster analysis approach.

    PubMed

    Brown, S J; White, S; Power, N

    2015-12-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group students into high-achieving (HA) and low-achieving (LA) clusters and to determine the ability of each summative assessment task to discriminate between HA and LA students. The two clusters identified in each semester were described as HA (n = 42) and LA (n = 115) in semester 1 (HA1 and LA1, respectively) and HA (n = 91) and LA (n = 42) in semester 2 (HA2 and LA2, respectively). In both semesters, HA and LA means for all inputs were different (all P < 0.001). Nineteen students moved from the HA1 group into the LA2 group, whereas 68 students moved from the LA1 group into the HA2 group. The overall order of importance of inputs that determined group membership was different in semester 1 compared with semester 2; in addition, the within-cluster order of importance in LA groups was different compared with HA groups. This method of analysis may 1) identify students who need extra instruction, 2) identify which assessment is more effective in discriminating between HA and LA students, and 3) provide quantitative evidence to track student achievement.

  16. Comprehensive behavioral analysis of cluster of differentiation 47 knockout mice.

    PubMed

    Koshimizu, Hisatsugu; Takao, Keizo; Matozaki, Takashi; Ohnishi, Hiroshi; Miyakawa, Tsuyoshi

    2014-01-01

    Cluster of differentiation 47 (CD47) is a member of the immunoglobulin superfamily which functions as a ligand for the extracellular region of signal regulatory protein α (SIRPα), a protein which is abundantly expressed in the brain. Previous studies, including ours, have demonstrated that both CD47 and SIRPα fulfill various functions in the central nervous system (CNS), such as the modulation of synaptic transmission and neuronal cell survival. We previously reported that CD47 is involved in the regulation of depression-like behavior of mice in the forced swim test through its modulation of tyrosine phosphorylation of SIRPα. However, other potential behavioral functions of CD47 remain largely unknown. In this study, in an effort to further investigate functional roles of CD47 in the CNS, CD47 knockout (KO) mice and their wild-type littermates were subjected to a battery of behavioral tests. CD47 KO mice displayed decreased prepulse inhibition, while the startle response did not differ between genotypes. The mutants exhibited slightly but significantly decreased sociability and social novelty preference in Crawley's three-chamber social approach test, whereas in social interaction tests in which experimental and stimulus mice have direct contact with each other in a freely moving setting in a novel environment or home cage, there were no significant differences between the genotypes. While previous studies suggested that CD47 regulates fear memory in the inhibitory avoidance test in rodents, our CD47 KO mice exhibited normal fear and spatial memory in the fear conditioning and the Barnes maze tests, respectively. These findings suggest that CD47 is potentially involved in the regulation of sensorimotor gating and social behavior in mice.

  17. A cluster analysis to investigating nurses' knowledge, attitudes, and skills regarding the clinical management system.

    PubMed

    Chan, M F

    2007-01-01

    Nurses' knowledge, attitudes, and skills regarding the Clinical Management System are explored by identifying profiles of nurses working in Hong Kong. A total of 282 nurses from four hospitals completed a self-reported questionnaire during the period from December 2004 to May 2005. Two-step cluster analysis yielded two clusters. The first cluster (n = 159, 56.4%) was labeled "negative attitudes, less skillful, and average knowledge" group. The second cluster (n = 123, 43.6%) was labeled "positive attitudes, good knowledge, but less skillful." There was a positive correlation in cluster 1 for nurses' knowledge and attitudes (rs = 0.28) and in cluster 2 for nurses' skills and attitudes (rs = 0.25) toward computerization. The study showed that senior and more highly educated nurses generally held more positive attitudes to computerization, whereas the attitudes among younger and less well educated nurses generally were more negative. Such findings should be used to formulate strategies to encourage nurses to resolve actual problems following computer training and to increase the depth and breadth of nurses' computer knowledge and skills and improve their attitudes toward computerization.

  18. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    PubMed Central

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  19. Adults' Physical Activity Patterns across Life Domains: Cluster Analysis with Replication

    PubMed Central

    Rovniak, Liza S.; Sallis, James F.; Saelens, Brian E.; Frank, Lawrence D.; Marshall, Simon J.; Norman, Gregory J.; Conway, Terry L.; Cain, Kelli L.; Hovell, Melbourne F.

    2010-01-01

    Objective: Identifying adults' physical activity patterns across multiple life domains could inform the design of interventions and policies. Design: Cluster analysis was conducted with adults in two US regions (Baltimore-Washington DC, n = 702; Seattle-King County, n = 987) to identify different physical activity patterns based on adults' reported physical activity across four life domains: leisure, occupation, transport, and home. Objectively measured physical activity, and psychosocial and built (physical) environment characteristics of activity patterns were examined. Main Outcome Measures: Accelerometer-measured activity, reported domain-specific activity, psychosocial characteristics, built environment, body mass index (BMI). Results: Three clusters replicated (kappa = .90-.93) across both regions: Low Activity, Active Leisure, and Active Job. The Low Activity and Active Leisure adults were demographically similar, but Active Leisure adults had the highest psychosocial and built environment support for activity, highest accelerometer-measured activity, and lowest BMI. Compared to the other clusters, the Active Job cluster had lower socioeconomic status and intermediate accelerometer-measured activity. Conclusion: Adults can be clustered into groups based on their patterns of accumulating physical activity across life domains. Differences in psychosocial and built environment support between the identified clusters suggest that tailored interventions for different subgroups may be beneficial. PMID:20836604

  20. Functional analysis of the upstream regulatory region of chicken miR-17-92 cluster.

    PubMed

    Min, Cheng; Wenjian, Zhang; Tianyu, Xing; Xiaohong, Yan; Yumao, Li; Hui, Li; Ning, Wang

    2016-08-01

    The miR-17-92 cluster plays important roles in cell proliferation, differentiation, apoptosis, animal development and tumorigenesis. The transcriptional regulation of the miR-17-92 cluster has been extensively studied in mammals, but not in birds. To date, the genomic structure of the avian miR-17-92 cluster has not been fully determined. The promoter location and sequence of the miR-17-92 cluster have not been determined, due to the existence of a genomic gap sequence upstream of the miR-17-92 cluster in all the birds whose genomes have been sequenced. In this study, genome walking was used to close the genomic gap upstream of the chicken miR-17-92 cluster. In addition, bioinformatics analysis, reporter gene assay and truncation mutagenesis were used to investigate the functional role of the genomic gap sequence. Genome walking analysis showed that the gap region was 1704 bp long, and its GC content was 80.11%. Bioinformatics analysis showed that in the gap region, there was a 200 bp conserved sequence among the tested 10 species (Gallus gallus, Homo sapiens, Pan troglodytes, Bos taurus, Sus scrofa, Rattus norvegicus, Mus musculus, Possum, Danio rerio, Rana nigromaculata), which is the core promoter region of the mammalian miR-17-92 host gene (MIR17HG). A promoter luciferase reporter gene vector of the gap region was constructed and a reporter assay was performed. The result showed that the promoter activity of pGL3-cMIR17HG (-4228/-2506) was 417 times that of the negative control (empty pGL3 basic vector), suggesting that the chicken miR-17-92 cluster promoter exists in the gap region. To gain further insight into the promoter structure, two different truncations of the cloned gap sequence were generated by PCR. One had a truncation of 448 bp at the 5'-end and the other had a truncation of 894 bp at the 3'-end. Further reporter analysis showed that compared with the promoter activity of pGL3-cMIR17HG (-4228/-2506), the reporter activities of the 5'-end truncation and the 3'-end truncation were reduced by 19

  1. Cluster Analysis of Velocity Field Derived from Dense GNSS Network of Japan

    NASA Astrophysics Data System (ADS)

    Takahashi, A.; Hashimoto, M.

    2015-12-01

    Dense GNSS networks have been widely used to observe crustal deformation. Simpson et al. (2012) and Savage and Simpson (2013) conducted cluster analyses of the GNSS velocity field in the San Francisco Bay Area and the Mojave Desert, respectively. They successfully found velocity discontinuities and showed the advantage of cluster analysis for classifying a GNSS velocity field. Because strike-slip events are dominant in the western United States, the geometry there is simple. The Japanese Islands, however, are tectonically complicated due to subduction of oceanic plates, and exhibit many types of crustal deformation, such as slow slip events and large postseismic deformation. We propose a modified clustering method for the GNSS velocity field in Japan that separates time-variant from static crustal deformation. Our modification is to perform cluster analysis every several months or years and then evaluate the similarity of cluster membership. If a GNSS station moves differently from its neighboring stations, it will not belong to the cluster that includes those surrounding stations. With this method, time-variant phenomena were distinguished. We applied our method to GNSS data from Japan from 1996 to 2015 and derived the following conclusions. First, the cluster boundaries are consistent with known active faults. For example, the Arima-Takatsuki-Hanaore fault system and the Shimane-Tottori segment proposed by Nishimura (2015) are recognized without using prior information. Second, the detectability of time-variable phenomena is improved, such as the slow slip event in the northern part of the Hokkaido region detected by Ohzono et al. (2015). Third, postseismic deformation caused by large earthquakes can be classified. The result suggests velocity discontinuities in the postseismic deformation of the Tohoku-oki earthquake, which implies that postseismic deformation does not decay continuously in proportion to distance from the epicenter.

  2. Cluster Analysis of Indonesian Province Based on Household Primary Cooking Fuel Using K-Means

    NASA Astrophysics Data System (ADS)

    Huda, S. N.

    2017-03-01

    Every household has some installation for cooking. Kerosene, which is refined from petroleum, once dominated as the primary cooking fuel in Indonesia, even though it is expensive and inefficient. Other households use LPG as their primary cooking fuel, but LPG supply is also limited. In addition, the very diverse environments and cultures in Indonesia have led to a diversity of cooking installations, such as a wood-burning stove brazier. The government is also promoting alternative fuels, such as charcoal briquettes and fuel from biomass. The use of other fuels is part of an energy diversification effort that is expected to reduce community dependence on petroleum-based fuels. The variation in cooking fuels from one region to another reflects the distribution of basic fuel use by households. By knowing the characteristics of each province, the government can adopt policies appropriate to each province's character. A cluster analysis of all provinces in Indonesia based on the type of primary household cooking fuel would therefore be valuable. Cluster analysis is done using the K-Means method with K ranging from 2 to 5. Cluster results are validated using the Silhouette Coefficient (SC). The results show that the highest SC is achieved with K = 2, with an SC value of 0.39135818388151. The two clusters divide the provinces of Indonesia into a cluster of more traditional provinces and a cluster of more modern provinces. The cluster results are then shown on a map using the Google Maps API.
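
    The procedure in the abstract above (run K-Means for K = 2 to 5 and keep the K with the highest silhouette coefficient) can be sketched in plain Python. This is an illustration under stated assumptions, not the author's implementation: the abstract does not say how the centers were initialised, so the deterministic farthest-point seeding here is an assumption, and the function names and the two-feature example points are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at points[0], then keep
    adding the point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster lost all of its members.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members)) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (singletons score 0)."""
    if len(set(labels)) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.get(labels[i])
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for lab, d in by_cluster.items() if lab != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def select_k(points, ks=range(2, 6)):
    """Return the K with the highest silhouette, plus all scores by K."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in ks}
    return max(scores, key=scores.get), scores


# Made-up feature vectors standing in for per-province fuel shares.
provinces = [(0, 0), (1, 0), (0, 2), (1, 1), (10, 10), (11, 10), (10, 12), (12, 11)]
best_k, scores_by_k = select_k(provinces)
```

    On real data a library implementation with several random restarts would usually be preferable; the deterministic seeding is used here only to keep the sketch reproducible.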

  3. Indentifying the major air pollutants base on factor and cluster analysis, a case study in 74 Chinese cities

    NASA Astrophysics Data System (ADS)

    Zhang, Jing; Zhang, Lan-yue; Du, Ming; Zhang, Wei; Huang, Xin; Zhang, Ya-qi; Yang, Yue-yi; Zhang, Jian-min; Deng, Shi-huai; Shen, Fei; Li, Yuan-wei; Xiao, Hong

    2016-11-01

    This article investigates the major air pollutants and their spatial and seasonal distribution in 74 Chinese cities. Factor analysis and cluster analysis are employed to identify the major factors of air pollutants. The following results are obtained. (1) Major factors are obtained for spring, summer, autumn, and winter. The first factor in spring includes NO2, PM10, CO, and PM2.5; the first factor in summer and autumn includes PM10, PM2.5, CO, and SO2; in winter, the first factor includes NO2, PM10, PM2.5, and SO2. (2) In spring, cities of cluster 5 are the most severely polluted by emission sources of SO2, CO, PM10, and PM2.5; the emission sources of O3 would significantly influence the air quality in cities of cluster 2; the emission sources of NO2 could significantly influence the air quality in cities of cluster 3 and cluster 5. (3) In summer, cities of cluster 5 are the most severely polluted by automotive emissions and coal flue gas. Cities of cluster 1 are the least polluted. Cities of cluster 3 and cluster 2 are polluted by emission sources of SO2 and O3. (4) In autumn, cities of clusters 3 and 4 are the most severely polluted by the emission sources of SO2, CO, PM10, and PM2.5; the emission sources of NO2 would significantly influence the air quality in cities of cluster 5; the emission sources of O3 could significantly influence the air quality in cities of cluster 1 and cluster 4. (5) In winter, cities of cluster 5 are the most severely polluted by the emission sources of SO2, CO, PM10, PM2.5, and CO; the emission sources of O3 could significantly influence the air quality in cities of cluster 1 and cluster 5.

  4. Field of Study Choice: Using Conjoint Analysis and Clustering

    ERIC Educational Resources Information Center

    Shtudiner, Ze'ev; Zwilling, Moti; Kantor, Jeffrey

    2017-01-01

    Purpose: The purpose of this paper is to measure students' preferences regarding various attributes that affect their decision process while choosing a higher education area of study. Design/Methodology/Approach: The paper exhibits two different models which shed light on the perceived value of each examined area of study: conjoint analysis and…

  5. Analysis of traffic accidents on rural highways using Latent Class Clustering and Bayesian Networks.

    PubMed

    de Oña, Juan; López, Griselda; Mujalli, Randa; Calvo, Francisco J

    2013-03-01

    One of the principal objectives of traffic accident analyses is to identify key factors that affect the severity of an accident. However, heterogeneity in the raw data makes the analysis of traffic accidents difficult. In this paper, Latent Class Cluster (LCC) is used as a preliminary tool for segmentation of 3229 accidents on rural highways in Granada (Spain) between 2005 and 2008. Next, Bayesian Networks (BNs) are used to identify the main factors involved in accident severity for both the entire database (EDB) and the clusters previously obtained by LCC. The results of these cluster-based analyses are compared with the results of a full-data analysis. The results show that the combined use of both techniques is useful, as it reveals further information that would not have been obtained without prior segmentation of the data. BN inference is used to obtain the variables that best identify accidents in which people were killed or seriously injured. Accident type and sight distance were identified in all the cases analysed; other variables, such as time, occupant involved or age, are identified in the EDB and in only one cluster; and the variables vehicles involved, number of injuries, atmospheric factors, pavement markings and pavement width are identified in only one cluster.

  6. Gene microarray data analysis using parallel point-symmetry-based clustering.

    PubMed

    Sarkar, Anasua; Maulik, Ujjwal

    2015-01-01

    Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed, time-efficient, scalable approach for the point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using a symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also achieves linear speedup in timing without sacrificing the quality of the clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority in both timing and validity. A statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of the clustering solutions.

  7. Hierarchical cluster analysis for exposure assessment of workers in the Semiconductor Health Study.

    PubMed

    Hines, C J; Selvin, S; Samuels, S J; Hammond, S K; Woskie, S R; Hallock, M F; Schenker, M B

    1995-12-01

    The fabrication of integrated circuits in the semiconductor industry involves worker exposures to multiple chemical and physical agents. The potential for a high degree of correlation among exposure variables was of concern in the Semiconductor Health Study. Hierarchical cluster analysis was used to identify groups or "clusters" of correlated variables. Several variations of hierarchical cluster analysis were performed on 14 chemical and physical agents, using exposure data on 882 subjects from the historical cohort of the epidemiological studies. Similarity between agent pairs was determined by calculating two metrics of dissimilarity, and hierarchical trees were constructed using three clustering methods. Among subjects exposed to ethylene-based glycol ethers (EGE), xylene, or n-butyl acetate (nBA), 83% were exposed to EGE and xylene, 86% to EGE and nBA, and 94% to xylene and nBA, suggesting that exposures to EGE, xylene, and nBA were highly correlated. A high correlation was also found for subjects exposed to boron and phosphorus (80%). The trees also revealed cluster groups containing agents associated with work-group exposure categories developed for the epidemiologic analyses.

  8. Analysis of the clustering of inertial particles in turbulent flows

    NASA Astrophysics Data System (ADS)

    Esmaily-Moghadam, Mahdi; Mani, Ali

    2016-12-01

    An asymptotic solution is derived for the motion of inertial particles exposed to Stokes drag in an unsteady random flow. This solution provides an estimate for the sum of Lyapunov exponents as a function of the Stokes number and Lagrangian strain- and rotation-rate autocovariance functions. The sum of exponents in a Lagrangian framework is the rate of contraction of clouds of particles, and in an Eulerian framework, it is the concentration-weighted divergence of the particle velocity field. Previous literature offers an estimate of the divergence of the particle velocity field, which is applicable only in the limit of small Stokes numbers [Robinson, Comm. Pure Appl. Math. 9, 69 (1956), 10.1002/cpa.3160090105 and Maxey, J. Fluid Mech. 174, 441 (1987), 10.1017/S0022112087000193] (R-M). In addition to reproducing R-M at this limit, our analysis provides a first-order correction to R-M at larger Stokes numbers. Our analysis is validated by a directly computed rate of contraction of clouds of particles from simulations of particles in homogeneous isotropic turbulence over a broad range of Stokes numbers. Our analysis and R-M predictions agree well with the direct computations at the limit of small Stokes numbers. At large Stokes numbers, in contrast to R-M, our model predictions remain bounded. In spite of an improvement over R-M, our analysis fails to predict the expansion of high Stokes clouds observed in the direct computations. Consistent with the general trend of particle segregation versus Stokes number, our analysis shows a maximum rate of contraction at an intermediate Stokes number of O(1) and minimal rates of contraction at small and large Stokes numbers.

  9. Student Motivation and Learning in Mathematics and Science: A Cluster Analysis

    ERIC Educational Resources Information Center

    Ng, Betsy L. L.; Liu, W. C.; Wang, John C. K.

    2016-01-01

    The present study focused on an in-depth understanding of student motivation and self-regulated learning in mathematics and science through cluster analysis. It examined the different learning profiles of motivational beliefs and self-regulatory strategies in relation to perceived teacher autonomy support, basic psychological needs (i.e. autonomy,…

  10. Spectral analysis of A and F dwarf members of the open cluster M6: preliminary results

    NASA Astrophysics Data System (ADS)

    Kılıçoǧlu, T.; Monier, R.; Fossati, L.

    2010-12-01

    We present the first abundance analysis of CD-32 13109 (NGC 6405 47), member of the M6 open cluster. The photospheric abundances of 14 chemical elements were determined by comparing synthetic spectra and observed spectra of the star. Findings show that this star should be an Am star.

  11. Comparative Evaluation of Two Superior Stopping Rules for Hierarchical Cluster Analysis.

    ERIC Educational Resources Information Center

    Atlas, Robert S.; Overall, John E.

    1994-01-01

    A split-sample replication stopping rule for hierarchical cluster analysis is compared with the internal criterion previously found superior by Milligan and Cooper (1985) in their comparison of 30 different procedures. Situations under which the methods are equivalent or not equally useful are discussed. (SLD)

  12. Posterior AD-Type Pathology: Cognitive Subtypes Emerging from a Cluster Analysis

    PubMed Central

    Cappa, Antonella; Ciccarelli, Nicoletta; Silveri, Maria Caterina

    2014-01-01

    Background. “Posterior shift” of the neuropathological changes of Alzheimer's disease (AD) produces a syndrome (posterior cortical atrophy) (PCA) dominated by high-level visual deficits. Objective. To explore in patients with AD-type pathology whether a data-driven analysis (cluster analysis) based on neuropsychological findings resulted in the emergence of different subgroups of patients; in particular to find out whether it was possible to identify patients with visuospatial deficits consistent with the hypothesis that PCA is a “dorsal stream” syndrome or, rather, whether there were subgroups of patients with different types of impairment within the high-level visual domain. Methods. 23 PCA and 16 DAT patients were studied. By a principal component analysis performed on a wide range of neuropsychological tasks, 15 variables were obtained that loaded onto five main factors (memory, language, perceptual, visuospatial, and calculation) which entered a hierarchical cluster analysis. Results. Four clusters of cognitive impairment emerged: visuospatial/perceptual, memory, perceptual/calculation, and language. Only in the first cluster a visuospatial deficit clearly emerged. Conclusions. AD pathology produces not only variants dominated by memory (DAT) and, to a lesser extent, visuospatial deficit (PCA), but also other distinct syndromic subtypes with disorders in visual perception and language which reflect a different vulnerability of specific functional networks. PMID:24994944

  13. [Current service invention patents and growth pathways on basis of cluster analysis].

    PubMed

    Yang, Xu-jie; Xiao, Shi-ying

    2012-09-01

    This study aims for enhancing quantity and quality of patents of traditional Chinese medicine compounds of traditional Chinese medicine enterprises, traditional Chinese medicine colleges and relevant institutions while building an efficient pathway for patent protection using simple statistics and cluster analysis, with service invention patent holders of traditional Chinese medicine compounds as the study object.

  14. Clustered Stomates in "Begonia": An Exercise in Data Collection & Statistical Analysis of Biological Space

    ERIC Educational Resources Information Center

    Lau, Joann M.; Korn, Robert W.

    2007-01-01

    In this article, the authors present a laboratory exercise in data collection and statistical analysis in biological space using clustered stomates on leaves of "Begonia" plants. The exercise can be done in middle school classes by students making their own slides and seeing imprints of cells, or at the high school level through collecting data of…

  15. Structure of Hierarchic Clusterings: Implications for Information Retrieval and for Multivariate Data Analysis.

    ERIC Educational Resources Information Center

    Murtagh, F.

    1984-01-01

    Using examples of data from the areas of information retrieval and of multivariate data analysis, six hierarchic clustering algorithms (single link, median, centroid, group average, complete link, Ward's) are examined and evaluated by using three proposed coefficients of hierarchic structure. Nine references are cited. (EJS)
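
    For readers unfamiliar with the linkage rules named in this record, the following is a minimal illustrative sketch of agglomerative clustering with single, complete, and group-average linkage. It is not the implementation evaluated in the paper; the function name `agglomerate`, its parameters, and the naive O(n³) search are assumptions made for illustration, and the median, centroid, and Ward's methods are not covered.

```python
def agglomerate(dist, n_clusters, linkage="average"):
    """Naive agglomerative clustering on a symmetric distance matrix.

    `dist` is a list of lists; `linkage` is "single", "complete" or "average".
    Returns a list of clusters, each a sorted list of item indices.
    (Hypothetical helper for illustration only.)
    """
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        pairwise = [dist[i][j] for i in a for j in b]
        if linkage == "single":      # closest pair of members
            return min(pairwise)
        if linkage == "complete":    # farthest pair of members
            return max(pairwise)
        if linkage == "average":     # mean over all cross-cluster pairs
            return sum(pairwise) / len(pairwise)
        raise ValueError(f"unknown linkage: {linkage}")

    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


# Example: five points on a line, clustered from their absolute differences.
values = [0, 1, 5, 6, 20]
matrix = [[abs(a - b) for b in values] for a in values]
result = agglomerate(matrix, n_clusters=2, linkage="average")
```

    Stopping at a fixed number of clusters is only one possible choice; recording every merge instead would yield the full hierarchy.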

  16. A Cluster Analysis of the Circumstances of Death in Suicides in Hong Kong

    ERIC Educational Resources Information Center

    Chen, Eric Y. H.; Chan, Wincy S. C.; Chan, Sandra S. M.; Liu, Ka Y.; Chan, Cecilia L. W.; Wong, Paul W. C.; Law, Y. W.; Yip, Paul S. F.

    2007-01-01

    Classification of suicides is essential for clinicians to better identify self-harm patients with future suicidal risks. This study examined potential subtypes of suicide in a psychological autopsy sample (N = 148) in Hong Kong. Hierarchical cluster analysis extracted two subgroups of subjects in terms of expressed deliberation assessed by the…

  17. Cluster Analysis of Assessment in Anatomy and Physiology for Health Science Undergraduates

    ERIC Educational Resources Information Center

    Brown, Stephen; White, Sue; Power, Nicola

    2016-01-01

    Academic content common to health science programs is often taught to a mixed group of students; however, content assessment may be consistent for each discipline. This study used a retrospective cluster analysis on such a group, first to identify high and low achieving students, and second, to determine the distribution of students within…

  18. Multiscale deep drawing analysis of dual-phase steels using grain cluster-based RGC scheme

    NASA Astrophysics Data System (ADS)

    Tjahjanto, D. D.; Eisenlohr, P.; Roters, F.

    2015-06-01

    Multiscale modelling and simulation play an important role in sheet metal forming analysis, since the overall material responses at macroscopic engineering scales, e.g. formability and anisotropy, are strongly influenced by microstructural properties, such as grain size and crystal orientations (texture). In the present report, multiscale analysis on deep drawing of dual-phase steels is performed using an efficient grain cluster-based homogenization scheme. The homogenization scheme, called relaxed grain cluster (RGC), is based on a generalization of the grain cluster concept, where a (representative) volume element consists of p × q × r (hexahedral) grains. In this scheme, variation of the strain or deformation of individual grains is taken into account through so-called interface relaxation, which is formulated within an energy minimization framework. An interfacial penalty term is introduced into the energy minimization framework in order to account for the effects of grain boundaries. The grain cluster-based homogenization scheme has been implemented and incorporated into the advanced material simulation platform DAMASK, which aims to bridge the macroscale boundary value problems associated with deep drawing analysis to the micromechanical constitutive law, e.g. a crystal plasticity model. Standard Lankford anisotropy tests are performed to validate the model parameters prior to the deep drawing analysis. Model predictions for the deep drawing simulations are analyzed and compared to the corresponding experimental data. The results show that the predictions of the model are in very good agreement with the experimental measurements.

  19. Student Motivational Profiles in an Introductory MIS Course: An Exploratory Cluster Analysis

    ERIC Educational Resources Information Center

    Nelson, Klara

    2014-01-01

    This study profiles students in an introductory MIS course according to a variety of variables associated with choice of academic major. The data were collected through a survey administered to 12 sections of the course. A two-step cluster analysis was performed with gender as a categorical variable and students' perceptions of task value…

  20. Exploring the Relationship between Autism Spectrum Disorder and Epilepsy Using Latent Class Cluster Analysis

    ERIC Educational Resources Information Center

    Cuccaro, Michael L.; Tuchman, Roberto F.; Hamilton, Kara L.; Wright, Harry H.; Abramson, Ruth K.; Haines, Jonathan L.; Gilbert, John R.; Pericak-Vance, Margaret

    2012-01-01

    Epilepsy co-occurs frequently in autism spectrum disorders (ASD). Understanding this co-occurrence requires a better understanding of the ASD-epilepsy phenotype (or phenotypes). To address this, we conducted latent class cluster analysis (LCCA) on an ASD dataset (N = 577) which included 64 individuals with epilepsy. We identified a 5-cluster…

  1. 2 x 2 Achievement Goals and Achievement Emotions: A Cluster Analysis of Students' Motivation

    ERIC Educational Resources Information Center

    Jang, Leong Yeok; Liu, Woon Chia

    2012-01-01

    This study sought to better understand the adoption of multiple achievement goals at an intra-individual level, and its links to emotional well-being, learning, and academic achievement. Participants were 480 Secondary Two students (aged between 13 and 14 years) from two coeducational government schools. Hierarchical cluster analysis revealed the…

  2. Molecular Clustering Interrelationships and Carbohydrate Conformation in Hull and Seeds Among Barley Cultivars

    SciTech Connect

    N Liu; P Yu

    2011-12-31

    The objective of this study was to use molecular spectral analyses with the diffuse reflectance Fourier transform infrared spectroscopy (DRIFT) bioanalytical technique to study carbohydrate conformation features, molecular clustering and interrelationships in hull and seed among six barley cultivars (AC Metcalfe, CDC Dolly, McLeod, CDC Helgason, CDC Trey, CDC Cowboy), which had different degradation kinetics in rumen. The molecular structure spectral analyses in both hull and seed involved the fingerprint regions of ca. 1536-1484 cm⁻¹ (attributed mainly to aromatic lignin semicircle ring stretch), ca. 1293-1212 cm⁻¹ (attributed mainly to cellulosic compounds in the hull), ca. 1269-1217 cm⁻¹ (attributed mainly to cellulosic compounds in the seeds), and ca. 1180-800 cm⁻¹ (attributed mainly to total CHO C-O stretching vibrations) together with agglomerative hierarchical cluster analysis (AHCA) and principal component analysis (PCA). The results showed that the DRIFT technique plus AHCA and PCA molecular analyses were able to reveal carbohydrate conformation features and identify carbohydrate molecular structure differences in both hull and seeds among the barley varieties. The carbohydrate molecular spectral analyses at the region of ca. 1185-800 cm⁻¹ together with the AHCA and PCA were able to show that the barley seed inherent structures exhibited distinguishable differences among the barley varieties. CDC Helgason had differences from AC Metcalfe, McLeod, CDC Cowboy and CDC Dolly in carbohydrate conformation in the seed. Clear molecular cluster classes could be distinguished and identified in AHCA and the separate ellipses could be grouped in PCA. But CDC Helgason had no distinguishable differences from CDC Trey in carbohydrate conformation. These carbohydrate conformation/structure differences could partially explain why the varieties were different in digestive behaviors in animals. The molecular spectroscopy

  3. Unraveling the dha cluster in Citrobacter werkmanii: comparative genomic analysis of bacterial 1,3-propanediol biosynthesis clusters.

    PubMed

    Maervoet, Veerle E T; De Maeseneire, Sofie L; Soetaert, Wim K; De Mey, Marjan

    2014-04-01

    In natural 1,3-propanediol (PDO) producing microorganisms such as Klebsiella pneumoniae, Citrobacter freundii and Clostridium sp., the genes coding for PDO producing enzymes are grouped in a dha cluster. This article describes the dha cluster of a novel candidate for PDO production, Citrobacter werkmanii DSM17579 and compares the cluster to the currently known PDO clusters of Enterobacteriaceae and Clostridiaceae. Moreover, we attribute a putative function to two previously unannotated ORFs, OrfW and OrfY, both in C. freundii and in C. werkmanii: both proteins might form a complex and support the glycerol dehydratase by converting cob(I)alamin to the glycerol dehydratase cofactor coenzyme B12. Unraveling this biosynthesis cluster revealed high homology between the deduced amino acid sequence of the open reading frames of C. werkmanii DSM17579 and those of C. freundii DSM30040 and K. pneumoniae MGH78578, i.e., 96 and 87.5 % identity, respectively. On the other hand, major differences between the clusters have also been discovered. For example, only one dihydroxyacetone kinase (DHAK) is present in the dha cluster of C. werkmanii DSM17579, while two DHAK enzymes are present in the cluster of K. pneumoniae MGH78578 and Clostridium butyricum VPI1718.

  4. Cluster analysis of resting-state fMRI time series.

    PubMed

    Mezer, Aviv; Yovel, Yossi; Pasternak, Ofer; Gorfine, Tali; Assaf, Yaniv

    2009-05-01

    Functional MRI (fMRI) has become one of the leading methods for brain mapping in neuroscience. Recent advances in fMRI analysis were used to define the default state of brain activity, functional connectivity and basal activity. Basal activity measured with fMRI raised tremendous interest among neuroscientists since synchronized brain activity patterns could be retrieved while the subject rests (resting state fMRI). During recent years, a few signal processing schemes have been suggested to analyze the resting state blood oxygenation level dependent (BOLD) signal, from simple correlations to spectral decomposition. In most of these analysis schemes, the question asked was which brain areas "behave" in the time domain similarly to a pre-specified ROI. In this work we applied short time frequency analysis and clustering to study the spatial signal characteristics of resting state fMRI time series. Such analysis revealed that clusters of similar BOLD fluctuations are found in the cortex but also in the white matter and sub-cortical gray matter regions (thalamus). We found high similarities between the BOLD clusters and the neuro-anatomical appearance of brain regions. Additional analysis of the BOLD time series revealed a strong correlation between head movements and clustering quality. Experiments performed with T1-weighted time series also provided similar quality of clustering. These observations led us to the conclusion that non-functional contributions to the BOLD time series can also account for symmetric appearance of signal fluctuations. These contributions may include head motions, the underlying microvasculature anatomy and cellular morphology.

  5. Module Cluster: IFE - 001.00 (GSC) Basic Terminology and Analysis of Writings Concerned with Educational Issues.

    ERIC Educational Resources Information Center

    Zahn, R. D.

    This document is one of the module clusters developed for the Camden Teacher Corps project. The purpose of this module cluster is to enable students to define and use basic terminology in the discussion and analysis of educational issues, to use various approaches in studying an issue, and to apply critical analysis skills to written and spoken…

  6. The Feasibility of Using Cluster Analysis to Examine Log Data from Educational Video Games. CRESST Report 790

    ERIC Educational Resources Information Center

    Kerr, Deirdre; Chung, Gregory K. W. K.; Iseli, Markus R.

    2011-01-01

    Analyzing log data from educational video games has proven to be a challenging endeavor. In this paper, we examine the feasibility of using cluster analysis to extract information from the log files that is interpretable in both the context of the game and the context of the subject area. If cluster analysis can be used to identify patterns of…

  7. Investigating nurses' knowledge, attitudes, and skills patterns towards clinical management system: results of a cluster analysis.

    PubMed

    Chan, M F

    2006-09-01

    To determine whether definable subtypes exist within a cohort of Hong Kong nurses as related to the clinical management system use in their clinical practices based on their knowledge, attitudes, skills, and background factors. Data were collected using a structured questionnaire. The sample of 242 registered nurses was recruited from three hospitals in Hong Kong. The study employs personal and demographic variables, knowledge, attitudes, and skills scale. A cluster analysis yielded two clusters. Each cluster represents a different profile of Hong Kong nurses on the clinical management system use in their clinical practices. The first group (Cluster 1) was labeled the 'lower attitudes, less skilful and average knowledge' group, and represented 55.4% of the total respondents. The second group (Cluster 2) was labeled 'positive attitudes, good knowledge but less skilful' and comprised 44.6% of this nursing sample. Cluster 2 had more older nurses, the majority were educated to the baccalaureate or above level, with more than 10 years working experience, and they held a more senior ranking than Cluster 1. A clear profile of Hong Kong nurses may benefit healthcare professionals in making appropriate education or assistance to prompt the use of the clinical management system by nurses an officially recognized profession. The findings were useful in determining nurse-users' specific needs and their preferences for modification of the clinical management system. Such findings should be used to formulate strategies to encourage nurses to resolve actual problems following computer training and to increase the depth and breadth of nurses' knowledge, attitudes, and skills toward such a system.

  8. Market segmentation for multiple option healthcare delivery systems--an application of cluster analysis.

    PubMed

    Jarboe, G R; Gates, R H; McDaniel, C D

    1990-01-01

    Healthcare providers of multiple option plans may be confronted with special market segmentation problems. This study demonstrates how cluster analysis may be used for discovering distinct patterns of preference for multiple option plans. The availability of metric, as opposed to categorical or ordinal, data provides the ability to use sophisticated analysis techniques which may be superior to frequency distributions and cross-tabulations in revealing preference patterns.

  9. Fault Reactivation Analysis Using Microearthquake Clustering Based on Signal-to-Noise Weighted Waveform Similarity

    NASA Astrophysics Data System (ADS)

    Grund, Michael; Groos, Jörn C.; Ritter, Joachim R. R.

    2016-07-01

    The cluster formation of about 2000 induced microearthquakes (mostly M_L < 2) is studied using a waveform similarity technique based on cross-correlation and a subsequent equivalence class approach. All events were detected within two separated but neighbouring seismic volumes close to the geothermal powerplants near Landau and Insheim in the Upper Rhine Graben, SW Germany, between 2006 and 2013. Besides different sensors, sampling rates and individual data gaps, mainly low signal-to-noise ratios (SNR) of the recordings at most station sites complicate a precise waveform similarity analysis of the microseismic events in this area. To include a large number of events for such an analysis, a newly developed weighting approach was implemented in the waveform similarity analysis which directly considers the individual SNRs across the whole seismic network. The application to both seismic volumes leads to event clusters with high waveform similarities within short (seconds to hours) and long (months to years) time periods covering two magnitude ranges. The estimated relative hypocenter locations are spatially concentrated for each single cluster and mirror the orientations of mapped faults as well as interpreted rupture planes determined from fault plane solutions. Depending on the waveform cross-correlation coefficient threshold, clusters can be resolved in space to as little as one dominant wavelength. The interpretation of these observations implies recurring fault reactivations by fluid injection with very similar faulting mechanisms during different time periods between 2006 and 2013.
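
    The general idea of grouping events by cross-correlation similarity and then forming equivalence classes can be sketched as below. This is a simplified single-station illustration under stated assumptions, not the authors' method: the SNR-based weighting across the network is omitted, and the names `max_normalized_xcorr` and `equivalence_classes`, the lag window, and the threshold value are hypothetical.

```python
import math


def max_normalized_xcorr(a, b, max_lag):
    """Largest normalized cross-correlation of two equal-length traces
    over integer lags in [-max_lag, max_lag]. (Hypothetical helper.)"""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    a = [v - mean_a for v in a]
    b = [v - mean_b for v in b]
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0:
        return 0.0
    n = len(a)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        s = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, s / norm)
    return best


def equivalence_classes(traces, threshold, max_lag=5):
    """Link every pair of traces whose similarity reaches `threshold`
    and return the resulting equivalence classes (union-find)."""
    parent = list(range(len(traces)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(traces)):
        for j in range(i + 1, len(traces)):
            if max_normalized_xcorr(traces[i], traces[j], max_lag) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Synthetic example: event 1 is a delayed copy of event 0, event 2 differs.
event0 = [math.exp(-0.05 * i) * math.sin(0.5 * i) for i in range(60)]
event1 = [0.0, 0.0] + event0[:-2]
event2 = [math.sin(2.3 * i) for i in range(60)]
clusters = equivalence_classes([event0, event1, event2], threshold=0.8)
```

    Because membership is transitive, two events can end up in the same class without directly exceeding the threshold themselves, as long as a chain of similar events connects them.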

  10. Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of LANDSAT digital data

    NASA Technical Reports Server (NTRS)

    Ballew, G.

    1977-01-01

    The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the four-dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.

  11. Analysis of a continuous-variable quadripartite cluster state from a single optical parametric oscillator

    SciTech Connect

    Midgley, S. L. W.; Olsen, M. K.; Bradley, A. S.; Pfister, O.

    2010-11-15

    We examine the feasibility of generating continuous-variable multipartite entanglement in an intracavity concurrent downconversion scheme that has been proposed for the generation of cluster states by Menicucci et al. [Phys. Rev. Lett. 101, 130501 (2008)]. By calculating optimized versions of the van Loock-Furusawa correlations we demonstrate genuine quadripartite entanglement and investigate the degree of entanglement present. Above the oscillation threshold the basic cluster state geometry under consideration suffers from phase diffusion. We alleviate this problem by incorporating a small injected signal into our analysis. Finally, we investigate squeezed joint operators. While the squeezed joint operators approach zero in the undepleted regime, we find that this is not the case when we consider the full interaction Hamiltonian and the presence of a cavity. In fact, we find that the decay of these operators is minimal in a cavity, and even depletion alone inhibits cluster state formation.

  12. [Investigation of fuzzy-clustering in octane number prediction model based on detailed hydrocarbon analysis data].

    PubMed

    Liu, Yingrong; Xu, Yupeng; Yang, Haiying

    2004-09-01

    A method to establish an octane number prediction model based on detailed hydrocarbon analysis (DHA) data is presented. The techniques of fuzzy clustering and the Euclidean distance are employed to select the samples needed in pattern establishment. One hundred and fifty gasoline samples and 140 characteristic components in the DHA chromatogram of each sample are used for the fuzzy-clustering research. It is found that the 3-10 samples that have the nearest Euclidean distance (< 1.5) to the prediction sample in the same cluster are enough to build the octane number prediction model. The experimental results proved that the model obtained according to the above method has higher prediction accuracy, a wider application range and higher data resource utility compared with the current prediction method.
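
    The sample-selection step described above (nearest neighbours within the same cluster, distance below 1.5, at most about ten samples) might be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the fuzzy clustering itself is not shown, each sample is assumed to already carry a hard label taken from its highest membership, and the function name and parameters are hypothetical.

```python
import math


def select_calibration_samples(target, target_label, samples, labels,
                               max_dist=1.5, max_count=10):
    """Return indices of the nearest same-cluster samples to `target`.

    Only samples whose Euclidean distance to `target` is below `max_dist`
    are kept, nearest first, up to `max_count` of them.
    (Hypothetical helper; thresholds follow the figures quoted above.)
    """
    candidates = []
    for index, (vector, label) in enumerate(zip(samples, labels)):
        if label != target_label:
            continue  # different cluster: never used for this prediction
        distance = math.dist(target, vector)
        if distance < max_dist:
            candidates.append((distance, index))
    candidates.sort()
    return [index for _, index in candidates[:max_count]]


# Toy feature vectors standing in for DHA component profiles.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.2), (3.0, 3.0), (0.5, 0.5)]
labels = [0, 0, 0, 0, 1]
chosen = select_calibration_samples((0.1, 0.0), 0, samples, labels)
```

    A caller would typically check that at least three indices come back before fitting a model on them.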

  13. 3D Building Models Segmentation Based on K-Means++ Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Zhang, C.; Mao, B.

    2016-10-01

    3D mesh model segmentation has drawn increasing attention from the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compression, texture mapping, model retrieval, etc. Segmentation is therefore a key problem for 3D mesh models. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance depends heavily on randomized initial seed points (i.e., centroids), and different randomized centroids can yield quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.
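
    The sensitivity to initial centroids mentioned above is what K-means++ seeding addresses: each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. The sketch below shows only that seeding step on generic points; it is not the paper's segmentation pipeline, and the function names and the use of Python's `random.Random` are assumptions for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng=None):
    """Choose k initial centroids with K-means++ seeding.
    (Hypothetical helper; `rng` defaults to a fixed-seed generator.)"""
    rng = rng or random.Random(0)
    seeds = [rng.choice(points)]  # first seed: uniform at random
    while len(seeds) < k:
        # Squared distance from every point to its nearest existing seed.
        weights = [min(squared_distance(p, s) for s in seeds) for p in points]
        total = sum(weights)
        if total == 0:
            # Every point coincides with a seed; fall back to uniform choice.
            seeds.append(rng.choice(points))
            continue
        # Sample the next seed with probability proportional to its weight.
        r = rng.random() * total
        cumulative = 0.0
        chosen = points[-1]
        for point, weight in zip(points, weights):
            cumulative += weight
            if cumulative > r:
                chosen = point
                break
        seeds.append(chosen)
    return seeds


# Two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
seeds = kmeans_pp_seeds(points, k=2, rng=random.Random(1))
```

    The seeds would then be passed to an ordinary K-means loop; points far from existing seeds are favoured, which tends to spread the initial centroids across the data.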

  14. Cluster Method Analysis of K. S. C. Image

    NASA Technical Reports Server (NTRS)

    Rodriguez, Joe, Jr.; Desai, M.

    1997-01-01

    Information obtained from satellite-based systems has moved to the forefront as a method in the identification of many land cover types. Identification of different land features through remote sensing is an effective tool for regional and global assessment of geometric characteristics. Classification data acquired from remote sensing images have a wide variety of applications. In particular, analysis of remote sensing images has special applications in the classification of various types of vegetation. Results obtained from classification studies of a particular area or region serve towards a greater understanding of what parameters (ecological, temporal, etc.) affect the region being analyzed. In this paper, we make a distinction between both types of classification approaches, although focus is given to the unsupervised classification method using 1987 Thematic Mapper (TM) images of Kennedy Space Center.

  15. Links between patterns of racial socialization and discrimination experiences and psychological adjustment: a cluster analysis.

    PubMed

    Ajayi, Alex A; Syed, Moin

    2014-10-01

    This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust their members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications for psychological adjustment.

  16. Proteomic Analysis of Protein-Protein Interactions within the CSD Fe-S Cluster Biogenesis System

    PubMed Central

    Bolstad, Heather M.; Botelho, Danielle J.; Wood, Matthew J.

    2010-01-01

    Fe-S cluster biogenesis is of interest to many fields, including bioenergetics and gene regulation. The CSD system is one of three Fe-S cluster biogenesis systems in E. coli and is comprised of the cysteine desulfurase CsdA, the sulfur acceptor protein CsdE, and the E1-like protein CsdL. The biological role, biochemical mechanism, and protein targets of the system remain uncharacterized. Here we show that the active site CsdE C61 has a lowered pKa value of 6.5, which is nearly identical to that of C51 in the homologous SufE protein and which is likely critical for its function. We observed that CsdE forms disulfide bonds with multiple proteins and identified the proteins that copurify with CsdE. The identification of Fe-S proteins and both putative and established Fe-S cluster assembly (ErpA, glutaredoxin-3, glutaredoxin-4) and sulfur trafficking (CsdL, YchN) proteins supports the two-pathway model, in which the CSD system is hypothesized to synthesize both Fe-S clusters and other sulfur-containing cofactors. We suggest that the identified Fe-S cluster assembly proteins may be the scaffold and/or shuttle proteins for the CSD system. By comparison with previous analysis of SufE, we demonstrate that there is some overlap in the CsdE and SufE interactomes. PMID:20734996

  17. Exploring the application of latent class cluster analysis for investigating pedestrian crash injury severities in Switzerland.

    PubMed

    Sasidharan, Lekshmi; Wu, Kun-Feng; Menendez, Monica

    2015-12-01

    One of the major challenges in traffic safety analyses is the heterogeneous nature of safety data, due to the sundry factors involved in it. This heterogeneity often leads to difficulties in interpreting results and conclusions due to unrevealed relationships. Understanding the underlying relationship between injury severities and influential factors is critical for the selection of appropriate safety countermeasures. A method commonly employed to address systematic heterogeneity is to focus on any subgroup of data based on the research purpose. However, this need not ensure homogeneity in the data. In this paper, latent class cluster analysis is applied to identify homogeneous subgroups for a specific crash type: pedestrian crashes. The manuscript employs data from police-reported pedestrian crashes (2009-2012) in Switzerland. The analyses demonstrate that dividing pedestrian severity data into seven clusters helps to reduce the systematic heterogeneity of the data and to understand the hidden relationships between crash severity levels and socio-demographic, environmental, vehicle, temporal, and traffic factors, and the main reason for the crash. The pedestrian crash injury severity models were developed for the whole data and individual clusters, and were compared using receiver operating characteristic curves, for which results favored clustering. Overall, the study suggests that the latent class clustered regression approach is suitable for reducing heterogeneity and revealing important hidden relationships in traffic safety analyses.

  18. Cluster-based analysis for personalized stress evaluation using physiological signals.

    PubMed

    Xu, Qianli; Nwe, Tin Lay; Guan, Cuntai

    2015-01-01

    Technology development in wearable sensors and biosignal processing has made it possible to detect human stress from the physiological features. However, the intersubject difference in stress responses presents a major challenge for reliable and accurate stress estimation. This research proposes a novel cluster-based analysis method to measure perceived stress using physiological signals, which accounts for the intersubject differences. The physiological data are collected when human subjects undergo a series of task-rest cycles, incurring varying levels of stress that are indicated by an index of the State Trait Anxiety Inventory. Next, a quantitative measurement of stress is developed by analyzing the physiological features in two steps: 1) a k-means clustering process to divide subjects into different categories (clusters), and 2) cluster-wise stress evaluation using the general regression neural network. Experimental results show a significant improvement in evaluation accuracy as compared to traditional methods without clustering. The proposed method is useful in developing intelligent, personalized products for human stress management.
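The first of the two steps in the record above, k-means clustering of subjects, can be sketched in a few lines. This is only an illustration under stated assumptions: the two features per subject, the choice of k = 2, and seeding the centroids with the first k points are all hypothetical, since the abstract does not specify them, and the second step (the general regression neural network) is not shown.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; seeding with the first k points is an assumption."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its current members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster keeps its centroid
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return labels, centroids

# Hypothetical (heart rate, skin conductance) features for six subjects.
subjects = [(60.0, 1.0), (90.0, 5.0), (62.0, 1.2),
            (88.0, 4.8), (61.0, 0.9), (91.0, 5.1)]
labels, centroids = kmeans(subjects, k=2)
print(labels)
```

In a pipeline of the kind the abstract describes, a separate regression model would then be fitted to the subjects in each cluster.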

  19. A cluster analysis of neuronal activity in the dorsal premotor cortical area for neuroprosthetic control.

    PubMed

    Ye, N; Roontiva, A; He, J

    2008-01-01

    With the use of the neuronal data acquisition technology, millisecond-level multi-electrode data from several regions of the premotor area were obtained from two rhesus monkeys trained to perform arm-reach tasks with visual cues in virtual reality. In each trial, animals were required to select and perform one of the four possible arm reaching movements to the target on the top-left or top-right of the virtual reality space. They were also required to decide whether they would move their arms straight to the target or curve them in order to avoid the obstacle that was presented. After the acquired neuronal signals were processed, unsupervised Hierarchical clustering and K-means clustering were performed to uncover the similarity and difference in the average firing rate of spike train data between neurons and phases for each experiment condition. The clustering results indicate the similarity of neuronal data in the movement planning and actual movement phases, and the difference of such data from the data in information processing phases. Furthermore, the clustering results show that when the target location is on the right, the move planning started earlier. The analysis of variance (ANOVA) on the neuronal data confirms the results from the hierarchical clustering.

  20. Integrated simultaneous analysis of different biomedical data types with exact weighted bi-cluster editing.

    PubMed

    Sun, Peng; Guo, Jiong; Baumbach, Jan

    2012-07-17

    The explosion of biological data has largely influenced the focus of today’s biology research. Integrating and analysing large quantities of data to provide meaningful insights has become the main challenge to biologists and bioinformaticians. One major problem is the combined analysis of data of different types, such as phenotypes and genotypes. Such data are modelled as bi-partite graphs where nodes correspond to the different data points, mutations and diseases for instance, and weighted edges relate to associations between them. Bi-clustering is a special case of clustering designed for partitioning two different types of data simultaneously. We present a bi-clustering approach that solves the NP-hard weighted bi-cluster editing problem by transforming a given bi-partite graph into a disjoint union of bi-cliques. Here we contribute an exact algorithm that is based on fixed-parameter tractability. We evaluated its performance on artificial graphs first. Afterwards, as an example, we applied our Java implementation to data from genome-wide association studies (GWAS), aiming to discover new, previously unobserved geno-to-pheno associations. We believe that our results will serve as guidelines for further wet lab investigations. Generally our software can be applied to any kind of data that can be modelled as bi-partite graphs. To our knowledge it is the fastest exact method for the weighted bi-cluster editing problem.
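To make the objective in the record above concrete, the sketch below scores a candidate bi-clustering by its editing cost. It assumes a common convention that the abstract does not spell out: a positive weight means the edge is present and costs that weight to delete, and a negative weight means the edge is absent and costs its absolute value to insert. The node names and weights are hypothetical, and this only evaluates a given solution; it is not the authors' exact fixed-parameter algorithm.

```python
def editing_cost(weights, left_cluster, right_cluster):
    """Cost of turning the weighted bi-partite graph into the given bi-cliques.

    weights: {(left_node, right_node): weight}; > 0 edge present, < 0 absent.
    left_cluster / right_cluster: node -> cluster id for each side.
    """
    cost = 0.0
    for (u, v), w in weights.items():
        same = left_cluster[u] == right_cluster[v]
        if same and w < 0:
            cost += -w  # missing edge inside a bi-clique must be inserted
        elif not same and w > 0:
            cost += w   # edge between two bi-cliques must be deleted
    return cost

# Hypothetical mutation-disease association weights.
weights = {
    ("m1", "d1"): 2.0, ("m1", "d2"): -1.0,
    ("m2", "d1"): 0.5, ("m2", "d2"): 3.0,
}
two_cliques = editing_cost(weights, {"m1": 0, "m2": 1}, {"d1": 0, "d2": 1})
one_clique = editing_cost(weights, {"m1": 0, "m2": 0}, {"d1": 0, "d2": 0})
print(two_cliques, one_clique)
```

An exact solver searches for the assignment that minimises this cost; here the two-clique solution is cheaper than merging everything into one bi-clique.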

  1. A cluster analysis of the neurons of the rat interpeduncular nucleus.

    PubMed Central

    Gioia, M; Vizzotto, L; Bianchi, R

    1994-01-01

    The morphometric characteristics of the neurons of the interpeduncular nucleus (IPN) in the rat were investigated by cluster analysis in order to identify neuronal groups which are morphometrically homogeneous, and to define their position and density in the IPN subnuclei. Two clusters of cells were detected. Cluster 1 neurons had a larger perikaryal size with a mean cross-sectional area of 170 µm² and a high nuclear/cytoplasmic ratio. They were located mainly in the pars dorsalis (37%) and pars medialis (34%) rather than in the pars lateralis (29%). Cluster 1 neurons were also more frequent at the rostral (31%) and caudal (57%) poles than in the central part of the IPN. Cluster 2 cells showed a smaller mean perikaryal area (110 µm²), a small nucleus and abundant cytoplasm. They were equally distributed throughout the whole IPN. These findings suggest the existence of a magnocellular region at the rostral pole of the IPN which has not been described previously. The presence of IPN regions endowed with specific cytoarchitectural characteristics is discussed with respect to the complex neurochemical organisation of the nucleus. PMID:7649781

  2. The heterogeneity of headache patients who self-medicate: a cluster analysis approach.

    PubMed

    Mehuys, Els; Paemeleire, Koen; Crombez, Geert; Adriaens, Els; Van Hees, Thierry; Demarche, Sophie; Christiaens, Thierry; Van Bortel, Luc; Van Tongelen, Inge; Remon, Jean-Paul; Boussery, Koen

    2016-07-01

    Patients with headache often self-treat their condition with over-the-counter analgesics. However, overuse of analgesics can cause medication-overuse headache. The present study aimed to identify subgroups of individuals with headache who self-medicate, as this could be helpful to tailor intervention strategies for prevention of medication-overuse headache. Patients (n = 1021) were recruited from 202 community pharmacies and completed a self-administered questionnaire. A hierarchical cluster analysis was used to group patients as a function of sociodemographics, pain, disability, and medication use for pain. Three patient clusters were identified. Cluster 1 (n = 498, 48.8%) consisted of relatively young individuals, and most of them suffered from migraine. They reported the least number of other pain complaints and the lowest prevalence of medication overuse (MO; 16%). Cluster 2 (n = 301, 29.5%) included older persons with mainly non-migraine headache, a low disability, and on average pain in 2 other locations. Prevalence of MO was 40%. Cluster 3 (n = 222, 21.7%) mostly consisted of patients with migraine who also report pain in many other locations. These patients reported a high disability and a severe limitation of activities. They also showed the highest rates of MO (73%).

  3. Phenotype Clustering of Breast Epithelial Cells in Confocal Images Based on Nuclear Protein Distribution Analysis

    SciTech Connect

    Long, Fuhui; Peng, Hanchuan; Sudar, Damir; Levievre, Sophie A.; Knowles, David W.

    2006-09-05

    Background: The distribution of the chromatin-associated proteins plays a key role in directing nuclear function. Previously, we developed an image-based method to quantify the nuclear distributions of proteins and showed that these distributions depended on the phenotype of human mammary epithelial cells. Here we describe a method that creates a hierarchical tree of the given cell phenotypes and calculates the statistical significance between them, based on the clustering analysis of nuclear protein distributions. Results: Nuclear distributions of nuclear mitotic apparatus protein were previously obtained for non-neoplastic S1 and malignant T4-2 human mammary epithelial cells cultured for up to 12 days. Cell phenotype was defined as S1 or T4-2 and the number of days in culture. A probabilistic ensemble approach was used to define a set of consensus clusters from the results of multiple traditional cluster analysis techniques applied to the nuclear distribution data. Cluster histograms were constructed to show how cells in any one phenotype were distributed across the consensus clusters. Grouping various phenotypes allowed us to build phenotype trees and calculate the statistical difference between each group. The results showed that non-neoplastic S1 cells could be distinguished from malignant T4-2 cells with 94.19 percent accuracy; that proliferating S1 cells could be distinguished from differentiated S1 cells with 92.86 percent accuracy; and showed no significant difference between the various phenotypes of T4-2 cells corresponding to increasing tumor sizes. Conclusion: This work presents a cluster analysis method that can identify significant cell phenotypes, based on the nuclear distribution of specific proteins, with high accuracy.

  4. Validation of hierarchical cluster analysis for identification of bacterial species using 42 bacterial isolates

    NASA Astrophysics Data System (ADS)

    Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.

    2015-03-01

    Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates by Gram status and by genus and species. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and consequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.
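The two external evaluation criteria named in the record above, purity and the Rand index, have standard definitions that are short enough to write out. The sketch below uses those textbook definitions; the six genus labels and cluster assignments are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def purity(true_labels, cluster_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    members = {}
    for t, c in zip(true_labels, cluster_labels):
        members.setdefault(c, []).append(t)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)

def rand_index(true_labels, cluster_labels):
    """Fraction of sample pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical genus labels and cluster assignments for six isolates.
genus = ["A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]
print(purity(genus, clusters), rand_index(genus, clusters))
```

Both scores lie between 0 and 1, with 1 meaning the clustering reproduces the reference taxonomy exactly.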

  5. [Optimization of cluster analysis based on drug resistance profiles of MRSA isolates].

    PubMed

    Tani, Hiroya; Kishi, Takahiko; Gotoh, Minehiro; Yamagishi, Yuka; Mikamo, Hiroshige

    2015-12-01

    We examined 402 methicillin-resistant Staphylococcus aureus (MRSA) strains isolated from clinical specimens in our hospital between November 19, 2010 and December 27, 2011 to evaluate the similarity between cluster analysis of drug susceptibility tests and pulsed-field gel electrophoresis (PFGE). The results showed that the 402 strains tested were classified into 27 PFGE patterns (151 subtypes of patterns). Cluster analyses of drug susceptibility tests with the cut-off distance yielding a similar classification capability showed favorable results. With the MIC method, in which minimum inhibitory concentration (MIC) values were used directly, the level of agreement with PFGE was 74.2% when 15 drugs were tested. The Unweighted Pair Group Method with Arithmetic mean (UPGMA) was effective when the cut-off distance was 16. Using the SIR method in which susceptible (S), intermediate (I), and resistant (R) were coded as 0, 2, and 3, respectively, according to the Clinical and Laboratory Standards Institute (CLSI) criteria, the level of agreement with PFGE was 75.9% when the number of drugs tested was 17, the method used for clustering was the UPGMA, and the cut-off distance was 3.6. In addition, to assess the reproducibility of the results, 10 strains were randomly sampled from the overall test and subjected to cluster analysis. This was repeated 100 times under the same conditions. The results indicated good reproducibility of the results, with the level of agreement with PFGE showing a mean of 82.0%, standard deviation of 12.1%, and mode of 90.0% for the MIC method and a mean of 80.0%, standard deviation of 13.4%, and mode of 90.0% for the SIR method. In summary, cluster analysis for drug susceptibility tests is useful for the epidemiological analysis of MRSA.
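The SIR procedure in the record above (code S, I, and R as 0, 2, and 3, cluster with UPGMA, cut at a distance threshold) can be sketched as follows. The coding and the 3.6 cut-off come from the abstract; the Euclidean distance, the four-drug profiles, and the function names are assumptions made for illustration, since the abstract does not state the distance measure.

```python
from itertools import combinations

SIR_CODE = {"S": 0, "I": 2, "R": 3}  # coding stated in the abstract

def encode(profile):
    """Turn a string such as 'SIRR' into a numeric vector."""
    return [SIR_CODE[x] for x in profile]

def euclidean(a, b):
    # Assumed distance; the abstract does not say which measure was used.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def upgma_clusters(vectors, cutoff):
    """Average-linkage (UPGMA) agglomeration, stopped at a cut-off distance."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_dist(c1, c2):
        # UPGMA: unweighted mean of all pairwise member distances.
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (a, b), d = min(
            (((a, b), avg_dist(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break  # the closest remaining pair is beyond the cut-off
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical susceptibility profiles of four strains against four drugs.
profiles = ["SSRR", "SIRR", "RRSS", "RRSI"]
vectors = [encode(p) for p in profiles]
print(upgma_clusters(vectors, cutoff=3.6))
```

Strains that land in the same cluster would then be compared with their PFGE patterns to compute the level of agreement.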

  6. Globular Cluster Abundances from High-resolution, Integrated-light Spectroscopy. II. Expanding the Metallicity Range for Old Clusters and Updated Analysis Techniques

    NASA Astrophysics Data System (ADS)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew

    2017-01-01

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = ‑0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements. This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.

  7. Clustering Analysis of Officer's Behaviours in London Police Foot Patrol Activities

    NASA Astrophysics Data System (ADS)

    Shen, J.; Cheng, T.

    2015-07-01

    In this short paper we present a framework for the conceptual representation and clustering analysis of police officers' patrol patterns obtained by mining their raw movement trajectory data. This has been achieved with a model developed to account for the spatio-temporal dynamics of human movements by incorporating both the behaviour features of the travellers and the semantic meaning of the environment they are moving in. Hence, the similarity metric of traveller behaviours is jointly defined according to the stay time allocation in each spatio-temporal region of interest (ST-ROI) to support clustering analysis of patrol behaviours. The proposed framework enables higher-level analysis of behaviour and preferences based on raw movement trajectories. The model is first applied to police patrol data provided by the Metropolitan Police and will be tested on other types of dataset afterwards.
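One way to picture a similarity metric built on stay-time allocation is to describe each officer by the share of time spent in each region of interest and then compare those shares. The sketch below is an assumed illustration only: the abstract gives no formula, so the half-L1 distance, the region names, and the visit durations are all hypothetical.

```python
def stay_time_allocation(visits, regions):
    """Share of total stay time spent in each region of interest."""
    totals = {r: 0.0 for r in regions}
    for region, minutes in visits:
        totals[region] += minutes
    grand_total = sum(totals.values())
    return [totals[r] / grand_total for r in regions]

def allocation_distance(p, q):
    """Half the L1 distance between two allocations: 0 identical, 1 disjoint."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical spatio-temporal regions of interest and stay times in minutes.
regions = ["station", "high_street_evening", "park_daytime"]
officer_a = stay_time_allocation(
    [("station", 30), ("high_street_evening", 60), ("station", 30)], regions
)
officer_b = stay_time_allocation(
    [("park_daytime", 90), ("station", 30)], regions
)
print(officer_a, officer_b, allocation_distance(officer_a, officer_b))
```

A matrix of such pairwise distances could be passed to any standard clustering method to group officers with similar patrol behaviour.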

  8. Cluster Analysis of vents in monogenetic volcanic fields, Lunar Crater Volcanic Field (Nevada)

    NASA Astrophysics Data System (ADS)

    Tadini, A.; Cortes, J. A.; Valentine, G. A.; Johnson, P. J.; Tibaldi, A.; Bonali, F. L.

    2012-12-01

    are very sensitive to temporal and magma flux variation. An empirical approach such as the one proposed here seems to be a more robust technique. [1] Jerram et al., (1996) Contrib. to Min. and Petrol. 125: 60-74. [2] Chiasera and Cortés (2011) Journal of Volc. And Geotherm. Res. 207(3-4): 83-92. [3] Everitt et al., (2001) Cluster Analysis, Oxford University Press. [4] Baloga et al., (2007) Journal of Geophys. Res. 112, E03002, doi:10.1029/2005JE002652

  9. Arthropod monitoring for fine-scale habitat analysis: A case study of the El Segundo sand dunes

    SciTech Connect

    Mattoni, R.; Longcore, T.; Novotny, V.

    2000-04-01

    Arthropod communities from several habitats on and adjacent to the El Segundo dunes (Los Angeles County, CA) were sampled using pitfall and yellow pan traps to evaluate their possible use as indicators of restoration success. Communities were ordinated and clustered using correspondence analysis, detrended correspondence analysis, two-way indicator species analysis, and Ward's method of agglomerative clustering. The results showed high repeatability among replicates within any sampling arena that permits discrimination of (1) degraded and relatively undisturbed habitat, (2) different dune habitat types, and (3) annual change. Canonical correspondence analysis showed a significant effect of disturbance history on community composition that explained 5-20% of the variation. Replicates of pitfall and yellow pan traps on single sites clustered together reliably when species abundance was considered, whereas clusters using only species incidence did not group replicates as consistently. The broad taxonomic approach seems appropriate for habitat evaluation and monitoring of restoration projects as an alternative to assessments geared to single species or even single families.

  10. Joint Analysis of Cluster Observations. II. Chandra/XMM-Newton X-Ray and Weak Lensing Scaling Relations for a Sample of 50 Rich Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Mahdavi, Andisheh; Hoekstra, Henk; Babul, Arif; Bildfell, Chris; Jeltema, Tesla; Henry, J. Patrick

    2013-04-01

    We present a study of multiwavelength X-ray and weak lensing scaling relations for a sample of 50 clusters of galaxies. Our analysis combines Chandra and XMM-Newton data using an energy-dependent cross-calibration. After considering a number of scaling relations, we find that gas mass is the most robust estimator of weak lensing mass, yielding 15% ± 6% intrinsic scatter at r500WL (the pseudo-pressure YX yields a consistent scatter of 22% ± 5%). The scatter does not change when measured within a fixed physical radius of 1 Mpc. Clusters with small brightest cluster galaxy (BCG) to X-ray peak offsets constitute a very regular population whose members have the same gas mass fractions and whose even smaller (<10%) deviations from regularity can be ascribed to line of sight geometrical effects alone. Cool-core clusters, while a somewhat different population, also show the same (<10%) scatter in the gas mass-lensing mass relation. There is a good correlation and a hint of bimodality in the plane defined by BCG offset and central entropy (or central cooling time). The pseudo-pressure YX does not discriminate between the more relaxed and less relaxed populations, making it perhaps the more even-handed mass proxy for surveys. Overall, hydrostatic masses underestimate weak lensing masses by 10% on the average at r500WL; but cool-core clusters are consistent with no bias, while non-cool-core clusters have a large and constant 15%-20% bias between r2500WL and r500WL, in agreement with N-body simulations incorporating unthermalized gas. For non-cool-core clusters, the bias correlates well with BCG ellipticity. We also examine centroid shift variance and power ratios to quantify substructure; these quantities do not correlate with residuals in the scaling relations. Individual clusters have for the most part forgotten the source of their departures from self-similarity.

  11. A cluster analysis on students' perceived motivational climate. Implications on psycho-social variables.

    PubMed

    Fernandez-Rio, Javier; Méndez-Giménez, Antonio; Cecchini Estrada, Jose A

    2014-01-01

    The aim of this study was to examine how students' perceptions of the class climate influence their basic psychological needs, motivational regulations, social goals and outcomes such as boredom, enjoyment, effort, and pressure/tension. 507 (267 males, 240 females) secondary education students agreed to participate. They completed a questionnaire that included the Spanish validated versions of Perceived Motivational Climate in Sport Questionnaire (PMCSQ-2), Basic Psychological Needs in Exercise (BPNES), Perceived Locus of Causality (PLOC), Social Goal Scale-Physical Education (SGS-PE), and several subscales of the IMI. A hierarchical cluster analysis uncovered four independent class climate profiles that were confirmed by a K-Means cluster analysis: "high ego", "low ego-task", "high ego-medium task", and "high task". Several MANOVAs were performed using these clusters as independent variables and the different outcomes as dependent variables (p < .01). Results linked high mastery class climates to positive consequences such as higher students' autonomy, competence, relatedness, intrinsic motivation, effort, enjoyment, responsibility and relationship, as well as low levels of amotivation, boredom and pressure/tension. Students' perceptions of a performance class climate made the positive scores decrease significantly. Cluster 3 revealed that a mastery oriented class structure undermines the negative behavioral and psychological effects of a performance class climate. This finding supports the buffering hypothesis of the achievement goal theory.

  12. A model-based cluster analysis approach to adolescent problem behaviors and young adult outcomes.

    PubMed

    Mun, Eun Young; Windle, Michael; Schainker, Lisa M

    2008-01-01

    Data from a community-based sample of 1,126 10th- and 11th-grade adolescents were analyzed using a model-based cluster analysis approach to empirically identify heterogeneous adolescent subpopulations from the person-oriented and pattern-oriented perspectives. The model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities and accordingly to classify subpopulations using more rigorous statistical procedures for the comparison of alternative models. Four cluster groups were identified and labeled multiproblem high-risk, smoking high-risk, normative, and low-risk groups. The multiproblem high-risk group exhibited a constellation of high levels of problem behaviors, including delinquent and sexual behaviors, multiple illicit substance use, and depressive symptoms at age 16. They had risky temperamental attributes and lower academic functioning and educational expectations at age 15.5 and, subsequently, at age 24 completed fewer years of education, and reported lower levels of physical health and higher levels of continued involvement in substance use and abuse. The smoking high-risk group was also found to be at risk for poorer functioning in young adulthood, compared to the low-risk group. The normative and the low-risk groups were, by and large, similar in their adolescent and young adult functioning. The continuity and comorbidity path from middle adolescence to young adulthood may be aided and abetted by chronic as well as episodic substance use by adolescents.

  13. Expanded natural product diversity revealed by analysis of lanthipeptide-like gene clusters in actinobacteria.

    PubMed

    Zhang, Qi; Doroghazi, James R; Zhao, Xiling; Walker, Mark C; van der Donk, Wilfred A

    2015-07-01

    Lanthionine-containing peptides (lanthipeptides) are a rapidly growing family of polycyclic peptide natural products belonging to the large class of ribosomally synthesized and posttranslationally modified peptides (RiPPs). Lanthipeptides are widely distributed in taxonomically distant species, and their currently known biosynthetic systems and biological activities are diverse. Building on the recent natural product gene cluster family (GCF) project, we report here large-scale analysis of lanthipeptide-like biosynthetic gene clusters from Actinobacteria. Our analysis suggests that lanthipeptide biosynthetic pathways, and by extrapolation the natural products themselves, are much more diverse than currently appreciated and contain many different posttranslational modifications. Furthermore, lanthionine synthetases are much more diverse in sequence and domain topology than currently characterized systems, and they are used by the biosynthetic machineries for natural products other than lanthipeptides. The gene cluster families described here significantly expand the chemical diversity and biosynthetic repertoire of lanthionine-related natural products. Biosynthesis of these novel natural products likely involves unusual and unprecedented biochemistries, as illustrated by several examples discussed in this study. In addition, class IV lanthipeptide gene clusters are shown not to be silent, setting the stage to investigate their biological activities.

  14. Expanded Natural Product Diversity Revealed by Analysis of Lanthipeptide-Like Gene Clusters in Actinobacteria

    PubMed Central

    Zhang, Qi; Doroghazi, James R.; Zhao, Xiling; Walker, Mark C.

    2015-01-01

    Lanthionine-containing peptides (lanthipeptides) are a rapidly growing family of polycyclic peptide natural products belonging to the large class of ribosomally synthesized and posttranslationally modified peptides (RiPPs). Lanthipeptides are widely distributed in taxonomically distant species, and their currently known biosynthetic systems and biological activities are diverse. Building on the recent natural product gene cluster family (GCF) project, we report here large-scale analysis of lanthipeptide-like biosynthetic gene clusters from Actinobacteria. Our analysis suggests that lanthipeptide biosynthetic pathways, and by extrapolation the natural products themselves, are much more diverse than currently appreciated and contain many different posttranslational modifications. Furthermore, lanthionine synthetases are much more diverse in sequence and domain topology than currently characterized systems, and they are used by the biosynthetic machineries for natural products other than lanthipeptides. The gene cluster families described here significantly expand the chemical diversity and biosynthetic repertoire of lanthionine-related natural products. Biosynthesis of these novel natural products likely involves unusual and unprecedented biochemistries, as illustrated by several examples discussed in this study. In addition, class IV lanthipeptide gene clusters are shown not to be silent, setting the stage to investigate their biological activities. PMID:25888176

  15. Bayesian Analysis of Two Stellar Populations in Galactic Globular Clusters. II. NGC 5024, NGC 5272, and NGC 6352

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, R.; Stenning, D. C.; Robinson, E.; von Hippel, T.; Sarajedini, A.; van Dyk, D. A.; Stein, N.; Jefferys, W. H.

    2016-07-01

    We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival Advanced Camera for Surveys Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from ~0.05 to 0.11 for these three clusters. Model grids with solar α-element abundances ([α/Fe] = 0.0) and enhanced α-elements ([α/Fe] = 0.4) are adopted.

  16. Spatial cluster analysis of nanoscopically mapped serotonin receptors for classification of fixed brain tissue

    NASA Astrophysics Data System (ADS)

    Sams, Michael; Silye, Rene; Göhring, Janett; Muresan, Leila; Schilcher, Kurt; Jacak, Jaroslaw

    2014-01-01

    We present a cluster spatial analysis method using nanoscopic dSTORM images to determine changes in protein cluster distributions within brain tissue. Such methods are suitable to investigate human brain tissue and will help to achieve a deeper understanding of brain disease along with aiding drug development. Human brain tissue samples are usually treated postmortem via standard fixation protocols, which are established in clinical laboratories. Therefore, our localization microscopy-based method was adapted to characterize protein density and protein cluster localization in samples fixed using different protocols followed by common fluorescent immunohistochemistry techniques. The localization microscopy allows nanoscopic mapping of serotonin 5-HT1A receptor groups within a two-dimensional image of a brain tissue slice. These nanoscopically mapped proteins can be confined to clusters by applying the proposed statistical spatial analysis. Selected features of such clusters were subsequently used to characterize and classify the tissue. Samples were obtained from different types of patients, fixed with different preparation methods, and finally stored in a human tissue bank. To verify the proposed method, samples of a cryopreserved healthy brain have been compared with epitope-retrieved and paraffin-fixed tissues. Furthermore, samples of healthy brain tissues were compared with data obtained from patients suffering from mental illnesses (e.g., major depressive disorder). Our work demonstrates the applicability of localization microscopy and image analysis methods for comparison and classification of human brain tissues at a nanoscopic level. Furthermore, the presented workflow marks a unique technological advance in the characterization of protein distributions in brain tissue sections.

  17. Spatial cluster analysis of nanoscopically mapped serotonin receptors for classification of fixed brain tissue.

    PubMed

    Sams, Michael; Silye, Rene; Göhring, Janett; Muresan, Leila; Schilcher, Kurt; Jacak, Jaroslaw

    2014-01-01

    We present a cluster spatial analysis method using nanoscopic dSTORM images to determine changes in protein cluster distributions within brain tissue. Such methods are suitable to investigate human brain tissue and will help to achieve a deeper understanding of brain disease along with aiding drug development. Human brain tissue samples are usually treated postmortem via standard fixation protocols, which are established in clinical laboratories. Therefore, our localization microscopy-based method was adapted to characterize protein density and protein cluster localization in samples fixed using different protocols followed by common fluorescent immunohistochemistry techniques. The localization microscopy allows nanoscopic mapping of serotonin 5-HT1A receptor groups within a two-dimensional image of a brain tissue slice. These nanoscopically mapped proteins can be confined to clusters by applying the proposed statistical spatial analysis. Selected features of such clusters were subsequently used to characterize and classify the tissue. Samples were obtained from different types of patients, fixed with different preparation methods, and finally stored in a human tissue bank. To verify the proposed method, samples of a cryopreserved healthy brain have been compared with epitope-retrieved and paraffin-fixed tissues. Furthermore, samples of healthy brain tissues were compared with data obtained from patients suffering from mental illnesses (e.g., major depressive disorder). Our work demonstrates the applicability of localization microscopy and image analysis methods for comparison and classification of human brain tissues at a nanoscopic level. Furthermore, the presented workflow marks a unique technological advance in the characterization of protein distributions in brain tissue sections.

  18. Analysis of local bond-orientational order for liquid gallium at ambient pressure: Two types of cluster structures

    NASA Astrophysics Data System (ADS)

    Chen, Lin-Yuan; Tang, Ping-Han; Wu, Ten-Ming

    2016-07-01

    In terms of the local bond-orientational order (LBOO) parameters, a cluster approach to analyze local structures of simple liquids was developed. In this approach, a cluster is defined as a combination of neighboring seeds having at least n_b local-orientational bonds and their nearest neighbors, and a cluster ensemble is a collection of clusters with a specified n_b and number of seeds n_s. This cluster analysis was applied to investigate the microscopic structures of liquid Ga at ambient pressure (AP). The liquid structures studied were generated through ab initio molecular dynamics simulations. By scrutinizing the static structure factors (SSFs) of cluster ensembles with different combinations of n_b and n_s, we found that liquid Ga at AP contained two types of cluster structures, one characterized by sixfold orientational symmetry and the other showing fourfold orientational symmetry. The SSFs of cluster structures with sixfold orientational symmetry were akin to the SSF of a hard-sphere fluid. In contrast, the SSFs of cluster structures showing fourfold orientational symmetry behaved similarly to the anomalous SSF of liquid Ga at AP, which is well known for exhibiting a high-q shoulder. The local structures of a highly LBOO cluster whose SSF displayed a high-q shoulder were found to be more similar to the structure of β-Ga than to those of other solid phases of Ga. More generally, the cluster structures showing fourfold orientational symmetry tend to resemble β-Ga more closely.

  19. Analysis of local bond-orientational order for liquid gallium at ambient pressure: Two types of cluster structures.

    PubMed

    Chen, Lin-Yuan; Tang, Ping-Han; Wu, Ten-Ming

    2016-07-14

    In terms of the local bond-orientational order (LBOO) parameters, a cluster approach to analyze local structures of simple liquids was developed. In this approach, a cluster is defined as a combination of neighboring seeds having at least n_b local-orientational bonds and their nearest neighbors, and a cluster ensemble is a collection of clusters with a specified n_b and number of seeds n_s. This cluster analysis was applied to investigate the microscopic structures of liquid Ga at ambient pressure (AP). The liquid structures studied were generated through ab initio molecular dynamics simulations. By scrutinizing the static structure factors (SSFs) of cluster ensembles with different combinations of n_b and n_s, we found that liquid Ga at AP contained two types of cluster structures, one characterized by sixfold orientational symmetry and the other showing fourfold orientational symmetry. The SSFs of cluster structures with sixfold orientational symmetry were akin to the SSF of a hard-sphere fluid. In contrast, the SSFs of cluster structures showing fourfold orientational symmetry behaved similarly to the anomalous SSF of liquid Ga at AP, which is well known for exhibiting a high-q shoulder. The local structures of a highly LBOO cluster whose SSF displayed a high-q shoulder were found to be more similar to the structure of β-Ga than to those of other solid phases of Ga. More generally, the cluster structures showing fourfold orientational symmetry tend to resemble β-Ga more closely.

  20. Validation of disease states in schizophrenia: comparison of cluster analysis between US and European populations

    PubMed Central

    Thokagevistk, Katia; Millier, Aurélie; Lenert, Leslie; Sadikhov, Shamil; Moreno, Santiago; Toumi, Mondher

    2016-01-01

    Background There is controversy as to whether use of statistical clustering methods to identify common disease patterns in schizophrenia identifies patterns generalizable across countries. Objective The goal of this study was to compare disease states identified in a published study (Mohr/Lenert, 2004) considering US patients to disease states in a European cohort (EuroSC) considering English, French, and German patients. Methods Using methods paralleling those in Mohr/Lenert, we conducted a principal component analysis (PCA) on Positive and Negative Syndrome Scale items in the EuroSC data set (n=1,208), followed by k-means cluster analyses and a search for an optimal k. The optimal model structure was compared to Mohr/Lenert by assigning discrete severity levels to each cluster in each factor based on the cluster center. A harmonized model was created and patients were assigned to health states using both approaches; agreement rates in state assignment were then calculated. Results Five factors accounting for 56% of total variance were obtained from PCA. These factors corresponded to positive symptoms (Factor 1), negative symptoms (Factor 2), cognitive impairment (Factor 3), hostility/aggression (Factor 4), and mood disorder (Factor 5) (as in Mohr/Lenert). The optimal number of cluster states was six. The kappa statistic (95% confidence interval) for agreement in state assignment was 0.686 (0.670–0.703). Conclusion The patterns of schizophrenia effects identified using clustering in two different data sets were reasonably similar. Results suggest the Mohr/Lenert health state model is potentially generalizable to other populations. PMID:27386054
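
    The agreement figure reported in the abstract above is a kappa statistic. As a minimal sketch, assuming the unweighted Cohen's kappa for two state assignments of the same patients (the abstract does not say which variant was used, and the function name and toy labels below are hypothetical), the calculation could look like this:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two labelings of the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same state by both models
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal state frequencies, summed over states
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical state
    return (observed - expected) / (1.0 - expected)


# Hypothetical health-state assignments for six patients under two models
model_1 = [1, 1, 2, 2, 3, 3]
model_2 = [1, 1, 2, 2, 3, 1]
print(cohens_kappa(model_1, model_2))
```

    This sketch gives only the point estimate; the 95% confidence interval quoted in the abstract would need an additional variance estimate or a bootstrap.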

  1. Genomic and expression analysis of the vanG-like gene cluster of Clostridium difficile.

    PubMed

    Peltier, Johann; Courtin, Pascal; El Meouche, Imane; Catel-Ferreira, Manuella; Chapot-Chartier, Marie-Pierre; Lemée, Ludovic; Pons, Jean-Louis

    2013-07-01

    Primary antibiotic treatment of Clostridium difficile intestinal diseases requires metronidazole or vancomycin therapy. A cluster of genes homologous to enterococcal glycopeptide resistance vanG genes was found in the genome of C. difficile 630, although this strain remains sensitive to vancomycin. This vanG-like gene cluster was found to consist of five ORFs: the regulatory region consisting of vanR and vanS and the effector region consisting of vanG, vanXY and vanT. We found that 57 out of 83 C. difficile strains, representative of the main lineages of the species, harbour this vanG-like cluster. The cluster is expressed as an operon and, when present, is found at the same genomic location in all strains. The vanG, vanXY and vanT homologues in C. difficile 630 are co-transcribed and expressed at a low level throughout the growth phases in the absence of vancomycin. Conversely, the expression of these genes is strongly induced in the presence of subinhibitory concentrations of vancomycin, indicating that the vanG-like operon is functional at the transcriptional level in C. difficile. Hydrophilic interaction liquid chromatography (HILIC-HPLC) and MS analysis of cytoplasmic peptidoglycan precursors of C. difficile 630 grown without vancomycin revealed the exclusive presence of a UDP-MurNAc-pentapeptide with an alanine at the C terminus. UDP-MurNAc-pentapeptide [d-Ala] was also the only peptidoglycan precursor detected in C. difficile grown in the presence of vancomycin, corroborating the lack of vancomycin resistance. Peptidoglycan structures of a vanG-like mutant strain and of a strain lacking the vanG-like cluster did not differ from those of the C. difficile 630 strain, indicating that the vanG-like cluster also has no impact on cell-wall composition.

  2. Sensitivity Enhancement of RF Plasma Etch Endpoint Detection With K-means Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Lee, Honyoung; Jang, Haegyu; Lee, Hak-Seung; Chae, Heeyeop

    2015-09-01

    Plasma etching is a core process in semiconductor fabrication, and etch endpoint detection (EPD) is one of the essential FDC (Fault Detection and Classification) functions for yield management and mass production. In general, optical emission spectroscopy (OES) has been used to detect the endpoint because OES is a non-invasive, real-time plasma monitoring tool. In OES, the trend of a few sensitive wavelengths is traced. However, in small-open-area etch endpoint detection (e.g., contact etch), the signal is at the boundary of the detection limit because of the weak signal intensities of reactants and products. Furthermore, the various materials covering the wafer, such as photoresist, dielectric materials, and metals, complicate the analysis of OES signals. In this study, full spectra of optical emission signals were collected and the data were analyzed by a data-mining approach, modified K-means cluster analysis. The K-means cluster analysis is modified to handle about a thousand wavelength variables from OES. This technique can improve the sensitivity of EPD for small-area oxide layer etching processes with about 1.0% oxide area. This technique is expected to be applied to various plasma monitoring applications, including fault detection as well as EPD. Keywords: Plasma Etch, EPD, K-means Cluster Analysis.

  3. [Pyrolysis-gas chromatographic fingerprints with hierarchical cluster analysis for Dendrobium candidum Wall. ex Lindl].

    PubMed

    Wang, Lili; Wang, Cong; Pan, Zaifa; Sun, Fa

    2008-09-01

    The pyrogram fingerprints of Dendrobium candidum Wall. ex Lindl. from different regions were studied by pyrolysis-gas chromatography/mass spectrometry (Py-GC/MS) and compared using hierarchical cluster analysis. The effect of pyrolysis temperature on the fingerprint was examined by evolved gas analysis, and 450 °C was selected as the optimized pyrolysis temperature. A 0.4 mg sample of raw drug powder was pyrolysed in a vertical microfurnace pyrolyzer, and the products were directly introduced into a gas chromatograph equipped with a flame ionization detector (FID) and a fused-silica capillary column (30 m × 0.25 mm × 0.25 μm). The pyrogram fingerprints of 10 samples from different regions showed high similarity and good reproducibility, with relative standard deviations (RSDs) of the retention times below 0.33% and RSDs of the relative peak areas below 4.8%. Each sample was therefore characterized by the areas of 31 peaks in its pyrogram, and these peaks were used for hierarchical cluster analysis. Samples from different regions were then discriminated by hierarchical cluster analysis of the resulting 10 × 31 data matrix. The results show that this is a simple, rapid and accurate method suitable for the quality control of traditional Chinese medicines.

  4. Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

    NASA Astrophysics Data System (ADS)

    Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

    2012-12-01

    The data deluge is making traditional analysis workflows obsolete for many researchers. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive Python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between HPC-style parallel workflows and the opportunistic, curiosity-driven analysis usually carried out using domain-specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/SciPy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores to provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system

  5. Countries population determination to test rice crisis indicator at national level using k-means cluster analysis

    NASA Astrophysics Data System (ADS)

    Hidayat, Y.; Purwandari, T.; Sukono; Ariska, Y. D.

    2017-01-01

    This study aimed to identify the population of countries that are similar to Indonesia with respect to three characteristics: democratic atmosphere, rice consumption, and purchasing power for rice. The result serves as reference material for research testing the strength and predictability of the rice crisis indicator Unprecedented Restlessness (UR). Countries similar to Indonesia were identified using multivariate analysis, specifically non-hierarchical k-means cluster analysis, with 38 countries as the data population. The analysis was repeated until it yielded a number of clusters that both discriminated on the three characteristics and showed high similarity within clusters. The results show that six clusters adequately capture the discriminating power of the characteristics. To serve the purpose of the study, however, only one cluster was selected, according to the following criteria for a population of countries similar to Indonesia: the cluster contains Indonesia, it includes countries that did and countries that did not experience a rice crisis in 2008, and it has the most members among the candidate clusters. These criteria are met by cluster 2, which consists of 22 countries, namely Indonesia, Brazil, Costa Rica, Djibouti, Dominican Republic, Ecuador, Fiji, Guinea-Bissau, Haiti, India, Jamaica, Japan, South Korea, Madagascar, Malaysia, Mali, Nicaragua, Panama, Peru, Senegal, Sierra Leone and Suriname.

  6. Diagrammatic analysis of correlations in polymer fluids: Cluster diagrams via Edwards’ field theory

    NASA Astrophysics Data System (ADS)

    Morse, David C.

    2006-10-01

    Edwards' functional integral approach to the statistical mechanics of polymer liquids is amenable to a diagrammatic analysis in which free energies and correlation functions are expanded as infinite sums of Feynman diagrams. This analysis is shown to lead naturally to a perturbative cluster expansion that is closely related to the Mayer cluster expansion developed for molecular liquids by Chandler and co-workers. Expansion of the functional integral representation of the grand-canonical partition function yields a perturbation theory in which all quantities of interest are expressed as functionals of a monomer-monomer pair potential, as functionals of intramolecular correlation functions of non-interacting molecules, and as functions of molecular activities. In different variants of the theory, the pair potential may be either a bare or a screened potential. A series of topological reductions yields a renormalized diagrammatic expansion in which collective correlation functions are instead expressed diagrammatically as functionals of the true single-molecule correlation functions in the interacting fluid, and as functions of molecular number density. Similar renormalized expansions are also obtained for a collective Ornstein-Zernike direct correlation function, and for intramolecular correlation functions. A concise discussion is given of the corresponding Mayer cluster expansion, and of the relationship between the Mayer and perturbative cluster expansions for liquids of flexible molecules. The application of the perturbative cluster expansion to coarse-grained models of dense multi-component polymer liquids is discussed, and a justification is given for the use of a loop expansion. As an example, the formalism is used to derive a new expression for the wave-number dependent direct correlation function and recover known expressions for the intramolecular two-point correlation function to first-order in a renormalized loop expansion for coarse-grained models of

  7. Accounting for Limited Detection Efficiency and Localization Precision in Cluster Analysis in Single Molecule Localization Microscopy

    PubMed Central

    Shivanandan, Arun; Unnikrishnan, Jayakrishnan; Radenovic, Aleksandra

    2015-01-01

    Single Molecule Localization Microscopy techniques like PhotoActivated Localization Microscopy, with their sub-diffraction limit spatial resolution, have been widely used to characterize the spatial organization of membrane proteins by means of quantitative cluster analysis. However, such quantitative studies remain challenged by the techniques’ inherent sources of error, such as a limited detection efficiency of less than 60%, due to incomplete photo-conversion, and a limited localization precision in the range of 10–30 nm, which varies across the detected molecules, depending mainly on the number of photons collected from each. We provide analytical methods to estimate the effect of these errors in cluster analysis and to correct for them. These methods, based on Ripley’s L(r) − r function or the Pair Correlation Function, both widely used by the community, can facilitate potentially breakthrough results in quantitative biology by providing a more accurate and precise quantification of protein spatial organization. PMID:25794150
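
    For readers unfamiliar with the statistic named in the abstract above, the following is a minimal sketch of a naive estimate of Ripley's L(r) − r for 2D localizations. It assumes no edge correction and a known observation area, and it does not implement the paper's corrections for detection efficiency or localization precision; the function name and toy coordinates are hypothetical.

```python
import math


def ripley_l_minus_r(points, r, area):
    """Naive Ripley's L(r) - r for 2D points (no edge correction).

    K(r) = area / (n * (n - 1)) * #{ordered pairs (i, j), i != j, d_ij <= r}
    L(r) = sqrt(K(r) / pi)
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs of distinct points lying within distance r
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= r
    )
    k = area * close_pairs / (n * (n - 1))
    return math.sqrt(k / math.pi) - r


# Hypothetical localizations: one tight group inside a unit-area field of view
locs = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01)]
for radius in (0.005, 0.05):
    print(radius, ripley_l_minus_r(locs, radius, area=1.0))
```

    Values above zero at a given radius indicate more close pairs than expected under complete spatial randomness, which is the usual reading of clustering at that scale; values near zero are consistent with a random pattern.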

  8. Screening of eight Eucalypt genotypes (Eucalyptus sp.) for water deficit tolerance using multivariate cluster analysis.

    PubMed

    Cha-Um, S; Somsueb, S; Samphumphuang, T; Kirdmanee, C

    2014-06-01

    The present study evaluated eight genotypes of river red gum (Eucalyptus camaldulensis Dehnh.) and a hybrid (E. camaldulensis × E. urophylla) for mannitol-induced water deficit (WD) under photoautotrophic conditions using multivariate cluster analysis. Shoot height, plant dry weight, and chlorophyll a content in hybrid genotypes, 58H2 and 27A2, were maintained when exposed to 200 mM mannitol for 14 days. In addition, the diminution of photosynthetic abilities, i.e. maximum quantum yield of PSII, photon yield of PSII, photochemical quenching, and net photosynthetic rate, under WD was minimal in hybrid genotypes compared to that in selection clones of E. camaldulensis. Under WD conditions, there was greater accumulation of proline in all genotypes. A positive relationship was observed between physiological and morphological attributes under WD stress. Using Ward's cluster analysis, the hybrid genotypes (H4, 58H2, and 27A2) were classified as water deficit tolerant.

  9. Cluster analysis and artificial neural networks multivariate classification of onion varieties.

    PubMed

    Rodríguez Galdón, Beatriz; Peña-Méndez, Eladia; Havel, Josef; Rodríguez Rodríguez, Elena María; Díaz Romero, Carlos

    2010-11-10

    Eight cultivars of different colored onions (white, golden, and red) were evaluated using fresh bulbs cultivated and grown under the same environmental and agronomical conditions. Cluster analysis and principal component analysis, based on data for different flavonoids, total phenols, and pungency, showed that the onions were not clustered according to variety (degree of genetic similarity), whereas color was the variable with the highest influence, ranging between 50 and 70%. Artificial neural networks were applied to study the possibility of discriminating among onion varieties. Correct classification of the onions according to variety and provenance of the seeds was around 95-100%. Samples of Carrizal Alto provenance were incorrectly classified for 25% of the data.

  10. Conceptual issues in the analysis of cost data within cluster randomized trials.

    PubMed

    Flynn, Terry; Peters, Tim

    2005-04-01

    Cluster randomized controlled trials (RCTs) are increasingly used in economic evaluations of social, educational and health care interventions. Methodological research has, therefore, been spread across several disciplines, with the result that it has taken many years for guidelines on good statistical practice in the design and analysis of such trials to become easily accessible to health service researchers. These guidelines remain incomplete, however, because they do not take account of issues specific to the analysis of cost data. In particular, they fail to recognize that the calculation of confidence intervals around costs needed to inform health care priority setting raises unique methodological issues. If poorly designed trials are to be avoided in future (including those by the authors), then collaboration between triallists and health economists is required. This paper sets out a framework that should facilitate such collaboration and draws attention to problems that must be addressed quickly in the design of cluster-based economic evaluations.

  11. [Algae identification research based on fluorescence spectral imaging technology combined with cluster analysis and principal component analysis].

    PubMed

    Liang, Man; Huang, Fu-rong; He, Xue-jia; Chen, Xing-dan

    2014-08-01

    To explore rapid real-time algae detection methods, experiments were carried out in the present study using fluorescence spectral imaging technology combined with pattern recognition methods to identify different types of algae. The algae samples showed a clear fluorescence effect during detection. The fluorescence spectral imaging system was used to collect spectral images of 40 algal samples. After image denoising, binarization, and selection of the effective pixels, the spectral curve of each sample was drawn from the spectral cube, yielding spectra in the 400-720 nm wavelength range. Then, two pattern recognition methods, hierarchical cluster analysis and principal component analysis, were used to process the spectral data. The hierarchical cluster analysis results showed that, when the Euclidean distance method and the average weighted method were used to calculate the cluster distance between samples, the samples could be correctly classified at a distance level of L=2.452 or above, with an accuracy of 100%. For principal component analysis, first-order derivative, second-order derivative, multiplicative scatter correction, standard normal variate and other pretreatments were applied to the raw spectral data before the analysis; the second-order derivative pretreatment gave the most effective identification, and the eight types of algae samples were separately distributed in the principal component eigenspace. It was thus shown to be feasible to use fluorescence spectral imaging technology combined with cluster analysis and principal component analysis for algae identification. The method is easy to operate, fast and nondestructive.
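
    The hierarchical step described in the abstract above (Euclidean distances, an average-type linkage, and a cut at a chosen distance level) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses unweighted average linkage (UPGMA), which may differ from the "average weighted method" the abstract names, and the function name, toy feature vectors, and cut value are hypothetical.

```python
import math


def average_linkage_clusters(points, cut):
    """Agglomerative clustering with average linkage and Euclidean distance.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `cut`. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average of all pairwise Euclidean distances between the two clusters
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters under the linkage criterion
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical two-feature "spectra" forming two well-separated groups
spectra = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(average_linkage_clusters(spectra, cut=1.0))
```

    Recomputing every linkage at each step keeps the sketch short but costs far more than a production implementation; for real spectra, an established library routine would normally be preferred.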

  12. Seismic clusters analysis in North-Eastern Italy by the nearest-neighbor approach

    NASA Astrophysics Data System (ADS)

    Peresan, Antonella; Gentili, Stefania

    2016-04-01

    The main features of earthquake clusters in the Friuli Venezia Giulia (FVG) Region (North-Eastern Italy) are explored, with the aim of gaining new insights into local-scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters of small-to-medium magnitude events, as opposed to selected clusters analyzed in earlier studies. To characterize the features of seismicity for FVG, we take advantage of updated information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics, Centre of Seismological Research, since 1977. A preliminary reappraisal of the earthquake bulletins is carried out in order to identify possible missing events and to remove spurious records (e.g. duplicates and explosions). The area of sufficient completeness is outlined; for this purpose, different techniques are applied, including a comparative analysis with global ISC data, which are available in the region for large and moderate size earthquakes. Various techniques are considered to estimate the average parameters that characterize earthquake occurrence in the region, including the b-value and the fractal dimension of the epicenter distribution. Specifically, besides the classical Gutenberg-Richter Law, the Unified Scaling Law for Earthquakes, USLE, is applied. Using the updated and revised OGS data, a new formal method for detection of earthquake clusters, based on nearest-neighbor distances of events in the space-time-energy domain, is applied. The bimodality of the distribution of earthquake nearest-neighbor distances is used to decompose the seismic catalog into sequences of individual clusters and background seismicity. Accordingly, the method allows for a data-driven identification of main shocks (first event with the largest magnitude in the cluster), foreshocks and aftershocks. Average robust estimates of the USLE parameters (particularly, b

  13. A Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

    NASA Astrophysics Data System (ADS)

    Huang, W.; Li, S.; Xu, S.

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three dimensions (space or space-time) to four dimensions (space, time and semantics). More specifically, not only can the location and time of a person's stay be collected, but also what people "say" at a location at a given time. The characteristics of these datasets shed new light on the analysis of human mobility, and new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms have been specifically developed to handle datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the

  14. Cluster analysis of multiplex ligation-dependent probe amplification data in choroidal melanoma

    PubMed Central

    Caines, Rhydian; Eleuteri, Antonio; Kalirai, Helen; Fisher, Anthony C.; Heimann, Heinrich; Damato, Bertil E.; Coupland, Sarah E.

    2015-01-01

    Purpose To determine underlying correlations in multiplex ligation-dependent probe amplification (MLPA) data and their significance regarding survival following treatment of choroidal melanoma (CM). Methods MLPA data were available for 31 loci across four chromosomes (1p, 3, 6, and 8) in tumor material obtained from 602 patients with CM treated at the Liverpool Ocular Oncology Center (LOOC) between 1993 and 2012. Data representing chromosomes 3 and 8q were analyzed in depth since their association with CM patient survival is well-known. Unsupervised k-means cluster analysis was performed to detect latent structure in the data set. Principal component analysis (PCA) was also performed to determine the intrinsic dimensionality of the data. Survival analyses of the identified clusters were performed using Kaplan–Meier (KM) and log-rank statistical tests. Correlation with largest basal tumor diameter (LTD) was investigated. Results Chromosome 3: A two-cluster (bimodal) solution was found in chromosome 3, characterized by centroids at unilaterally normal probe values and unilateral deletion. There was a large, significant difference in the survival characteristics of the two clusters (log-rank, p<0.001; 5-year survival: 80% versus 40%). Both clusters had a broad distribution in LTD, although larger tumors were characteristically in the poorer outcome group (Mann–Whitney, p<0.001). Threshold values of 0.85 for deletion and 1.15 for gain optimized the classification of the clusters. PCA showed that the first principal component (PC1) contained more than 80% of the data set variance and all of the bimodality, with uniform coefficients (0.28±0.03). Chromosome 8q: No clusters were found in chromosome 8q. Using a conventional threshold-based definition of 8q gain, and in conjunction with the chromosome 3 clusters, three prognostic groups were identified: chromosomes 3 and 8q both normal, either chromosome 3 or 8q abnormal, and both chromosomes 3 and 8q abnormal. KM
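
    The threshold step reported in this abstract can be illustrated with a minimal Python sketch that labels a single MLPA probe value using the stated cut-offs of 0.85 (deletion) and 1.15 (gain). The function name and the handling of values that fall exactly on a threshold are assumptions made for illustration and are not taken from the study.

```python
DELETION_THRESHOLD = 0.85  # cut-off for deletion reported in the abstract
GAIN_THRESHOLD = 1.15      # cut-off for gain reported in the abstract


def classify_probe(value: float) -> str:
    """Label one normalized MLPA probe value (hypothetical helper)."""
    # Assumption: a value exactly equal to a threshold is treated as normal.
    if value < DELETION_THRESHOLD:
        return "deletion"
    if value > GAIN_THRESHOLD:
        return "gain"
    return "normal"
```

    For example, under these assumptions a probe value of 0.60 would be labeled a deletion and a value of 1.00 would be labeled normal.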

  15. Molecular analysis of the hrp gene cluster in Xanthomonas oryzae pathovar oryzae KACC10859.

    PubMed

    Cho, Hee-Jung; Park, Young-Jin; Noh, Tae-Hwan; Kim, Yeong-Tae; Kim, Jeong-Gu; Song, Eun-Sung; Lee, Dong-Hee; Lee, Byoung-Moo

    2008-06-01

    Xanthomonas oryzae pathovar oryzae is the causal agent of rice bacterial blight. The plant pathogenic bacterium X. oryzae pv. oryzae expresses a type III secretion system that is necessary for both the pathogenicity in susceptible hosts and the induction of the hypersensitive response in resistant plants. This specialized protein transport system is encoded by a 32.18 kb hrp (hypersensitive response and pathogenicity) gene cluster. The hrp gene cluster is composed of nine hrp, nine hrc (hrp conserved) and eight hpa (hrp-associated) genes and is controlled by HrpG and HrpX, which are known as regulators of the hrp gene cluster. Before mutational analysis of these hrp genes, the transcriptional linkages of the core region of the hrp gene cluster from hpaB to hrcC of X. oryzae pv. oryzae KACC10859 were determined and the non-polarity of EZTn5 insertional mutagenesis was demonstrated by reverse transcription polymerase chain reaction. Pathogenicity assays of these non-polar hrp mutants were carried out on the susceptible rice cultivar, Milyang-23. According to the results of these assays, all hrp-hrc, except hrpF, and hpaB mutants lost their pathogenicity, which indicates that most hrp-hrc genes encode essential pathogenicity factors. On the other hand, most hpa mutants showed decreased virulence in a different pattern, i.e., hpa genes are not essential but are important for pathogenicity.

  16. Cluster analysis identifies aminoacid compositional features that indicate Toxoplasma gondii adhesin proteins

    PubMed Central

    Arenas, Ailan F; Salcedo, Gladys E; Moncada, Diego M; Erazo, Diego A; Osorio, Juan F; Gomez-Marin, Jorge E

    2012-01-01

    Toxoplasma gondii invades host cells using a multi-step process that depends on the regulated secretion of adhesins. To identify key primary sequence features of adhesins in this parasite, we analyze the relative frequency of individual amino acids, their dipeptide frequencies, and the polarity, polarizability and Van der Waals volume of the individual amino acids by using cluster analysis. This method identified cysteine as a key amino acid in the Toxoplasma adhesin group. The best vector algorithm of non-concatenated features was for 2 attributes: the single amino acid relative frequency and the dipeptide frequency. Polarity, polarizability and Van der Waals volume were not good classificatory attributes. Single amino acid attributes unambiguously clustered 67 apicomplexan hypothetical adhesins. This algorithm was also useful for clustering hypothetical Toxoplasma target host receptors. All of the cluster performances had over 70% sensitivity and 80% specificity. Compositional amino acid data can be useful for improving machine learning-based prediction software when homology and structural data are not sufficient. PMID:23144551

  17. Sequencing and transcriptional analysis of the biosynthesis gene cluster of putrescine-producing Lactococcus lactis.

    PubMed

    Ladero, Victor; Rattray, Fergal P; Mayo, Baltasar; Martín, María Cruz; Fernández, María; Alvarez, Miguel A

    2011-09-01

    Lactococcus lactis is a prokaryotic microorganism with great importance as a culture starter and has become the model species among the lactic acid bacteria. The long and safe history of use of L. lactis in dairy fermentations has resulted in the classification of this species as GRAS (Generally Regarded As Safe) or QPS (Qualified Presumption of Safety). However, our group has identified several strains of L. lactis subsp. lactis and L. lactis subsp. cremoris that are able to produce putrescine from agmatine via the agmatine deiminase (AGDI) pathway. Putrescine is a biogenic amine that confers undesirable flavor characteristics and may even have toxic effects. The AGDI cluster of L. lactis is composed of a putative regulatory gene, aguR, followed by the genes (aguB, aguD, aguA, and aguC) encoding the catabolic enzymes. These genes are transcribed as an operon that is induced in the presence of agmatine. In some strains, an insertion (IS) element interrupts the transcription of the cluster, which results in a non-putrescine-producing phenotype. Based on this knowledge, a PCR-based test was developed in order to differentiate nonproducing L. lactis strains from those with a functional AGDI cluster. The analysis of the AGDI cluster and its flanking regions revealed that the capacity to produce putrescine via the AGDI pathway could be a specific characteristic that was lost during the adaptation to the milk environment by a process of reductive genome evolution.

  18. Molecular analysis of the cercosporin biosynthetic gene cluster in Cercospora nicotianae.

    PubMed

    Chen, Huiqin; Lee, Miin-Huey; Daub, Margret E; Chung, Kuang-Ren

    2007-05-01

    We describe a core gene cluster, comprised of eight genes (designated CTB1-8), and associated with cercosporin toxin production in Cercospora nicotianae. Sequence analysis identified 10 putative open reading frames (ORFs) flanking the previously characterized CTB1 and CTB3 genes that encode, respectively, the polyketide synthase and a dual methyltransferase/monooxygenase required for cercosporin production. Expression of eight of the genes was co-ordinately induced under cercosporin-producing conditions and was regulated by the Zn(II)Cys(6) transcriptional activator, CTB8. Expression of the genes, affected by nitrogen and carbon sources and pH, was also controlled by another transcription activator, CRG1, previously shown to regulate cercosporin production and resistance. Disruption of the CTB2 gene encoding a methyltransferase or the CTB8 gene yielded mutants that were completely defective in cercosporin production and showed inhibited expression of the other CTB cluster genes. Similar 'feedback' transcriptional inhibition was observed when the CTB1 or CTB3 gene, but not the CTB4 gene, was inactivated. Expression of four ORFs located on the two distal ends of the cluster did not correlate with cercosporin biosynthesis and did not show regulation by CTB8, suggesting that the biosynthetic cluster was limited to CTB1-8. A biosynthetic pathway and a regulatory network leading to cercosporin formation are proposed.

  19. Joint motion pattern classification by cluster analysis of kinematic, demographic, and subjective variables.

    PubMed

    Hwang, Jaejin; Shin, Hyunjung; Jung, Myung-Chul

    2013-07-01

    The purpose of this study is to identify joint motion patterns by classifying the full range of motion (ROM) into several sections. Forty participants were stratified by age and gender and they performed 18 full-swing motions at a self-selected speed. Joint angle, angular velocity, angular acceleration, and subjective discomfort rating were collected for each motion. K-means cluster analyses were used to classify joint motion patterns and ROM sections. The results showed that two or three clusters were mainly determined by the kinematic variables of angular velocity and acceleration. The motions of three clusters showed that the ROM sections of low and moderate velocity with moderate and high accelerations occurred in the initial (negative) and terminal (positive) phases, respectively, whereas those of high velocity with low acceleration were shown in the mid (neutral) phase. The motions of two clusters revealed that while the patterns of high velocity and high acceleration were found on the positive side of the ROM, those of low velocity and low acceleration were on the negative and neutral sides. The ROM sections close to both ends of the ROM may have a larger physical load than the others. This study provides information that could be useful for developing postural analysis tools for dynamic work.
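
    As a rough illustration of the K-means step mentioned in this abstract, the following Python sketch runs plain Lloyd's algorithm on pairs of (angular velocity, angular acceleration). The sample values, the function name, and the choice to seed the centroids with the first k points are assumptions for illustration only; the abstract does not describe the software or settings the authors used, and a real analysis would normally standardize the variables first.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm (hypothetical helper, not the authors' code)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up (angular velocity, angular acceleration) samples for illustration.
samples = [(10.0, 80.0), (90.0, 15.0), (12.0, 85.0),
           (95.0, 10.0), (11.0, 78.0), (88.0, 12.0)]
labels, centroids = kmeans(samples, k=2)
```

    With these made-up samples the low-velocity, high-acceleration points end up in one cluster and the high-velocity, low-acceleration points in the other.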

  20. A Spatial Cluster Analysis of Tractor Overturns in Kentucky from 1960 to 2002

    PubMed Central

    Saman, Daniel M.; Cole, Henry P.; Odoi, Agricola; Myers, Melvin L.; Carey, Daniel I.; Westneat, Susan C.

    2012-01-01

    Background Agricultural tractor overturns without rollover protective structures are the leading cause of farm fatalities in the United States. To our knowledge, no studies have incorporated the spatial scan statistic in identifying high-risk areas for tractor overturns. The aim of this study was to determine whether tractor overturns cluster in certain parts of Kentucky and identify factors associated with tractor overturns. Methods A spatial statistical analysis using Kulldorff's spatial scan statistic was performed to identify county clusters at greatest risk for tractor overturns. A regression analysis was then performed to identify factors associated with tractor overturns. Results The spatial analysis revealed a cluster of higher than expected tractor overturns in four counties in northern Kentucky (RR = 2.55) and 10 counties in eastern Kentucky (RR = 1.97). Higher rates of tractor overturns were associated with steeper average percent slope of pasture land by county (p = 0.0002) and a greater percent of total tractors with less than 40 horsepower by county (p<0.0001). Conclusions This study reveals that geographic hotspots of tractor overturns exist in Kentucky and identifies factors associated with overturns. This study provides policymakers a guide to targeted county-level interventions (e.g., roll-over protective structures promotion interventions) with the intention of reducing tractor overturns in the highest risk counties in Kentucky. PMID:22291980

  1. The Heterogeneity of Early Parkinson’s Disease: A Cluster Analysis on Newly Diagnosed Untreated Patients

    PubMed Central

    Amboni, Marianna; Picillo, Marina; Moccia, Marcello; Longo, Katia; Santangelo, Gabriella; De Rosa, Anna; Allocca, Roberto; Giordano, Flavio; Orefice, Giuseppe; De Michele, Giuseppe; Santoro, Lucio; Pellecchia, Maria Teresa; Barone, Paolo

    2013-01-01

    Background The variability in the clinical phenotype of Parkinson’s disease seems to suggest the existence of several subtypes of the disease. To test this hypothesis we performed a cluster analysis using data assessing both motor and non-motor symptoms in a large cohort of newly diagnosed untreated PD patients. Methods We collected data on demographic, motor, and the whole complex of non-motor symptoms from 100 consecutive newly diagnosed untreated outpatients. Statistical cluster analysis allowed the identification of different subgroups, which have been subsequently explored. Results The data-driven approach identified four distinct groups of patients, which we have labeled: 1) Benign Pure Motor; 2) Benign mixed Motor-Non-Motor; 3) Non-Motor Dominant; and 4) Motor Dominant. Conclusion Our results confirmed the existence of different subgroups of early PD patients. Cluster analysis revealed the presence of distinct subtypes of patients profiled according to the relevance of both motor and non-motor symptoms. Identification of such subtypes may have important implications for generating pathogenetic hypotheses and therapeutic strategies. PMID:23936396

  2. A spatial cluster analysis of tractor overturns in Kentucky from 1960 to 2002

    USGS Publications Warehouse

    Saman, D.M.; Cole, H.P.; Odoi, A.; Myers, M.L.; Carey, D.I.; Westneat, S.C.

    2012-01-01

    Background: Agricultural tractor overturns without rollover protective structures are the leading cause of farm fatalities in the United States. To our knowledge, no studies have incorporated the spatial scan statistic in identifying high-risk areas for tractor overturns. The aim of this study was to determine whether tractor overturns cluster in certain parts of Kentucky and identify factors associated with tractor overturns. Methods: A spatial statistical analysis using Kulldorff's spatial scan statistic was performed to identify county clusters at greatest risk for tractor overturns. A regression analysis was then performed to identify factors associated with tractor overturns. Results: The spatial analysis revealed a cluster of higher than expected tractor overturns in four counties in northern Kentucky (RR = 2.55) and 10 counties in eastern Kentucky (RR = 1.97). Higher rates of tractor overturns were associated with steeper average percent slope of pasture land by county (p = 0.0002) and a greater percent of total tractors with less than 40 horsepower by county (p<0.0001). Conclusions: This study reveals that geographic hotspots of tractor overturns exist in Kentucky and identifies factors associated with overturns. This study provides policymakers a guide to targeted county-level interventions (e.g., roll-over protective structures promotion interventions) with the intention of reducing tractor overturns in the highest risk counties in Kentucky. © 2012 Saman et al.

  3. Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC.

    PubMed

    Gao, Hongyun; Wang, Lishan; Cui, Shitao; Wang, Mingsong

    2012-04-01

    Esophageal squamous cell carcinoma (ESCC) is one of the most malignant gastrointestinal cancers and occurs at a high frequency in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, supporting the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective both for identifying differentially expressed genes and for reflecting on their functions in ESCC.

  4. Classification and identification of metal-accumulating plant species by cluster analysis.

    PubMed

    Yang, Wenhao; Li, He; Zhang, Taoxiang; Sen, Lin; Ni, Wuzhong

    2014-09-01

    Identification and classification of metal-accumulating plant species is essential for phytoextraction. Cluster analysis is used for classifying individuals based on measured characteristics. In this study, classification of plant species for metal accumulation was conducted using cluster analysis based on a practical survey. Forty plant samples belonging to 21 species were collected from an ancient silver-mining site. Plants were graded into five groups: hyperaccumulator, potential hyperaccumulator, accumulator, potential accumulator, and normal accumulating plant. For Cd accumulation, the ancient silver-mining ecotype of Sedum alfredii was treated as a Cd hyperaccumulator, and the others were normal Cd-accumulating plants. For Zn accumulation, S. alfredii was considered as a potential Zn hyperaccumulator, Conyza canadensis and Artemisia lavandulaefolia were Zn accumulators, and the others were normal Zn-accumulating plants. For Pb accumulation, S. alfredii and Elatostema lineolatum were potential Pb hyperaccumulators, Rubus hunanensis, Ajuga decumbens, and Erigeron annuus were Pb accumulators, C. canadensis and A. lavandulaefolia were potential Pb accumulators, and the others were normal Pb-accumulating plants. Plant species with the potential for phytoextraction were identified, such as S. alfredii for Cd and Zn, C. canadensis and A. lavandulaefolia for Zn and Pb, and E. lineolatum, R. hunanensis, A. decumbens, and E. annuus for Pb. Cluster analysis is effective in the classification of plant species for metal accumulation and identification of potential species for phytoextraction.

  5. Classification of microvascular patterns via cluster analysis reveals their prognostic significance in glioblastoma.

    PubMed

    Chen, Long; Lin, Zhi-Xiong; Lin, Guo-Shi; Zhou, Chang-Fu; Chen, Yu-Peng; Wang, Xing-Fu; Zheng, Zong-Qing

    2015-01-01

    Few studies have focused on microvascular patterns (MVPs) in human glioblastoma and their prognostic impact. We evaluated MVPs of 78 glioblastomas by CD34/periodic acid-Schiff dual staining and by cluster analysis of the percentage of microvascular area for distinct microvascular formations. The distribution of 5 types of basic microvascular formations, that is, microvascular sprouting (MS), vascular cluster (VC), vascular garland (VG), glomeruloid vascular proliferation (GVP), and vasculogenic mimicry (VM), was variable. Accordingly, cluster analysis classified MVPs into 2 types: type I MVP displayed prominent MSs and VCs, whereas type II MVP had numerous VGs, GVPs, and VMs. By analyzing the proportion of microvascular area for each type of formation, we determined that glioblastomas with few MSs and VCs had many GVPs and VMs, and vice versa. VG seemed to be a transitional type of formation. In type I MVP, expression of Ki-67 and p53, but not MGMT, was significantly higher than in type II MVP (P < .05). Survival analysis showed that the type of MVP was an independent prognostic factor of progression-free survival (PFS) and overall survival (OS) (both P < .001). Type II MVP had a more negative influence on PFS and OS than did type I MVP. We conclude that the heterogeneous MVPs in glioblastoma can be categorized properly by certain histopathologic and statistical analyses and may influence clinical outcome.

  6. Finding Groups Using Model-Based Cluster Analysis: Heterogeneous Emotional Self-Regulatory Processes and Heavy Alcohol Use Risk

    ERIC Educational Resources Information Center

    Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2008-01-01

    Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…

  7. Clustering Educational Digital Library Usage Data: A Comparison of Latent Class Analysis and K-Means Algorithms

    ERIC Educational Resources Information Center

    Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei

    2013-01-01

    This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…

  8. A framework for graph-based synthesis, analysis, and visualization of HPC cluster job data.

    SciTech Connect

    Mayo, Jackson R.; Kegelmeyer, W. Philip, Jr.; Wong, Matthew H.; Pebay, Philippe Pierre; Gentile, Ann C.; Thompson, David C.; Roe, Diana C.; De Sapio, Vincent; Brandt, James M.

    2010-08-01

    The monitoring and system analysis of high performance computing (HPC) clusters is of increasing importance to the HPC community. Analysis of HPC job data can be used to characterize system usage and diagnose and examine failure modes and their effects. This analysis is not straightforward, however, due to the complex relationships that exist between jobs. These relationships are based on a number of factors, including shared compute nodes between jobs, proximity of jobs in time, etc. Graph-based techniques represent an approach that is particularly well suited to this problem, and provide an effective technique for discovering important relationships in job queuing and execution data. The efficacy of these techniques is rooted in the use of a semantic graph as a knowledge representation tool. In a semantic graph, job data, represented in a combination of numerical and textual forms, can be flexibly processed into edges, with corresponding weights, expressing relationships between jobs, nodes, users, and other relevant entities. This graph-based representation permits formal manipulation by a number of analysis algorithms. This report presents a methodology and software implementation that leverages semantic graph-based techniques for the system-level monitoring and analysis of HPC clusters based on job queuing and execution data. Ontology development and graph synthesis are discussed with respect to the domain of HPC job data. The framework developed automates the synthesis of graphs from a database of job information. It also provides a front end, enabling visualization of the synthesized graphs. Additionally, an analysis engine is incorporated that provides performance analysis, graph-based clustering, and failure prediction capabilities for HPC systems.

  9. Analysis of the project synthesis goal cluster orientation and inquiry emphasis of elementary science textbooks

    NASA Astrophysics Data System (ADS)

    Staver, John R.; Bay, Mary

    The purpose of this descriptive study was to examine selected units of commonly used elementary science texts, using the Project Synthesis goal clusters as a framework for part of the examination. An inquiry classification scheme was used for the remaining segment. Four questions were answered: (1) To what extent do elementary science textbooks focus on each Project Synthesis goal cluster? (2) In which part of the text is such information found? (3) To what extent are the activities and experiments merely verifications of information already introduced in the text? (4) If inquiry is present in an activity, then what is the level of such inquiry? Eleven science textbook series, which comprise approximately 90 percent of the national market, were selected for analysis. Two units, one primary (K-3) and one intermediate (4-6), were selected for analysis by first identifying units common to most series, then randomly selecting one primary and one intermediate unit for analysis. Each randomly selected unit was carefully read, using the sentence as the unit of analysis. Each declarative and interrogative sentence in the body of the text was classified as: (1) academic; (2) personal; (3) career; or (4) societal in its focus. Each illustration, except those used in evaluation items, was similarly classified. Each activity/experiment and each miscellaneous sentence in end-of-chapter segments labelled review, summary, evaluation, etc., were similarly classified. Finally, each activity/experiment, as a whole, was categorized according to a four-category inquiry scheme (confirmation, structured inquiry, guided inquiry, open inquiry). In general, results of the analysis are: (1) most text prose focuses on academic science; (2) most remaining text prose focuses on the personal goal cluster; (3) the career and societal goal clusters receive only minor attention; (4) text illustrations exhibit a pattern similar to text prose; (5) text activities/experiments are academic in orientation

  10. Understanding the Support Needs of People with Intellectual and Related Developmental Disabilities through Cluster Analysis and Factor Analysis of Statewide Data

    ERIC Educational Resources Information Center

    Viriyangkura, Yuwadee

    2014-01-01

    Through a secondary analysis of statewide data from Colorado, people with intellectual and related developmental disabilities (ID/DD) were classified into five clusters based on their support needs characteristics using cluster analysis techniques. Prior latent factor models of support needs in the field of ID/DD were examined to investigate the…

  11. Cluster Mass Calibration at High Redshift: HST Weak Lensing Analysis of 13 Distant Galaxy Clusters from the South Pole Telescope Sunyaev-Zel'dovich Survey

    SciTech Connect

    Schrabback, T.; et al.

    2016-11-11

    We present an HST/ACS weak gravitational lensing analysis of 13 massive high-redshift (z_median=0.88) galaxy clusters discovered in the South Pole Telescope (SPT) Sunyaev-Zel'dovich Survey. This study is part of a larger campaign that aims to robustly calibrate mass-observable scaling relations over a wide range in redshift to enable improved cosmological constraints from the SPT cluster sample. We introduce new strategies to ensure that systematics in the lensing analysis do not degrade constraints on cluster scaling relations significantly. First, we efficiently remove cluster members from the source sample by selecting very blue galaxies in V-I colour. Our estimate of the source redshift distribution is based on CANDELS data, where we carefully mimic the source selection criteria of the cluster fields. We apply a statistical correction for systematic photometric redshift errors as derived from Hubble Ultra Deep Field data and verified through spatial cross-correlations. We account for the impact of lensing magnification on the source redshift distribution, finding that this is particularly relevant for shallower surveys. Finally, we account for biases in the mass modelling caused by miscentring and uncertainties in the mass-concentration relation using simulations. In combination with temperature estimates from Chandra we constrain the normalisation of the mass-temperature scaling relation ln(E(z) M_500c/10^14 M_sun)=A+1.5 ln(kT/7.2keV) to A=1.81^{+0.24}_{-0.14}(stat.) +/- 0.09(sys.), consistent with self-similar redshift evolution when compared to lower redshift samples. Additionally, the lensing data constrain the average concentration of the clusters to c_200c=5.6^{+3.7}_{-1.8}.
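
    The mass-temperature scaling relation quoted in this abstract can be inverted to estimate a mass from a temperature, as in the minimal Python sketch below. The normalisation A = 1.81, the slope 1.5 and the 7.2 keV pivot come from the abstract; the function names, the flat LambdaCDM form of E(z) and the matter density of 0.3 are assumptions for illustration, and the sketch ignores the quoted statistical and systematic uncertainties.

```python
import math


def e_of_z(z, omega_m=0.3):
    """Dimensionless Hubble parameter E(z) for flat LambdaCDM.

    Assumption: omega_m = 0.3 is an illustrative value, not from the abstract.
    """
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def m500c_from_temperature(kt_kev, z, a=1.81, slope=1.5, pivot_kev=7.2):
    """Invert ln(E(z) M_500c / 1e14 M_sun) = A + slope * ln(kT / pivot).

    Returns M_500c in solar masses (hypothetical helper).
    """
    ln_scaled_mass = a + slope * math.log(kt_kev / pivot_kev)
    return 1e14 * math.exp(ln_scaled_mass) / e_of_z(z)


# A cluster at the pivot temperature and the sample's median redshift.
mass_at_pivot = m500c_from_temperature(7.2, 0.88)
```

    At the pivot temperature the logarithmic term vanishes, so the result reduces to 10^14 M_sun times exp(A) divided by E(z).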

  12. Multi-Resolution Clustering Analysis and Visualization of Around One Million Synthetic Earthquake Events

    NASA Astrophysics Data System (ADS)

    Kaneko, J. Y.; Yuen, D. A.; Dzwinel, W.; Boryszko, K.; Ben-Zion, Y.; Sevre, E. O.

    2002-12-01

    The study of seismic patterns with synthetic data is important for analyzing the seismic hazard of faults because one can precisely control the spatial and temporal domains. Using modern clustering analysis from statistics and a recently introduced visualization software, AMIRA, we have examined the multi-resolution nature of a total assemblage involving 922,672 earthquake events in 4 numerically simulated models, which have different constitutive parameters, with 2 disparately different time intervals in a 3D spatial domain. The evolution of stress and slip on the fault plane was simulated with the 3D elastic dislocation theory for a configuration representing the central San Andreas Fault (Ben-Zion, J. Geophys. Res., 101, 5677-5706, 1996). The 4 different models represent various levels of fault zone disorder and have the following brittle properties and names: uniform properties (model U), a Parkfield type Asperity (A), fractal properties (F), and multi-size-heterogeneities (model M). We employed the MNN (mutual nearest neighbor) clustering method and developed a C-program that calculates simultaneously a number of parameters related to the location of the earthquakes and their magnitude values. Visualization was then used to look at the geometrical locations of the hypocenters and the evolution of seismic patterns. We wrote an AmiraScript that allows us to pass the parameters in an interactive format. With data sets consisting of 150 year time intervals, we have unveiled the distinctly multi-resolutional nature in the spatial-temporal pattern of small and large earthquake correlations shown previously by Eneva and Ben-Zion (J. Geophys. Res., 102, 24513-24528, 1997). In order to search for clearer possible stationary patterns and substructures within the clusters, we have also carried out the same analysis for corresponding data sets with time extending to several thousand years. The larger data sets were studied with finer and finer time intervals and multi

  13. Analysis of Radiation Damage in Light Water Reactors: Comparison of Cluster Analysis Methods for the Analysis of Atom Probe Data.

    PubMed

    Hyde, Jonathan M; DaCosta, Gérald; Hatzoglou, Constantinos; Weekes, Hannah; Radiguet, Bertrand; Styman, Paul D; Vurpillot, Francois; Pareige, Cristelle; Etienne, Auriane; Bonny, Giovanni; Castin, Nicolas; Malerba, Lorenzo; Pareige, Philippe

    2017-01-30

    Irradiation of reactor pressure vessel (RPV) steels causes the formation of nanoscale microstructural features (termed radiation damage), which affect the mechanical properties of the vessel. A key tool for characterizing these nanoscale features is atom probe tomography (APT), due to its high spatial resolution and the ability to identify different chemical species in three dimensions. Microstructural observations using APT can underpin development of a mechanistic understanding of defect formation. However, with atom probe analyses there are currently multiple methods for analyzing the data. This can result in inconsistencies between results obtained from different researchers and unnecessary scatter when combining data from multiple sources. This makes interpretation of results more complex and calibration of radiation damage models challenging. In this work simulations of a range of different microstructures are used to directly compare different cluster analysis algorithms and identify their strengths and weaknesses.

  14. Strong Lensing Analysis of the Galaxy Cluster MACS J1319.9+7003 and the Discovery of a Shell Galaxy

    NASA Astrophysics Data System (ADS)

    Zitrin, Adi

    2017-01-01

    We present a strong-lensing (SL) analysis of the galaxy cluster MACS J1319.9+7003 (z = 0.33, also known as Abell 1722), as part of our ongoing effort to analyze massive clusters with archival Hubble Space Telescope (HST) imaging. We spectroscopically measured with Keck/Multi-Object Spectrometer For Infra-Red Exploration (MOSFIRE) two galaxies multiply imaged by the cluster. Our analysis reveals a modest lens, with an effective Einstein radius of $\theta_e(z=2) = 12'' \pm 1''$, enclosing $(2.1 \pm 0.3) \times 10^{13}\,M_\odot$. We briefly discuss the SL properties of the cluster, using two different modeling techniques (see the text for details), and make the mass models publicly available (ftp://wise-ftp.tau.ac.il/pub/adiz/MACS1319/). Independently, we identified a noteworthy, young shell galaxy (SG) system forming around two likely interacting cluster members, 20″ north of the brightest cluster galaxy. SGs are rare in galaxy clusters, and indeed, a simple estimate reveals that they are only expected in roughly one in several dozen, to several hundred, massive galaxy clusters (the estimate can easily change by an order of magnitude within a reasonable range of characteristic values relevant for the calculation). Taking advantage of our lens model best-fit, mass-to-light scaling relation for cluster members, we infer that the total mass of the SG system is $\sim 1.3 \times 10^{11}\,M_\odot$, with a host-to-companion mass ratio of about 10:1. Despite being rare in high density environments, the SG constitutes an example of how stars of cluster galaxies are efficiently redistributed to the intra-cluster medium. Dedicated numerical simulations for the observed shell configuration, perhaps aided by the mass model, might cast interesting light on the interaction history and properties of the two galaxies. An archival HST search in galaxy cluster images can reveal more such systems.

  15. Clustering-initiated factor analysis application for tissue classification in dynamic brain positron emission tomography.

    PubMed

    Boutchko, Rostyslav; Mitra, Debasis; Baker, Suzanne L; Jagust, William J; Gullberg, Grant T

    2015-07-01

    The goal is to quantify the fraction of tissues that exhibit specific tracer binding in dynamic brain positron emission tomography (PET). It is achieved using a new method of dynamic image processing: clustering-initiated factor analysis (CIFA). Standard processing of such data relies on region of interest analysis and approximate models of the tracer kinetics and of tissue properties, which can degrade accuracy and reproducibility of the analysis. Clustering-initiated factor analysis allows accurate determination of the time-activity curves and spatial distributions for tissues that exhibit significant radiotracer concentration at any stage of the emission scan, including the arterial input function. We used this approach in the analysis of PET images obtained using ¹¹C-Pittsburgh Compound B in which specific binding reflects the presence of β-amyloid. The fraction of the specific binding tissues determined using our approach correlated with that computed using the Logan graphical analysis. We believe that CIFA can be an accurate and convenient tool for measuring specific binding tissue concentration and for analyzing tracer kinetics from dynamic images for a variety of PET tracers. As an illustration, we show that four-factor CIFA allows extraction of two blood curves and the corresponding distributions of arterial and venous blood from PET images even with a coarse temporal resolution.

  16. Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations and numerical simulations

    NASA Technical Reports Server (NTRS)

    Darouzet, Fabien; DeKeyser, Johan; Decreau, Pierrette; Gallagher, Dennis; Pierrard, Viviane; Lemaire, Joseph; Dandouras, Iannis; Matsui, Hiroshi; Dunlop, Malcolm; Andre, Mats

    2005-01-01

    Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles can be derived from the plasma frequency and/or from the spacecraft potential (note that the electron spectrometer is usually not operating inside the plasmasphere); ion velocity is also measured onboard these satellites (but ion density is not reliable because of instrumental limitations). The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 RE every 10 minutes; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations for 3 plume events and compare CLUSTER in-situ data (panel A) with global images of the plasmasphere obtained from IMAGE (panel B), and with numerical simulations for the formation of plumes based on a model that includes the interchange instability mechanism (panel C). In particular, we study the geometry and the orientation of plasmaspheric plumes by using a four-point analysis method, the spatial gradient. We also compare several aspects of their motion as determined by different methods: (i) inner and outer plume boundary velocity calculated from time delays of this boundary observed by the wave experiment WHISPER on the four spacecraft, (ii) ion velocity derived from the ion spectrometer CIS onboard CLUSTER, (iii) drift velocity measured by the electron drift instrument ED1 onboard CLUSTER and (iv) global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.

  17. Coping profiles, perceived stress and health-related behaviors: a cluster analysis approach.

    PubMed

    Doron, Julie; Trouillet, Raphael; Maneveau, Anaïs; Ninot, Grégory; Neveu, Dorine

    2015-03-01

    Using a cluster analytic procedure, this study aimed (i) to determine whether people could be differentiated on the basis of coping profiles (or unique combinations of coping strategies); and (ii) to examine the relationships between these profiles and perceived stress and health-related behaviors. A sample of 578 French students (345 females, 233 males; M_age = 21.78, SD_age = 2.21) completed the Perceived Stress Scale-14 (Bruchon-Schweitzer, 2002), the Brief COPE (Muller and Spitz, 2003) and a series of items measuring health-related behaviors. A two-phased cluster analytic procedure (i.e. hierarchical, then non-hierarchical k-means) was employed to derive clusters of coping strategy profiles. The results yielded four distinctive coping profiles: High Copers, Adaptive Copers, Avoidant Copers and Low Copers. The results showed that clusters differed significantly in perceived stress and health-related behaviors. High Copers and Avoidant Copers displayed higher levels of perceived stress and engaged more in unhealthy behavior, compared with Adaptive Copers and Low Copers who reported lower levels of stress and engaged more in healthy behaviors. These findings suggested that individuals' relative reliance on some strategies and de-emphasis on others may be a more advantageous way of understanding the manner in which individuals cope with stress. Therefore, a cluster analysis approach may provide an advantage over more traditional statistical techniques by identifying distinct coping profiles that might best benefit from interventions. Future research should consider coping profiles to provide a deeper understanding of the relationships between coping strategies and health outcomes and to identify risk groups.
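
    The second, non-hierarchical phase of a two-phase procedure like the one described above can be sketched as a k-means refinement that starts from seed centroids. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the seeds (standing in for centroids obtained by cutting a hierarchical tree) are all hypothetical.

```python
import math


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))


def kmeans_from_seeds(points, seeds, max_iter=100):
    """Refine seed centroids with Lloyd's k-means iterations.

    `seeds` are assumed to come from a prior hierarchical step (hypothetical here).
    Returns (assignment, centroids).
    """
    centroids = [tuple(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = mean_point(members)
    return assignment, centroids


# Toy data: two well-separated groups and two rough seeds.
toy_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
toy_seeds = [(0, 0), (10, 10)]
toy_assignment, toy_centroids = kmeans_from_seeds(toy_points, toy_seeds)
print(toy_assignment, toy_centroids)
```

    Seeding k-means from a hierarchical solution is a common way to fix the number of clusters first and then let individual cases move between clusters.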

  18. Identification of responders to inhaled corticosteroids in a chronic obstructive pulmonary disease population using cluster analysis

    PubMed Central

    Hinds, David R; DiSantostefano, Rachael L; Le, Hoa V; Pascoe, Steven

    2016-01-01

    Objectives: To identify clusters of patients who may benefit from treatment with an inhaled corticosteroid (ICS)/long-acting β2 agonist (LABA) versus LABA alone, in terms of exacerbation reduction, and to validate previously identified clusters of patients with chronic obstructive pulmonary disease (COPD) (based on diuretic use and reversibility). Design: Post hoc supervised cluster analysis using a modified recursive partitioning algorithm of two 1-year randomised, controlled trials of fluticasone furoate (FF)/vilanterol (VI) versus VI alone, with the primary end points of the annual rate of moderate-to-severe exacerbations. Setting: Global. Participants: 3255 patients with COPD (intent-to-treat populations) with a history of exacerbations in the past year. Interventions: FF/VI 50/25 µg, 100/25 µg or 200/25 µg, or VI 25 µg; all one time per day. Outcome measures: Mean annual COPD exacerbation rate to identify clusters of patients who benefit from adding an ICS (FF) to VI bronchodilator therapy. Results: Three clusters were identified, including two groups that benefit from FF/VI versus VI: patients with blood eosinophils >2.4% (RR=0.68, 95% CI 0.58 to 0.79), or blood eosinophils ≤2.4% and smoking history ≤46 pack-years, experienced a reduced rate of exacerbations with FF/VI versus VI (RR=0.78, 95% CI 0.63 to 0.96), whereas those with blood eosinophils ≤2.4% and smoking history >46 pack-years were identified as non-responders (RR=1.22, 95% CI 0.94 to 1.58). Clusters of patients previously identified in the fluticasone propionate/salmeterol (SAL) versus SAL trials of similar design were not validated; all clusters of patients tended to benefit from FF/VI versus VI alone irrespective of diuretic use and reversibility. Conclusions: In patients with COPD with a history of exacerbations, those with greater blood eosinophils or a lower smoking history may benefit more from ICS/LABA versus LABA alone as measured by a reduced rate of exacerbations. In terms of

  19. AVES: A high performance computer cluster array for the INTEGRAL satellite scientific data analysis

    NASA Astrophysics Data System (ADS)

    Federici, Memmo; Martino, Bruno Luigi; Ubertini, Pietro

    2012-07-01

    In this paper we describe a new computing system array, designed, built and now used at the Space Astrophysics and Planetary Institute (IAPS) in Rome, Italy, for the INTEGRAL Space Observatory scientific data analysis. This new system has become necessary in order to reduce the processing time of the INTEGRAL data accumulated during the more than 9 years of in-orbit operation. In order to fulfill the scientific data analysis requirements with a moderately limited investment, the starting approach has been to use a 'cluster' array of commercial quad-CPU computers, featuring the extremely large scientific and calibration data archive on line.

  20. A Comprehensive Comparison of Different Clustering Methods for Reliability Analysis of Microarray Data

    PubMed Central

    Kafieh, Rahele; Mehridehnavi, Alireza

    2013-01-01

    In this study, we considered some competitive learning methods, including hard competitive learning and soft competitive learning with/without fixed network dimensionality, for reliability analysis in microarrays. In order to have a more extensive view, and keeping in mind that competitive learning methods aim at error minimization or entropy maximization (different kinds of function optimization), we decided to investigate the abilities of mixture decomposition schemes. Therefore, we assert that this study covers the algorithms based on function optimization, with particular emphasis on different competitive learning methods. The goal is to find the most powerful method according to a pre-specified criterion determined with numerical methods and matrix similarity measures. Furthermore, we should provide an indication of the intrinsic ability of the dataset to form clusters before we apply a clustering algorithm. Therefore, we proposed the Hopkins statistic as a method for finding the intrinsic ability of a dataset to be clustered. The results show the remarkable ability of the Rayleigh mixture model in comparison with other methods in the reliability analysis task. PMID:24083134
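
    As an illustration of the clustering-tendency check mentioned above, the Hopkins statistic can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it uses one common convention, H = Σu^d / (Σu^d + Σw^d), in which values near 0.5 suggest spatially random data and values near 1 suggest clustered data (other texts use the complementary convention). The function names and the synthetic data are hypothetical.

```python
import math
import random


def nearest_distance(point, data, skip_index=None):
    """Distance from `point` to its nearest neighbour in `data`.

    `skip_index` excludes one row, so a sampled data point is not matched to itself.
    """
    best = math.inf
    for i, other in enumerate(data):
        if i == skip_index:
            continue
        best = min(best, math.dist(point, other))
    return best


def hopkins_statistic(data, sample_size, rng):
    """Hopkins statistic under the convention H = sum(u^d) / (sum(u^d) + sum(w^d))."""
    dim = len(data[0])
    lows = [min(p[k] for p in data) for k in range(dim)]
    highs = [max(p[k] for p in data) for k in range(dim)]

    # w: nearest-neighbour distances for points sampled from the data itself.
    sample_idx = rng.sample(range(len(data)), sample_size)
    w = [nearest_distance(data[i], data, skip_index=i) ** dim for i in sample_idx]

    # u: nearest-neighbour distances from uniform random probes in the bounding box.
    u = []
    for _ in range(sample_size):
        probe = [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        u.append(nearest_distance(probe, data) ** dim)

    return sum(u) / (sum(u) + sum(w))


# Synthetic example: two tight blobs should give a value well above 0.5.
data_rng = random.Random(0)
blobs = [
    (data_rng.gauss(centre, 0.05), data_rng.gauss(centre, 0.05))
    for centre in (0.0, 10.0)
    for _ in range(50)
]
print(hopkins_statistic(blobs, sample_size=20, rng=random.Random(1)))
```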

  1. Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

    PubMed

    Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

    2014-11-01

    Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management.

  2. The detection of Fermi AGN above 100 GeV using clustering analysis

    NASA Astrophysics Data System (ADS)

    Armstrong, Thomas; Brown, Anthony M.; Chadwick, Paula M.; Nolan, S. J.

    2015-09-01

    The density-based clustering algorithm DBSCAN has been applied to the Fermi Large Area Telescope (LAT) data set of Eγ ≥ 100 GeV events with |b| > 10°, in order to search for new very high energy (VHE) γ-ray sources. The clustering analysis returned 49 clusters, of which 21 correspond to already known VHE-emitting active galactic nuclei (AGN) within the TeVCat catalogue and a further 11 were found to be significant in a full Fermi analysis. Of these, two are previously detected Fermi VHE AGN, and nine represent new VHE sources consisting of six BL Lac objects, one blazar of unknown type and two unassociated sources. Comparing these, along with the VHE AGN RBS 0679 and RBS 0970 previously detected with Fermi-LAT, to the current populations of AGN detected with ground-based instruments and Fermi suggests that the VHE-emitting AGN discovered in this study are very similar to the TeVCat AGN and therefore further observations with ground-based imaging atmospheric Cherenkov telescopes are recommended.
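
    The density-based idea behind DBSCAN can be sketched in a few lines. This is a minimal illustration only: it uses planar Euclidean distance on toy coordinates, whereas an analysis of sky positions would need an angular separation, and the `eps` and `min_pts` values here are arbitrary assumptions rather than the values used in the study.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, index, eps):
    """Indices of all points within `eps` of points[index] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point
    return labels


# Toy event positions: two compact groups and one isolated event.
events = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
print(dbscan(events, eps=0.2, min_pts=3))
```

    Unlike k-means, this approach does not need the number of clusters in advance and leaves isolated events unassigned, which suits a search for source candidates against a sparse background.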

  3. The Chloroplast Genome of Euglena mutabilis-Cluster Arrangement, Intron Analysis, and Intrageneric Trends.

    PubMed

    Dabbagh, Nadja; Preisfeld, Angelika

    2017-01-01

    A comparative analysis of the chloroplast genome of Euglena mutabilis underlined a high diversity in the evolution of plastids in euglenids. Gene clusters in more derived Euglenales increased in complexity with only a few, but remarkable changes in the genus Euglena. Euglena mutabilis differed from other Euglena species in a mirror-inverted arrangement of 12 of 15 identified clusters, making it very likely that the emergence at the base of the genus Euglena, which has been considered a long branch artifact, is truly a probable position. This was corroborated by many similarities in gene arrangement and orientation with Strombomonas and Monomorphina, rendering the genome organization of E. mutabilis in certain clusters as a plesiomorphic feature. By RNA analysis, exact exon-intron boundaries and the type of the 77 introns identified were mostly determined unambiguously. A detailed intron study of psbC pointed at two important issues: First, the number of introns varied even between species, and no trend from few to many introns could be observed. Second, mat1 was localized in Eutreptiales exclusively in intron 1, and mat2 was not identified. With the emergence of Euglenaceae in most species, a new intron containing mat2 inserted in front of the previous intron 1 and thereby became intron 2 with mat1.

  4. Kink-mode Waves and Bifurcated Current Sheets: CLUSTER Observations and Analysis Techniques

    NASA Astrophysics Data System (ADS)

    Cully, C.; Donovan, E.; Buchert, S.; Lucek, E.

    2003-12-01

    Although the magnetic configuration of the tail current sheet in the moments before reconnection is of considerable interest, many fundamental observational questions remain. What does the large-scale structure typically look like? How thick is the sheet? Is it bifurcated? What bulk wavemodes are active, and at what amplitude? Cluster observations, when combined with multipoint analysis techniques, offer the opportunity to observationally resolve some of these questions. We present an analysis technique that we use to first solve for the local normal vector to the current sheet at each data point, and then to identify the presence and wavemode of large-scale bulk wave modes (e.g. kink modes). We then take this motion into account when reconstructing the large-scale structure of the sheet from the measurements. We apply these techniques to Cluster observations of the tail current sheet before a substorm on the 11th of October, 2001. At the Cluster location 19 Re downtail, we find large-amplitude kink-mode waves that are propagating duskward in the minutes before reconnection onset.

  5. Supercomputer and cluster performance modeling and analysis efforts: 2004-2006.

    SciTech Connect

    Sturtevant, Judith E.; Ganti, Anand; Meyer, Harold Edward; Stevenson, Joel O.; Benner, Robert E., Jr.; Goudy, Susan Phelps; Doerfler, Douglas W.; Domino, Stefan Paul; Taylor, Mark A.; Malins, Robert Joseph; Scott, Ryan T.; Barnette, Daniel Wayne; Rajan, Mahesh; Ang, James Alfred; Black, Amalia Rebecca; Laub, Thomas William; Vaughan, Courtenay Thomas; Franke, Brian Claude

    2007-02-01

    This report describes efforts by the Performance Modeling and Analysis Team to investigate performance characteristics of Sandia's engineering and scientific applications on the ASC capability and advanced architecture supercomputers, and Sandia's capacity Linux clusters. Efforts to model various aspects of these computers are also discussed. The goals of these efforts are to quantify and compare Sandia's supercomputer and cluster performance characteristics; to reveal strengths and weaknesses in such systems; and to predict performance characteristics of, and provide guidelines for, future acquisitions and follow-on systems. Described herein are the results obtained from running benchmarks and applications to extract performance characteristics and comparisons, as well as modeling efforts, obtained during the time period 2004-2006. The format of the report, with hypertext links to numerous additional documents, purposefully minimizes the document size needed to disseminate the extensive results from our research.

  6. Hierarchical cluster analysis in clinical research with heterogeneous study population: highlighting its visualization with R

    PubMed Central

    Murtagh, Fionn; Van Poucke, Sven; Lin, Su; Lan, Peng

    2017-01-01

    Big data clinical research typically involves thousands of patients and there are numerous variables available. Conventionally, these variables can be handled by multivariable regression modeling. In this article, the hierarchical cluster analysis (HCA) is introduced. This method is used to explore similarity between observations and/or clusters. The result can be visualized using heat maps and dendrograms. Sometimes, it would be interesting to add scatter plot and smooth lines into the panels of the heat map. The inherent R heatmap package does not provide this function. A series of scatter plots can be created using lattice package, and then background color of each panel is mapped to the regression coefficient by using custom-made panel functions. This is the unique feature of the lattice package. Dendrograms and color keys can be added as the legend elements of the lattice system. The latticeExtra package provides some useful functions for the work. PMID:28275620
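
    The merging process that a dendrogram visualizes can be sketched as below. The article itself works in R; this Python sketch is only an illustration of agglomerative clustering. Average linkage is an assumption (the abstract does not name a linkage criterion), and the function names and toy observations are hypothetical.

```python
import math


def linkage_distance(cluster_a, cluster_b, points):
    """Average linkage: mean pairwise distance between members of two clusters."""
    total = sum(math.dist(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.

    Returns (clusters, merges); each merge is (members_a, members_b, height),
    which is the information a dendrogram draws.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters], merges


# Toy observations: two close pairs and one outlying observation.
observations = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
final_clusters, merge_history = agglomerate(observations, n_clusters=3)
print(final_clusters)
for left, right, height in merge_history:
    print(left, right, height)
```

    This brute-force search recomputes all pairwise cluster distances at every step, so it is only suitable for small examples; library implementations use more efficient update rules.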

  7. A cluster analysis of the circumstances of death in suicides in Hong Kong.

    PubMed

    Chen, Eric Y H; Chan, Wincy S C; Chan, Sandra S M; Liu, Ka Y; Chan, Cecilia L W; Wong, Paul W C; Law, Y W; Yip, Paul S F

    2007-10-01

    Classification of suicides is essential for clinicians to better identify self-harm patients with future suicidal risks. This study examined potential subtypes of suicide in a psychological autopsy sample (N = 148) in Hong Kong. Hierarchical cluster analysis extracted two subgroups of subjects in terms of expressed deliberation assessed by the Beck Suicide Intent Scale (SIS). The first group was associated with charcoal burning suicide, no psychiatric illness, indebtedness, better problem-solving ability, chronic stress, and higher overall SIS scores. The second group was associated with jumping from a height, psychotic disorders, psychiatric treatment, acute stress, and lower overall SIS score. The existence of a substantial cluster of subjects with lower expressed intent and preparation has important implications for the performance of the SIS as a predictive tool. Suicide prevention strategy may have to target potential subgroups with specific approaches.

  8. A weak lensing analysis of the PLCK G100.2-30.4 cluster

    NASA Astrophysics Data System (ADS)

    Radovich, M.; Formicola, I.; Meneghetti, M.; Bartalucci, I.; Bourdin, H.; Mazzotta, P.; Moscardini, L.; Ettori, S.; Arnaud, M.; Pratt, G. W.; Aghanim, N.; Dahle, H.; Douspis, M.; Pointecouteau, E.; Grado, A.

    2015-07-01

    We present a mass estimate of the Planck-discovered cluster PLCK G100.2-30.4, derived from a weak lensing analysis of deep Subaru griz images. We perform a careful selection of the background galaxies using the multi-band imaging data, and undertake the weak lensing analysis on the deep (1 h) r-band image. The shape measurement is based on the Kaiser-Squires-Broadhurst algorithm; we adopt the PSFex software to model the point spread function (PSF) across the field and correct for this in the shape measurement. The weak lensing analysis is validated through extensive image simulations. We compare the resulting weak lensing mass profile and total mass estimate to those obtained from our re-analysis of XMM-Newton observations, derived under the hypothesis of hydrostatic equilibrium. The total integrated mass profiles agree remarkably well, within 1σ across their common radial range. A mass $M_{500} \sim 7 \times 10^{14}\,M_\odot$ is derived for the cluster from our weak lensing analysis. Comparing this value to that obtained from our reanalysis of XMM-Newton data, we obtain a bias factor of (1-b) = 0.8 ± 0.1. This is compatible within 1σ with the value of (1-b) obtained in Planck 2015 from the calibration of the bias factor using newly available weak lensing reconstructed masses. Based on data collected at Subaru Telescope (University of Tokyo).

  9. The human RHOX gene cluster: target genes and functional analysis of gene variants in infertile men.

    PubMed

    Borgmann, Jennifer; Tüttelmann, Frank; Dworniczak, Bernd; Röpke, Albrecht; Song, Hye-Won; Kliesch, Sabine; Wilkinson, Miles F; Laurentino, Sandra; Gromoll, Jörg

    2016-09-15

    The X-linked reproductive homeobox (RHOX) gene cluster encodes transcription factors preferentially expressed in reproductive tissues. This gene cluster has important roles in male fertility based on phenotypic defects of Rhox-mutant mice and the finding that aberrant RHOX promoter methylation is strongly associated with abnormal human sperm parameters. However, little is known about the molecular mechanism of RHOX function in humans. Using gene expression profiling, we identified genes regulated by members of the human RHOX gene cluster. Some genes were uniquely regulated by RHOXF1 or RHOXF2/2B, while others were regulated by both of these transcription factors. Several of these regulated genes encode proteins involved in processes relevant to spermatogenesis; e.g. stress protection and cell survival. One of the target genes of RHOXF2/2B is RHOXF1, suggesting cross-regulation to enhance transcriptional responses. The potential role of RHOX in human infertility was addressed by sequencing all RHOX exons in a group of 250 patients with severe oligozoospermia. This revealed two mutations in RHOXF1 (c.515G > A and c.522C > T) and four in RHOXF2/2B (-73C > G, c.202G > A, c.411C > T and c.679G > A), of which only one (c.202G > A) was found in a control group of men with normal sperm concentration. Functional analysis demonstrated that c.202G > A and c.679G > A significantly impaired the ability of RHOXF2/2B to regulate downstream genes. Molecular modelling suggested that these mutations alter RHOXF2/F2B protein conformation. By combining clinical data with in vitro functional analysis, we demonstrate how the X-linked RHOX gene cluster may function in normal human spermatogenesis and we provide evidence that it is impaired in human male fertility.

  10. Automated regional registration and characterization of corresponding microcalcification clusters on temporal pairs of mammograms for interval change analysis

    SciTech Connect

    Filev, Peter; Hadjiiski, Lubomir; Chan, Heang-Ping; Sahiner, Berkman; Ge Jun; Helvie, Mark A.; Roubidoux, Marilyn; Zhou Chuan

    2008-12-15

    A computerized regional registration and characterization system for analysis of microcalcification clusters on serial mammograms is being developed in our laboratory. The system consists of two stages. In the first stage, based on the location of a detected cluster on the current mammogram, a regional registration procedure identifies the local area on the prior that may contain the corresponding cluster. A search program is used to detect cluster candidates within the local area. The detected cluster on the current image is then paired with the cluster candidates on the prior image to form true (TP-TP) or false (TP-FP) pairs. Automatically extracted features were used in a newly designed correspondence classifier to reduce the number of false pairs. In the second stage, a temporal classifier, based on both current and prior information, is used if a cluster has been detected on the prior image, and a current classifier, based on current information alone, is used if no prior cluster has been detected. The data set used in this study consisted of 261 serial pairs containing biopsy-proven calcification clusters. An MQSA radiologist identified the corresponding clusters on the mammograms. On the priors, the radiologist rated the subtlety of 30 clusters (out of the 261 clusters) as 9 or 10 on a scale of 1 (very obvious) to 10 (very subtle). Leave-one-case-out resampling was used for feature selection and classification in both the correspondence and malignant/benign classification schemes. The search program detected 91.2% (238/261) of the clusters on the priors with an average of 0.42 FPs/image. The correspondence classifier identified 86.6% (226/261) of the TP-TP pairs with 20 false matches (0.08 FPs/image) relative to the entire set of 261 image pairs. In the malignant/benign classification stage the temporal classifier achieved a test $A_z$ of 0.81 for the 246 pairs which contained a detection on the prior. In addition, a classifier was designed by using the

  11. Cluster analysis for percolation on a two-dimensional fully frustrated system

    NASA Astrophysics Data System (ADS)

    Franzese, Giancarlo

    1996-12-01

    The percolation of Kandel, Ben-Av and Domany clusters for a two-dimensional fully frustrated Ising model is extensively studied through numerical simulations. Critical exponents, cluster distribution and fractal dimension of a percolating cluster are given.

  12. Emergy-based comparative analysis on industrial clusters: economic and technological development zone of Shenyang area, China.

    PubMed

    Liu, Zhe; Geng, Yong; Zhang, Pan; Dong, Huijuan; Liu, Zuoxi

    2014-09-01

    In China, local governments of many areas prefer to give priority to the development of heavy industrial clusters in pursuit of high gross domestic product (GDP) growth as a political achievement, which usually results in higher costs from ecological degradation and environmental pollution. Therefore, effective methods and a reasonable evaluation system are urgently needed to evaluate the overall efficiency of industrial clusters. The emergy method links economic and ecological systems together and can evaluate the contribution of ecological products and services as well as the load placed on environmental systems. This method has been successfully applied in many case studies of ecosystems but seldom in industrial clusters. This study applied the methodology of emergy analysis to assess the efficiency of industrial clusters through a series of emergy-based indices as well as the proposed indicators. A case study of Shenyang Economic Technological Development Area (SETDA) was investigated to show the emergy method's practical potential to evaluate industrial clusters to inform environmental policy making. The results of our study showed that the industrial cluster of electric equipment and electronic manufacturing produced the most economic value and had the highest efficiency of energy utilization among the four industrial clusters. However, the sustainability index of the industrial cluster of food and beverage processing was better than the other industrial clusters.

  13. [Application of HATR-FTIR spectroscopy combined with cluster analysis to the identification of Cuscuta chinensis Lam. and its unofficial varieties].

    PubMed

    Hong, Qing-hong; Cheng, Ze-feng; Li, Qun-li

    2008-08-01

    Horizontal attenuated total reflectance Fourier transform infrared spectroscopy was used to identify Cuscuta chinensis Lam. samples directly, and their chemical differences were compared. Using FTIR spectroscopy with cluster analysis, the kinship between the official and unofficial varieties of Cuscuta chinensis Lam. was also studied. As shown by the results of the cluster analysis, the four samples were separated into three groups. The proposed method can be effectively applied to analyse the quality of Cuscuta chinensis Lam.

  14. Formation of Parametric Images in Positron Emission Tomography Using a Clustering-Based Kinetic Analysis With Statistical Clustering

    DTIC Science & Technology

    2007-11-02

    constant describing the conversion rate from FDG to FDG-6-PO4. In the clustering method, K1 is considered to be a scaling factor, whereas k2 and k3... five minutes for clustering, and one minute for estimation.

  15. The use of the wavelet cluster analysis for asteroid family determination

    NASA Technical Reports Server (NTRS)

    Benjoya, Phillippe; Slezak, E.; Froeschle, Claude

    1992-01-01

    Asteroid family determination has long depended on the analysis method used. A new cluster analysis based on the wavelet transform has allowed an automatic definition of families with a degree of significance versus randomness. This method is in fact rather general and can be applied to any kind of structural analysis. Here we concentrate on the main features of the method. The analysis has been performed on the set of 4100 asteroid proper elements computed by Milani and Knezevic (see Milani and Knezevic 1990). Twenty-one families have been found, and the influence of the chosen metric has been tested. The results have been compared with those of Zappala et al. (see Zappala et al 1990), obtained with a completely different method applied to the same set of data. For the first time, a good overlap has been found between the results of the two methods, not only for the big well-known families but also for the smallest ones.

  16. The effect of close relatives on unsupervised Bayesian clustering algorithms in population genetic structure analysis.

    PubMed

    Rodríguez-Ramilo, Silvia T; Wang, Jinliang

    2012-09-01

    The inference of population genetic structures is essential in many research areas in population genetics, conservation biology and evolutionary biology. Recently, unsupervised Bayesian clustering algorithms have been developed to detect a hidden population structure from genotypic data, assuming among others that individuals taken from the population are unrelated. Under this assumption, markers in a sample taken from a subpopulation can be considered to be in Hardy-Weinberg and linkage equilibrium. However, close relatives might be sampled from the same subpopulation, and consequently, might cause Hardy-Weinberg and linkage disequilibrium and thus bias a population genetic structure analysis. In this study, we used simulated and real data to investigate the impact of close relatives in a sample on Bayesian population structure analysis. We also showed that, when close relatives were identified by a pedigree reconstruction approach and removed, the accuracy of a population genetic structure analysis can be greatly improved. The results indicate that unsupervised Bayesian clustering algorithms cannot be used blindly to detect genetic structure in a sample with closely related individuals. Rather, when closely related individuals are suspected to be frequent in a sample, these individuals should be first identified and removed before conducting a population structure analysis.

  17. Understanding coastal change using shoreline trend analysis supported by cluster-based segmentation

    NASA Astrophysics Data System (ADS)

    Burningham, Helene; French, Jon

    2017-04-01

    Shoreline change analysis is a well defined and widely adopted approach for the examination of trends in coastal position over different timescales. Conventional shoreline change metrics are best suited to resolving progressive quasi-linear trends. However, coastal change is often highly non-linear and may exhibit complex behaviour including trend-reversals. This paper advocates a secondary level of investigation based on a cluster analysis to resolve a more complete range of coastal behaviours. Cluster-based segmentation of shoreline behaviour is demonstrated with reference to a regional-scale case study of the Suffolk coast, eastern UK. An exceptionally comprehensive suite of shoreline datasets covering the period 1881 to 2015 is used to examine both centennial- and intra-decadal scale change in shoreline position. Analysis of shoreline position changes at a 100 m alongshore interval along 74 km of coastline reveals a number of distinct behaviours. The suite of behaviours varies with the timescale of analysis. There is little evidence of regionally coherent shoreline change. Rather, the analyses reveal a complex interaction between met-ocean forcing, inherited geological and geomorphological controls, and evolving anthropogenic intervention that drives changing foci of erosion and deposition.

  18. Novel classification based on immunohistochemistry combined with hierarchical clustering analysis in non-functioning neuroendocrine tumor patients.

    PubMed

    Iida, Shinya; Miki, Yasuhiro; Ono, Katsuhiko; Akahira, Jun-ichi; Suzuki, Takashi; Ishida, Kazuyuki; Watanabe, Mika; Sasano, Hironobu

    2010-10-01

    Somatostatin analogues ameliorated many symptoms caused by neuroendocrine tumors (NET), but their antitumor activities are limited, especially in non-functioning cases. An overactivation of signaling pathways under receptor tyrosine-kinase (RTK) has been recently demonstrated in some NET patients, but its details have remained largely unknown. Therefore, in this study, we immunolocalized therapeutic factors and evaluated the data to study the clinical significance of the molecules in non-functioning Japanese gastrointestinal NET. Fifty-two NET cases were available for examination in this study and expression of somatostatin receptor (sstr) 1, 2A, 2B, 3 and 5, activated form of mammalian target of rapamycin (mTOR), eukaryotic initiation factor 4E-binding protein 1 (4EBP1), ribosomal protein s6 (S6), extracellular signal-regulated kinase (ERK) and insulin-like growth factor 1 receptor (IGF-1R) was evaluated using immunohistochemistry. We then studied the correlation among the immunohistochemical results of the individual cases using hierarchical clustering analysis. Results of clustering analysis demonstrated that NET cases were basically classified into Cluster I and II. Cluster I was associated with higher expression of sstr1, 2B and 3 and Cluster II was characterized by an activation of the PI3K/Akt pathway and IGF-1R and higher proliferative status. Cluster II was further classified into Cluster IIa and IIb. Cluster IIa was associated with higher expression of sstr1 and 5 and higher proliferative status and Cluster IIb was characterized by ERK activation. Hierarchical clustering analysis of immunoreactivity of the therapeutic factors can classify NET cases into three distinctive groups and the medical treatment may be determined according to this novel classification method for non-functioning NET patients.

  19. JOINT ANALYSIS OF CLUSTER OBSERVATIONS. II. CHANDRA/XMM-NEWTON X-RAY AND WEAK LENSING SCALING RELATIONS FOR A SAMPLE OF 50 RICH CLUSTERS OF GALAXIES

    SciTech Connect

    Mahdavi, Andisheh; Hoekstra, Henk; Babul, Arif; Bildfell, Chris; Jeltema, Tesla; Henry, J. Patrick

    2013-04-20

    We present a study of multiwavelength X-ray and weak lensing scaling relations for a sample of 50 clusters of galaxies. Our analysis combines Chandra and XMM-Newton data using an energy-dependent cross-calibration. After considering a number of scaling relations, we find that gas mass is the most robust estimator of weak lensing mass, yielding 15% ± 6% intrinsic scatter at $r_{500}^{\mathrm{WL}}$ (the pseudo-pressure $Y_X$ yields a consistent scatter of 22% ± 5%). The scatter does not change when measured within a fixed physical radius of 1 Mpc. Clusters with small brightest cluster galaxy (BCG) to X-ray peak offsets constitute a very regular population whose members have the same gas mass fractions and whose even smaller (<10%) deviations from regularity can be ascribed to line of sight geometrical effects alone. Cool-core clusters, while a somewhat different population, also show the same (<10%) scatter in the gas mass-lensing mass relation. There is a good correlation and a hint of bimodality in the plane defined by BCG offset and central entropy (or central cooling time). The pseudo-pressure $Y_X$ does not discriminate between the more relaxed and less relaxed populations, making it perhaps the more even-handed mass proxy for surveys. Overall, hydrostatic masses underestimate weak lensing masses by 10% on the average at $r_{500}^{\mathrm{WL}}$; but cool-core clusters are consistent with no bias, while non-cool-core clusters have a large and constant 15%-20% bias between $r_{2500}^{\mathrm{WL}}$ and $r_{500}^{\mathrm{WL}}$, in agreement with N-body simulations incorporating unthermalized gas. For non-cool-core clusters, the bias correlates well with BCG ellipticity. We also examine centroid shift variance and power ratios to quantify substructure; these quantities do not correlate with residuals in the scaling relations. Individual clusters have for the most part forgotten the source of their departures from self-similarity.

  20. Frequency Clustering Analysis for Resting State Functional Magnetic Resonance Imaging Based on Hilbert-Huang Transform

    PubMed Central

    Wu, Xia; Wu, Tong; Liu, Chenghua; Wen, Xiaotong; Yao, Li

    2017-01-01

    Objective: Exploring resting-state functional networks using functional magnetic resonance imaging (fMRI) is a hot topic in the field of brain functions. Previous studies suggested that the frequency dependence between blood oxygen level dependent (BOLD) signals may convey meaningful information regarding interactions between brain regions. Methods: In this article, we introduced a novel frequency clustering analysis method based on Hilbert-Huang Transform (HHT) and a label-replacement procedure. First, the time series from multiple predefined regions of interest (ROIs) were extracted. Second, each time series was decomposed into several intrinsic mode functions (IMFs) by using HHT. Third, the improved k-means clustering method using a label-replacement method was applied to the data of each subject to classify the ROIs into different classes. Results: Two independent resting-state fMRI datasets of healthy subjects were analyzed to test the efficacy of the method. The results show almost identical clusters when applied to different runs of a dataset or to different datasets, indicating a stable performance of our framework. Conclusions and Significance: Our framework provided a novel measure for functional segregation of the brain according to time-frequency characteristics of resting state BOLD activities. PMID:28261074

  1. Quasichemical analysis of the cluster-pair approximation for the thermodynamics of proton hydration

    SciTech Connect

    Pollard, Travis; Beck, Thomas L.

    2014-06-14

    A theoretical analysis of the cluster-pair approximation (CPA) is presented based on the quasichemical theory of solutions. The sought single-ion hydration free energy of the proton includes an interfacial potential contribution by definition. It is shown, however, that the CPA involves an extra-thermodynamic assumption that does not guarantee uniform convergence to a bulk free energy value with increasing cluster size. A numerical test of the CPA is performed using the classical polarizable AMOEBA force field and supporting quantum chemical calculations. The enthalpy and free energy differences are computed for the kosmotropic Na+/F− ion pair in water clusters of size n = 5, 25, 105. Additional calculations are performed for the chaotropic Rb+/I− ion pair. A small shift in the proton hydration free energy and a larger shift in the hydration enthalpy, relative to the CPA values, are predicted based on the n = 105 simulations. The shifts arise from a combination of sequential hydration and interfacial potential effects. The AMOEBA and quantum chemical results suggest an electrochemical surface potential of water in the range −0.4 to −0.5 V. The physical content of single-ion free energies and implications for ion-water force field development are also discussed.

  2. The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience.

    PubMed

    Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R; Bock, Davi D; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R Clay; Smith, Stephen J; Szalay, Alexander S; Vogelstein, Joshua T; Vogelstein, R Jacob

    2013-01-01

    We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
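
    The abstract above says that data are distributed to cluster nodes by partitioning a spatial index but does not say which index. As a minimal sketch, assuming a Morton (Z-order) key over integer cuboid coordinates and an equal split of the key range across nodes, the idea could look like this. The function names, the 10-bit coordinate width, and the equal-slice policy are all illustrative assumptions, not details taken from the record.

```python
def interleave_bits(x: int, y: int, z: int, bits: int = 10) -> int:
    """Return the Morton (Z-order) key of integer cuboid coordinates.

    Bit i of x, y, z lands at positions 3*i, 3*i + 1, 3*i + 2 of the key,
    so cuboids that are close in space tend to have close keys.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def node_for_cuboid(x: int, y: int, z: int, num_nodes: int, bits: int = 10) -> int:
    """Map a cuboid to a node by cutting the key range into contiguous slices.

    Hypothetical policy: every node owns an equal, contiguous range of keys.
    """
    total_keys = 1 << (3 * bits)
    return interleave_bits(x, y, z, bits) * num_nodes // total_keys
```

    Contiguous key ranges keep spatially adjacent cuboids on the same node most of the time, which is one plausible reason to partition on a space-filling-curve index rather than on a hash.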

  3. Clustering drug-drug interaction networks with energy model layouts: community analysis and drug repurposing

    PubMed Central

    Udrescu, Lucreţia; Sbârcea, Laura; Topîrceanu, Alexandru; Iovanovici, Alexandru; Kurunczi, Ludovic; Bogdan, Paul; Udrescu, Mihai

    2016-01-01

    Analyzing drug-drug interactions may unravel previously unknown drug action patterns, leading to the development of new drug discovery tools. We present a new approach to analyzing drug-drug interaction networks, based on clustering and topological community detection techniques that are specific to complex network science. Our methodology uncovers functional drug categories along with the intricate relationships between them. Using modularity-based and energy-model layout community detection algorithms, we link the network clusters to 9 relevant pharmacological properties. Out of the 1141 drugs from the DrugBank 4.1 database, our extensive literature survey and cross-checking with other databases such as Drugs.com, RxList, and DrugBank 4.3 confirm the predicted properties for 85% of the drugs. As such, we argue that network analysis offers a high-level grasp on a wide area of pharmacological aspects, indicating possible unaccounted interactions and missing pharmacological properties that can lead to drug repositioning for the 15% drugs which seem to be inconsistent with the predicted property. Also, by using network centralities, we can rank drugs according to their interaction potential for both simple and complex multi-pathology therapies. Moreover, our clustering approach can be extended for applications such as analyzing drug-target interactions or phenotyping patients in personalized medicine applications. PMID:27599720

  4. Selecting background galaxies in weak-lensing analysis of galaxy clusters

    NASA Astrophysics Data System (ADS)

    Formicola, I.; Radovich, M.; Meneghetti, M.; Mazzotta, P.; Grado, A.; Giocoli, C.

    2016-05-01

    In this paper, we present a new method to select the faint, background galaxies used to derive the mass of galaxy clusters by weak lensing. The method is based on the simultaneous analysis of the shear signal, that should be consistent with zero for the foreground, unlensed galaxies, and of the colours of the galaxies: photometric data from the COSMic evOlution Survey are used to train the colour selection. In order to validate this methodology, we test it against a set of state-of-the-art image simulations of mock galaxy clusters in different redshift [0.23-0.45] and mass [0.5-1.55 × 10^15 M⊙] ranges, mimicking medium-deep multicolour imaging observations [e.g. Subaru, Large Binocular Telescope]. The performance of our method in terms of contamination by unlensed sources is comparable to a selection based on photometric redshifts, which however requires a good spectral coverage and is thus much more observationally demanding. The application of our method to simulations gives an average ratio between estimated and true masses of ~0.98 ± 0.09. As a further test, we finally apply our method to real data, and compare our results with other weak-lensing mass estimates in the literature: for this purpose, we choose the cluster Abell 2219 (z = 0.228), for which multiband (BVRi) data are publicly available.

  5. Cluster Analysis of Atmospheric Dynamics and Pollution Transport in a Coastal Area

    NASA Astrophysics Data System (ADS)

    Sokolov, Anton; Dmitriev, Egor; Maksimovich, Elena; Delbarre, Hervé; Augustin, Patrick; Gengembre, Cyril; Fourmentin, Marc; Locoge, Nadine

    2016-11-01

    Summertime atmospheric dynamics in the coastal zone of the industrialized Dunkerque agglomeration in northern France was characterized by a cluster analysis of back trajectories in the context of pollution transport. The MESO-NH atmospheric model was used to simulate the local dynamics at multiple scales with horizontal resolution down to 500 m, and for the online calculation of the Lagrangian backward trajectories with 30-min temporal resolution. Airmass transport was performed along six principal pathways obtained by the weighted k-means clustering technique. Four of these centroids corresponded to a range of wind speeds over the English Channel: two for wind directions from the north-east and two from the south-west. Another pathway corresponded to a south-westerly continental transport. The backward trajectories of the largest and most dispersed sixth cluster contained low wind speeds, including sea-breeze circulations. Based on analyses of meteorological data and pollution measurements, the principal atmospheric pathways were related to local air-contamination events. Continuous air quality and meteorological data were collected during the Benzene-Toluene-Ethylbenzene-Xylene 2006 campaign. The sites of the pollution measurements served as the endpoints for the backward trajectories. Pollutant transport pathways corresponding to the highest air contamination were defined.
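
    The record above names weighted k-means as the technique used to obtain the six pathways but gives no details of the weighting. As a rough sketch, assuming each trajectory has already been flattened into a numeric feature vector and carries a non-negative weight, a weighted variant of Lloyd's algorithm could be written as follows. The function name, the caller-supplied initial centroids, and the Euclidean distance are assumptions made for illustration only.

```python
import math


def weighted_kmeans(points, weights, centroids, iterations=50):
    """Lloyd iterations in which each centroid is the weighted mean of its members."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as a weighted mean.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [(p, w) for p, w, lab in zip(points, weights, labels) if lab == k]
            total = sum(w for _, w in members)
            if total == 0:
                new_centroids.append(old)  # an empty cluster keeps its centroid
                continue
            new_centroids.append(
                tuple(sum(w * p[d] for p, w in members) / total for d in range(len(old)))
            )
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

    With points (0, 0), (2, 0), (10, 0), (12, 0), weights 1, 3, 1, 1, and starting centroids (0, 0) and (10, 0), the heavier second point pulls the first centroid to x = 1.5 instead of the unweighted mean x = 1.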

  6. The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

    PubMed Central

    Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R.; Bock, Davi D.; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C.; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R. Clay; Smith, Stephen J.; Szalay, Alexander S.; Vogelstein, Joshua T.; Vogelstein, R. Jacob

    2013-01-01

    We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization. PMID:24401992

  7. Quasichemical analysis of the cluster-pair approximation for the thermodynamics of proton hydration

    NASA Astrophysics Data System (ADS)

    Pollard, Travis; Beck, Thomas L.

    2014-06-01

    A theoretical analysis of the cluster-pair approximation (CPA) is presented based on the quasichemical theory of solutions. The sought single-ion hydration free energy of the proton includes an interfacial potential contribution by definition. It is shown, however, that the CPA involves an extra-thermodynamic assumption that does not guarantee uniform convergence to a bulk free energy value with increasing cluster size. A numerical test of the CPA is performed using the classical polarizable AMOEBA force field and supporting quantum chemical calculations. The enthalpy and free energy differences are computed for the kosmotropic Na+/F- ion pair in water clusters of size n = 5, 25, 105. Additional calculations are performed for the chaotropic Rb+/I- ion pair. A small shift in the proton hydration free energy and a larger shift in the hydration enthalpy, relative to the CPA values, are predicted based on the n = 105 simulations. The shifts arise from a combination of sequential hydration and interfacial potential effects. The AMOEBA and quantum chemical results suggest an electrochemical surface potential of water in the range -0.4 to -0.5 V. The physical content of single-ion free energies and implications for ion-water force field development are also discussed.

  8. Genomic structural analysis of porcine fatty acid desaturase cluster on chromosome 2.

    PubMed

    Taniguchi, Masaaki; Arakawa, Aisaku; Motoyama, Michiyo; Nakajima, Ikuyo; Nii, Masahiro; Mikawa, Satoshi

    2015-04-01

    Fatty acid composition is an economically important trait in meat-producing livestock. To gain insight into the molecular genetics of fatty acid desaturase (FADS) genes in pigs, we investigated the genomic structure of the porcine FADS gene family on chromosome 2. We also examined the tissue distribution of FADS gene expression. The genomic structure of FADS family in mammals consists of three isoforms FADS1, FADS2 and FADS3. However, porcine FADS cluster in the latest pig genome assembly (Sscrofa 10.2) containing some gaps is distinct from that in other mammals. We therefore sought to determine the genomic structure, including the FADS cluster in a 200-kbp range by sequencing gap regions. The structure we obtained was similar to that in other mammals. We then investigated the porcine FADS1 transcription start site and identified a novel isoform named FADS1b. Phylogenetic analysis revealed that the three members of the FADS cluster were orthologous among mammals, whereas the various FADS1 isoforms identified in pigs, mice and cattle might be attributable to species-specific transcriptional regulation with alternative promoters. Porcine FADS1b and FADS3 isoforms were predominantly expressed in the inner layer of the subcutaneous adipose tissue. Additional analyses will reveal the effects of these functionally unknown isoforms on fatty acid composition in pig fat tissues.

  9. The Initial-Final Mass Relation: Analysis of White Dwarfs in the M7 Open Cluster

    NASA Astrophysics Data System (ADS)

    Cummings, Jeff D.; Kalirai, Jason S.; Geisler, Douglas; Tremblay, Pier-Emmanuel; Mauro, Francesco; Deliyannis, Constantine P.

    2017-01-01

    The initial-final mass relation (IFMR) is a direct comparison of the mass with which a star forms on the main sequence to its final mass as a white dwarf. This provides critical information for our understanding of stellar evolution and mass loss, and how these are dependent on initial mass. Our group has done detailed analysis of the known white dwarfs in star clusters to improve the semi-empirical IFMR, but limited data (most importantly at the highest masses) leave remaining uncertainties. Our new wide-field photometric and spectroscopic observations of the young and nearby M7 open cluster have discovered and confirmed five new white dwarfs consistent with single-star membership. Four are intermediate-mass white dwarfs (0.65 to 0.85 Msun) and the fifth is a white dwarf estimated to be at 1.25 Msun and with an estimated initial mass of 6.75 Msun. Higher signal-to-noise follow-up spectra are required, but these and similar observations of other young and nearby clusters will begin to characterize the poorly explored ultra-high-mass end of the IFMR.

  10. MC2: Multiwavelength and Dynamical Analysis of the Merging Galaxy Cluster ZwCl 0008.8+5215: An Older and Less Massive Bullet Cluster

    NASA Astrophysics Data System (ADS)

    Golovich, Nathan; van Weeren, Reinout J.; Dawson, William A.; Jee, M. James; Wittman, David

    2017-04-01

    We present and analyze a rich data set including Subaru/SuprimeCam, HST/Advanced Camera for Surveys and Wide Field Camera 3, Keck/DEIMOS, Chandra/ACIS-I, and JVLA/C and D array for the merging cluster of galaxies ZwCl 0008.8+5215. With a joint Subaru+HST weak gravitational lensing analysis, we identify two dominant subclusters and estimate the masses to be $M_{200} = 5.7^{+2.8}_{-1.8} \times 10^{14}\,M_\odot$ and $1.2^{+1.4}_{-0.6} \times 10^{14}\,M_\odot$. We estimate the projected separation between the two subclusters to be $924^{+243}_{-206}$ kpc. We perform a clustering analysis of spectroscopically confirmed cluster member galaxies and estimate the line-of-sight velocity difference between the two subclusters to be $92 \pm 164$ km s$^{-1}$. We further motivate, discuss, and analyze the merger scenario through an analysis of the 42 ks of Chandra/ACIS-I and JVLA/C and D array polarization data. The X-ray surface brightness profile reveals a merging gas-core reminiscent of the Bullet Cluster. The global X-ray luminosity in the 0.5–7.0 keV band is $(1.7 \pm 0.1) \times 10^{44}$ erg s$^{-1}$ and the global X-ray temperature is 4.90 ± 0.13 keV. The radio relics are polarized up to 40%, and along with the masses, velocities, and positions of the two subclusters, we input these quantities into a Monte Carlo dynamical analysis and estimate the merger velocity at pericenter to be $1800^{+400}_{-300}$ km s$^{-1}$. This is a lower-mass version of the Bullet Cluster and therefore may prove useful in testing alternative models of dark matter (DM). We do not find significant offsets between DM and galaxies, but the uncertainties are large with the current lensing data. Furthermore, in the east, the BCG is offset from other luminous cluster galaxies, which poses a puzzle for defining DM–galaxy offsets.

  11. [Classification of allergens by positive percentage agreement and cluster analysis based on specific IgE antibodies in asthmatic children].

    PubMed

    Iwasaki, E; Baba, M

    1992-10-01

    Classification and characterization of allergens is important because allergic patients are sensitized by a variety of allergens. One hundred and sixty-one sera from asthmatic children were investigated for specific IgE antibodies against 35 allergens including 20 inhalants and 15 foods by means of the MAST method. We assessed the allergenic properties of the allergens based on positive percentage agreement and cluster analysis. There was a high positive percentage agreement of specific IgE antibodies between house dust and Dermatophagoides spp., a relatively high agreement between 5 molds, cat and dog epithelium, mugwort and wormwood and 5 grasses. Among the food allergens, the positive percentage agreements were relatively high, especially between cow's milk, casein, cheese, and between 3 cereal grains. In the cluster analysis, house dust and Dermatophagoides spp. made a big cluster; therefore 32 allergens except house dust and mites were analyzed. From the results of the cluster analysis, the major cluster consisted of (1) ragweed, (2) mugwort and wormwood, (3) timothy, sweet vernal, velvet and cultivated rye, (4) wheat, barley and rice, (5) molds, (6) cow's milk, casein, soybean and cheese, (7) shrimp and crab, (8) egg white, (9) Japanese cedar, (10) dog epithelium, (11) cat epithelium. The cluster of grass pollens and cereal grains made one cluster. These results tend to confirm the presence of species cross-reactivities within the major classes of allergens.

  12. Cluster analysis for identifying sub-types of tinnitus: a positron emission tomography and voxel-based morphometry study.

    PubMed

    Schecklmann, Martin; Lehner, Astrid; Poeppl, Timm B; Kreuzer, Peter M; Hajak, Göran; Landgrebe, Michael; Langguth, Berthold

    2012-11-16

    Tinnitus is a heterogeneous disorder with respect to its etiology and phenotype. Thus, the identification of sub-types is highly relevant for treatment recommendations. To this end, we used cluster analysis of patients for whom clinical data, positron-emission tomography (PET) data and voxel-based morphometry (VBM) data were available. 44 patients with chronic tinnitus were included in this analysis. On a phenotypical level, we used tinnitus distress, duration, and laterality for clustering. To correct PET and VBM data for age, gender, and hearing, we built up a design matrix including these variables as regressors and extracted the residuals. We applied Ward's clustering method and forced cluster analysis to divide the data into two groups for both imaging and phenotypical data. On a phenotypical level the clustered groups differed only in tinnitus laterality (uni- vs. bilateral tinnitus), but not in tinnitus duration, distress, age, gender, and hearing. For grey matter volume, groups differed mainly in frontal, cingulate, temporal, and thalamic areas. For glucose metabolism, groups differed in temporal and parietal areas. The correspondence of classification was near chance level for the interrelationship of all three data set clusters. Thus, we showed that clustering according to imaging data is feasible and might depict a new approach for identifying tinnitus sub-types. However, it remains an open question to what extent the phenotypical and imaging levels may be interrelated. This article is part of a Special Issue entitled: Tinnitus Neuroscience.
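
    The abstract above applies Ward's method and forces a two-group solution. As a minimal sketch of that step only, assuming the inputs are already numeric feature vectors (the residualization against age, gender, and hearing is not reproduced here), a naive Ward agglomeration that stops at a fixed number of clusters might look like this. The function name and data layout are illustrative assumptions, and the cubic-time pair search is suitable only for small samples such as the 44 patients mentioned.

```python
def ward_clusters(points, n_clusters=2):
    """Naive Ward agglomeration stopped once n_clusters groups remain.

    At every step the two clusters whose merge gives the smallest increase
    in within-cluster sum of squares are joined.
    """
    clusters = [
        {"members": [i], "centroid": tuple(float(v) for v in p)}
        for i, p in enumerate(points)
    ]

    def merge_cost(a, b):
        # Ward criterion: n_a * n_b / (n_a + n_b) * squared centroid distance.
        na, nb = len(a["members"]), len(b["members"])
        sq = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
        return na * nb / (na + nb) * sq

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": tuple(
                (na * x + nb * y) / (na + nb)
                for x, y in zip(a["centroid"], b["centroid"])
            ),
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    # Turn the cluster membership lists into one label per input point.
    labels = [0] * len(points)
    for label, c in enumerate(clusters):
        for m in c["members"]:
            labels[m] = label
    return labels
```

    Stopping the merge loop at two clusters corresponds to cutting the dendrogram at its top split, which is one way to read "forced cluster analysis to divide the data into two groups".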

  13. Multiband Fourier Analysis and Interstellar Reddening of Variable Stars in the Globular Cluster NGC 6584

    NASA Astrophysics Data System (ADS)

    Villiger, Nathan J.; Weinschenk, Sedrick; Hettinger, Paul T.; Murphy, Brian W.

    2017-01-01

    Globular clusters are excellent objects to study to help us understand the ways in which stars evolve. Key to this understanding are RR Lyrae variable stars. This research focused on the RR Lyrae stars in the globular cluster NGC 6584 to gain a better knowledge of post main sequence stellar evolution, horizontal branch morphology, and interstellar reddening to cluster variables. Using the 0.6 m SARA telescope at CTIO, we obtained nearly 1000 images in B, V, and I bands from July 2014 through July 2015. In addition to our prior work in V-band, this research adds B and I bands. By using difference image analysis, we found 77 variable stars in our 13′ × 13′ field of view. These consisted of 66 RR Lyrae stars, 7 long period variables, and 4 eclipsing binaries. The RR Lyrae stars were divided into 50 RR0 type stars, of which 14 exhibit the Blazhko effect, and 16 RR1 type stars. We found an average period of 0.56465 days for the RR0 variables and 0.30610 days for the RR1 variables. By applying Fourier decomposition and examining the light curves in B, V, and I bands for each RR Lyrae variable, we were able to determine an average [Fe/H]_JKZW of -1.619 ± 0.090, an average E(B-V) of 0.100 ± 0.032, and a distance to the cluster of 13527 ± 939 pc. This is the first detailed study to use RR Lyrae variable stars to estimate these parameters and the results are consistent with those obtained by other methods.

  14. DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

    PubMed

    El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

    2007-01-01

    We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.

  15. Similarity and Cluster Analysis of Intermediate Deep Events in the Southeastern Aegean

    NASA Astrophysics Data System (ADS)

    Ruscic, Marija; Becker, Dirk; Brüstle, Andrea; Meier, Thomas

    2016-04-01

    In order to gain a better understanding of geodynamic processes in the Hellenic subduction zone (HSZ), in particular in the eastern part of the HSZ, we analyze a cluster of intermediate deep events in the region of Nisyros volcano. The cluster recorded during the deployment of the temporary seismic network EGELADOS consists of 159 events at 80 to 200 km depth with local magnitudes ranging from 0.2 to 4.1. The network itself consisted of 56 onshore and 23 offshore broadband stations complemented by 19 permanent stations from NOA, GEOFON and MedNet. It was deployed from September 2005 to March 2007 and covered the entire HSZ. Here, both spatial and temporal clustering of the recorded events is studied using three-component similarity analysis. The waveform cross-correlation was performed for all event combinations using data recorded on 45 onshore stations. The results are shown as a function of frequency for individual stations and as averaged values over the network. The cross-correlation coefficients at the single stations show a decreasing similarity with increasing epicentral distance as well as the effect of local heterogeneities at particular stations, causing noticeable differences in waveform similarities. Event relocation was performed using the double-difference earthquake relocation software HypoDD, and the results are compared with previously obtained single event locations, which were calculated using the nonlinear location tool NonLinLoc and station corrections. For the relocation, both differential travel times obtained by separate cross-correlation of P- and S-waveforms and manual readings of onset times are used. It is shown that after the relocation the inter-event distance for highly similar events has been reduced. By comparing the results of the cluster analysis with results obtained from the synthetic catalogs, where the event rate, portion and occurrence time of the aftershocks is varied, it is shown that the event
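
    The waveform similarity measure mentioned in this record can be illustrated with a minimal sketch of a normalized cross-correlation coefficient between two traces. The function name, the lag search window and the synthetic traces are assumptions made for illustration; the record does not describe the authors' processing chain (filtering, windowing, three-component averaging), and none of that is reproduced here.

```python
import math


def max_normalized_xcorr(a, b, max_lag):
    """Return (coefficient, lag) of the best match of b against a.

    Both traces are demeaned, and the sum of products at each lag is divided
    by the product of the full-trace norms, so identical traces give 1.0.
    A positive lag means b is delayed relative to a.
    """
    def demean(x):
        m = sum(x) / len(x)
        return [v - m for v in x]

    a, b = demean(a), demean(b)
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0.0:
        return 0.0, 0

    best_coeff, best_lag = -2.0, 0
    for lag in range(-max_lag, max_lag + 1):
        s = 0.0
        for i, v in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                s += v * b[j]
        coeff = s / norm
        if coeff > best_coeff:
            best_coeff, best_lag = coeff, lag
    return best_coeff, best_lag


# Synthetic (made-up) decaying wavelet and a copy delayed by 5 samples.
wave = [math.sin(0.3 * i) * math.exp(-0.02 * i) for i in range(200)]
delayed = [0.0] * 5 + wave[:-5]
print(max_normalized_xcorr(wave, delayed, max_lag=20))
```

    In a similarity study of this kind, such coefficients would typically be collected for every event pair into a matrix that a clustering step can then use; that workflow is a general pattern, not a detail stated in the abstract.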

  16. MC2: Dynamical Analysis of the Merging Galaxy Cluster MACS J1149.5+2223

    NASA Astrophysics Data System (ADS)

    Golovich, Nathan; Dawson, William A.; Wittman, David; Ogrean, Georgiana; van Weeren, Reinout; Bonafede, Annalisa

    2016-11-01

    We present an analysis of the merging cluster MACS J1149.5+2223 using archival imaging from Subaru/Suprime-Cam and multi-object spectroscopy from Keck/DEIMOS and Gemini/GMOS. We employ two- and three-dimensional substructure tests and determine that MACS J1149.5+2223 is composed of two separate mergers among three subclusters occurring ~1 Gyr apart. The primary merger gives rise to elongated X-ray morphology and a radio relic in the southeast. The brightest cluster galaxy is a member of the northern subcluster of the primary merger. This subcluster is very massive ($16.7^{+1.25}_{-1.60} \times 10^{14}\,M_\odot$). The southern subcluster is also very massive ($10.8^{+3.37}_{-3.54} \times 10^{14}\,M_\odot$), yet it lacks an associated X-ray surface brightness peak, and it has been unidentified previously despite the detailed study of this Frontier Field cluster. A secondary merger is occurring in the north along the line of sight (LOS) with a third, less massive subcluster ($1.20^{+0.19}_{-0.34} \times 10^{14}\,M_\odot$). We perform a Monte Carlo dynamical analysis on the main merger and estimate a collision speed at pericenter of $2770^{+610}_{-310}$ km s$^{-1}$. We show the merger to be returning from apocenter with core passage occurring $1.16^{+0.50}_{-0.25}$ Gyr before the observed state. We identify the LOS merging subcluster in a strong lensing analysis in the literature and show that it is likely bound to MACS J1149 despite having reached an extreme collision velocity of ~4000 km s$^{-1}$.

  17. Discovery of exacerbating cases in chronic hepatitis based on cluster analysis of time-series platelet count data

    NASA Astrophysics Data System (ADS)

    Hirano, Shoji; Tsumoto, Shusaku

    2007-04-01

    This paper reports the results of temporal analysis of platelet (PLT) data in a chronic hepatitis dataset. First, we briefly introduce a cluster analysis system for temporal data that we have developed. Second, we show the results of cluster analysis of PLT sequences. Third, we show the results of PLT value-based temporal analysis aimed at finding the years taken to reach F4, the years elapsed between stages, and their relationships with virus types and fibrotic stages. The results of cluster analysis indicate that the temporal courses of PLT can be grouped into several patterns, each of which shows similarity in average PLT level and increase/decrease trends. The results of value-based analysis suggest that liver fibrosis may proceed faster in the exacerbating cases.

  18. Diffusion of small clusters on metal (100) surfaces: Exact master-equation analysis for lattice-gas models

    SciTech Connect

    Sanchez, J.R.; Evans, J.W.

    1999-01-01

    Exact results are presented for the surface diffusion of small two-dimensional clusters, the constituent atoms of which are commensurate with a square lattice of adsorption sites. Cluster motion is due to the hopping of atoms along the cluster perimeter with various rates. We apply the formalism of Titulaer and Deutch [J. Chem. Phys. 77, 472 (1982)], which describes evolution in reciprocal space via a linear master equation with dimension equal to the number of cluster configurations. We focus on the regime of rapid hopping of atoms along straight close-packed edges, where certain subsets of configurations cycle rapidly between each other. Each such subset is treated as a single quasiconfiguration, thereby reducing the dimension of the evolution equation, simplifying the analysis, and elucidating limiting behavior. We also discuss the influence of concerted atom motions on the diffusion of tetramers and larger clusters. © 1999 The American Physical Society

  19. Combined cluster and discriminant analysis: An efficient chemometric approach in diesel fuel characterization.

    PubMed

    Novák, Márton; Palya, Dóra; Bodai, Zsolt; Nyiri, Zoltán; Magyar, Norbert; Kovács, József; Eke, Zsuzsanna

    2017-01-01

    Combined cluster and discriminant analysis (CCDA) as a chemometric tool in compound specific isotope analysis of diesel fuels was studied. The stable carbon isotope ratios (δ¹³C) of n-alkanes in diesel fuel can be used to characterize or differentiate diesels originating from different sources. We investigated 25 diesel fuel samples representing 20 different brands. The samples were collected from 25 different service stations in 11 European countries over a 2 year period. The n-alkane fraction of diesel fuels was separated using solid-state urea clathrate formation combined with silica gel fractionation. The stable carbon isotope ratios of C10-C24 n-alkanes were measured with gas chromatography-isotope ratio mass spectrometry (GC-IRMS) using perdeuterated n-alkanes as internal standards. Besides the 25 samples, one additional diesel fuel was prepared and measured three times to obtain totally homogeneous samples in order to test the performance of our analytical and statistical routine. Stable isotope ratio data were evaluated with hierarchical cluster analysis (HCA), principal component analysis (PCA) and CCDA. CCDA combines two multivariate data analysis methods: hierarchical cluster analysis and linear discriminant analysis (LDA). The main idea behind CCDA is to compare the goodness of preconceived (based on the sample origins) and random groupings. In CCDA all the samples were compared pairwise. The results for the parallel sample preparations showed that the analytical procedure does not have any significant effect on the δ¹³C values of n-alkanes. The three parallels proved to be totally homogeneous with CCDA. HCA and PCA can be useful tools for examining the relationship among several samples. However, these two techniques cannot always be decisive on the origin of similar samples. The initial hypothesis that all diesel fuel samples are considered chemically unique was verified by CCDA. The main advantage of CCDA is that it gives an

  20. [Achene morphology cluster analysis of Taraxacum F. H. Wigg. from northeast China and molecule systematics evidence determined by SRAP].

    PubMed

    Li, Hai-juan; Zhao, Xin; Jia, Qing-fei; Li, Tian-lai; Ning, Wei

    2012-08-01

    The achene morphological and micro-morphological characteristics of six species of the genus Taraxacum from northeastern China, together with SRAP cluster analysis, were examined as evidence for their classification. The achenes were observed by microscope and EPMA. Cluster analysis was performed on the basis of the size, shape, cone proportion, color and surface sculpture of the achenes. The inter-species differences in achene morphology within Taraxacum are obvious, particularly in spinule distribution and size, achene color and achene size. According to the clustering based on achene morphology, T. antungense Kitag. and T. urbanum Kitag. should be combined into a single species. The clustering results from achene morphology and from SRAP marker-based molecular systematics are consistent with the division of groups in "the Chinese flora". Achene morphological characteristics in Taraxacum are stable and conservative, so inter-species division and analysis of relatedness can be carried out according to combined differences in achene morphology; both the achene morphology cluster analysis and the SRAP marker-based molecular systematics support the dandelion classification of "the Chinese flora".

  1. Proper motion survey and kinematic analysis of the ρ Ophiuchi embedded cluster

    NASA Astrophysics Data System (ADS)

    Ducourant, C.; Teixeira, R.; Krone-Martins, A.; Bontemps, S.; Despois, D.; Galli, P. A. B.; Bouy, H.; Le Campion, J. F.; Rapaport, M.; Cuillandre, J. C.

    2017-01-01

    Context. The ρ Ophiuchi molecular complex and in particular the Lynds L1688 dark cloud is unique in its proximity (~130 pc), in its richness in young stars and protostars, and in its youth (0.5 Myr). It is certainly one of the best targets currently accessible from the ground to study the early phases of star-formation. Proper motion analysis is a very efficient tool for separating members of clusters from field stars, but very few proper motions are available in the ρ Ophiuchi region since most of the young sources are deeply embedded in dust and gas. Aims: We aim to perform a kinematic census of young stellar objects (YSOs) in the ρ Ophiuchi F core and partially in the E core of the L1688 dark cloud. Methods: We ran a proper motion program at the ESO New Technology Telescope (NTT) with the Son of ISAAC (SOFI) instrument over nine years in the near-infrared. We complemented these observations with various public image databases to enlarge the time base of observations and the field of investigation to 0.5° × 0.5°. We derived positions and proper motions for 2213 objects. From these, 607 proper motions were derived from SOFI observations with a 1.8 mas/yr accuracy while the remaining objects were measured only from auxiliary data with a mean precision of about 3 mas/yr. Results: We performed a kinematic analysis of the most accurate proper motions derived in this work, which allowed us to separate cluster members from field stars and to derive the mean properties of the cluster. From the kinematic analysis we derived a list of 68 members and 14 candidate members, comprising 26 new objects with a high membership probability. These new members are generally fainter than the known ones. We measured a mean proper motion of (μαcosδ, μδ) = (-8.2,-24.3) ± 0.8 mas/yr for the L1688 dark cloud. A supervised classification was applied to photometric data of members to allocate a spectral energy distribution (SED) classification to the unclassified members

  2. Spatial assessment of air quality patterns in Malaysia using multivariate analysis

    NASA Astrophysics Data System (ADS)

    Dominick, Doreena; Juahir, Hafizan; Latif, Mohd Talib; Zain, Sharifuddin M.; Aris, Ahmad Zaharin

    2012-12-01

    This study aims to investigate possible sources of air pollutants and the spatial patterns within the eight selected Malaysian air monitoring stations based on a two-year database (2008-2009). Multivariate analysis was applied to the dataset. It incorporated Hierarchical Agglomerative Cluster Analysis (HACA) to assess the spatial patterns, Principal Component Analysis (PCA) to determine the major sources of the air pollution and Multiple Linear Regression (MLR) to assess the percentage contribution of each air pollutant. The HACA results grouped the eight monitoring stations into three different clusters, based on the characteristics of the air pollutants and meteorological parameters. The PCA showed that the major sources of air pollution were emissions from motor vehicles, aircraft, industries and areas of high population density. The MLR analysis demonstrated that the main pollutant contributing to variability in the Air Pollutant Index (API) at all stations was particulate matter with a diameter of less than 10 μm (PM10). Further MLR analysis showed that the main air pollutant influencing the high concentration of PM10 was carbon monoxide (CO). This was due to combustion processes, particularly originating from motor vehicles. Meteorological factors such as ambient temperature, wind speed and humidity were also noted to influence the concentration of PM10.
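
    The grouping of eight stations into three clusters can be illustrated with a minimal sketch of agglomerative clustering. Average linkage with Euclidean distance is an assumption here, since the record does not state which linkage or distance the authors used, and the station values below are invented purely for illustration.

```python
import math


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering.

    Starts with one cluster per point and repeatedly merges the two clusters
    with the smallest mean pairwise Euclidean distance until n_clusters
    remain. Returns a list of clusters, each a list of point indices.
    """
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    def average_link(c1, c2):
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_link(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical standardized (pollutant_1, pollutant_2) values for 8 stations.
stations = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (1.00, 1.10), (1.10, 1.00),
    (2.00, 2.10), (2.10, 2.00), (2.05, 2.20),
]
print(agglomerative(stations, n_clusters=3))
```

    The quadratic search over cluster pairs is adequate for a handful of stations; for larger datasets a library routine would be the more practical choice.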

  3. Transcriptional analysis of the Streptomyces glaucescens tetracenomycin C biosynthesis gene cluster.

    PubMed Central

    Decker, H; Hutchinson, C R

    1993-01-01

    A 12.6-kb DNA fragment from Streptomyces glaucescens GLA.0 containing the 12 genes for tetracenomycin (TCM) C biosynthesis and resistance enabled Streptomyces lividans to produce TCM C. Transcriptional analysis of the tcmPG intergenic region in this cluster established the presence of two divergent promoters. The tcmIc mutation, a T-to-G transversion in the -10 region of the tcmG promoter, decreased promoter activity drastically at the stationary growth stage and time of maximum TCM C accumulation. This promoter may direct the transcription of a tcmGHIJKLMNO operon, while the other promoter is for tcmP. PMID:8509340

  4. CLASH: Weak-lensing shear-and-magnification analysis of 20 galaxy clusters

    SciTech Connect

    Umetsu, Keiichi; Czakon, Nicole; Medezinski, Elinor; Lemze, Doron; Ford, Holland; Nonino, Mario; Balestra, Italo; Biviano, Andrea; Merten, Julian; Postman, Marc; Koekemoer, Anton; Meneghetti, Massimo; Donahue, Megan; Molino, Alberto; Benítez, Narciso; Seitz, Stella; Gruen, Daniel; Broadhurst, Tom; Grillo, Claudio; Melchior, Peter; and others

    2014-11-10

    We present a joint shear-and-magnification weak-lensing analysis of a sample of 16 X-ray-regular and 4 high-magnification galaxy clusters at 0.19 ≲ z ≲ 0.69 selected from the Cluster Lensing And Supernova survey with Hubble (CLASH). Our analysis uses wide-field multi-color imaging, taken primarily with Suprime-Cam on the Subaru Telescope. From a stacked-shear-only analysis of the X-ray-selected subsample, we detect the ensemble-averaged lensing signal with a total signal-to-noise ratio of ≅ 25 in the radial range of 200-3500 kpc $h^{-1}$, providing integrated constraints on the halo profile shape and concentration-mass relation. The stacked tangential-shear signal is well described by a family of standard density profiles predicted for dark-matter-dominated halos in gravitational equilibrium, namely, the Navarro-Frenk-White (NFW), truncated variants of NFW, and Einasto models. For the NFW model, we measure a mean concentration of $c_{200c}=4.01^{+0.35}_{-0.32}$ at an effective halo mass of $M_{200c}=1.34^{+0.10}_{-0.09}\times 10^{15}\,M_\odot$. We show that this is in excellent agreement with Λ cold dark matter (ΛCDM) predictions when the CLASH X-ray selection function and projection effects are taken into account. The best-fit Einasto shape parameter is $\alpha_E=0.191^{+0.071}_{-0.068}$, which is consistent with the NFW-equivalent Einasto parameter of ∼0.18. We reconstruct projected mass density profiles of all CLASH clusters from a joint likelihood analysis of shear-and-magnification data and measure cluster masses at several characteristic radii assuming an NFW density profile. We also derive an ensemble-averaged total projected mass profile of the X-ray-selected subsample by stacking their individual mass profiles. The stacked total mass profile, constrained by the shear+magnification data, is shown to be consistent with our shear-based halo-model predictions, including the effects of surrounding large-scale structure as

  5. Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles

    SciTech Connect

    Yamaguchi, Nobuyuki; Han, Junqiao; Ghatikar, Girish; Piette, Mary Ann; Asano, Hiroshi; Kiliccote, Sila

    2009-06-28

    This paper provides new regression models for demand reduction of Demand Response programs for the purpose of ex ante evaluation of the programs and screening for recruiting customer enrollment into the programs. The proposed regression models employ load sensitivity to outside air temperature and a representative load pattern derived from cluster analysis of customer baseline load as explanatory variables. The performance of the proposed models was examined in terms of the validity of the explanatory variables and the fit of the regressions, using actual load profile data of Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program including Manual and Automated Demand Response.

  6. A new point of view in the analysis of equilibrium and dynamical evolution of globular clusters.

    NASA Astrophysics Data System (ADS)

    Merafina, M.

    We develop models of globular clusters (GCs) with a different approach by applying thermodynamic principles to a Boltzmann distribution function, with a Hamiltonian function which contains an effective potential depending on the kinetic energy of the stars, due to the effect of tidal interactions induced by the hosting galaxy. The Hamiltonian function is a solution of the Fokker-Planck equation solved in a different way with respect to the King approach. Interesting results implying a different caloric curve for the analysis of the evolution of GCs are presented.

  7. Time autocorrelation function analysis of master equation and its application to atomic clusters.

    PubMed

    Zhang, Chi; Berry, R Stephen

    2005-09-01

    We derive the energy fluctuation $\Delta^2 E$, and the time autocorrelation $\kappa(\tau)$ and its Fourier transformation--the fluctuation spectra $S(\omega)$--of the master-equation transition matrix. The contribution from each eigenmode of the transition matrix to these fluctuation quantities reveals the relative importance of the individual mode in the relaxation processes. The time scales associated with these relaxation processes are determined by the corresponding eigenvalues. Unlike traditional time evolution analysis, the autocorrelation function and fluctuation spectra analysis does not involve an arbitrary initial population. It is also more suitable for analyzing the underlying dynamic, kinetic behavior near the equilibrium and the behavior of the long-time-scale rare events. We utilize our technique to analyze the solid-liquid phase coexistence of the 13-atom Morse cluster and the fcc-to-icosahedral structure transition of the 38-atom Lennard-Jones cluster. For the processes studied, the fluctuation spectra from the master equation simplify the analysis of the transition matrix, and the important relaxation modes are easily extracted.
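
    As a rough illustration of an equilibrium energy autocorrelation computed from a master equation, the sketch below evaluates kappa(tau) = sum_ij E_j [exp(W tau)]_ji E_i p_i - <E>^2 for a small transition matrix W, where p is the equilibrium population. The convention dP/dt = W P (with W[j][i] the rate from state i to state j), the truncated Taylor-series matrix exponential and the two-state toy system are all assumptions for illustration; the paper's own derivation works through the eigenmodes of W and is not reproduced here.

```python
import math


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(W, t, terms=40):
    """exp(W t) by a truncated Taylor series (fine for small |W t|)."""
    n = len(W)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        step = [[W[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def energy_autocorrelation(W, p_eq, energies, tau):
    """Equilibrium autocorrelation of the energy fluctuation at lag tau."""
    prop = mat_exp(W, tau)  # prop[j][i] = P(state j at tau | state i at 0)
    n = len(W)
    mean_e = sum(p * e for p, e in zip(p_eq, energies))
    corr = sum(energies[j] * prop[j][i] * energies[i] * p_eq[i]
               for i in range(n) for j in range(n))
    return corr - mean_e ** 2


# Toy two-state system: a -> b at rate 2, b -> a at rate 1 (made-up rates).
W = [[-2.0, 1.0],
     [2.0, -1.0]]
p_eq = [1.0 / 3.0, 2.0 / 3.0]   # solves W p = 0
energies = [0.0, 1.0]
for tau in (0.0, 0.5, 1.0):
    print(tau, energy_autocorrelation(W, p_eq, energies, tau))
```

    For this toy system the only nonzero eigenvalue of W is -3, so the autocorrelation decays as a single exponential with relaxation time 1/3, starting from the equilibrium energy variance p_a p_b (E_a - E_b)^2 = 2/9.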

  8. Redefining the Breast Cancer Exosome Proteome by Tandem Mass Tag Quantitative Proteomics and Multivariate Cluster Analysis.

    PubMed

    Clark, David J; Fondrie, William E; Liao, Zhongping; Hanson, Phyllis I; Fulton, Amy; Mao, Li; Yang, Austin J

    2015-10-20

    Exosomes are microvesicles of endocytic origin constitutively released by multiple cell types into the extracellular environment. With evidence that exosomes can be detected in the blood of patients with various malignancies, the development of a platform that uses exosomes as a diagnostic tool has been proposed. However, it has been difficult to truly define the exosome proteome due to the challenge of discerning contaminant proteins that may be identified via mass spectrometry using various exosome enrichment strategies. To better define the exosome proteome in breast cancer, we incorporated a combination of Tandem-Mass-Tag (TMT) quantitative proteomics approach and Support Vector Machine (SVM) cluster analysis of three conditioned media derived fractions corresponding to a 10 000g cellular debris pellet, a 100 000g crude exosome pellet, and an Optiprep enriched exosome pellet. The quantitative analysis identified 2 179 proteins in all three fractions, with known exosomal cargo proteins displaying at least a 2-fold enrichment in the exosome fraction based on the TMT protein ratios. Employing SVM cluster analysis allowed for the classification of 251 proteins as "true" exosomal cargo proteins. This study provides a robust and vigorous framework for the future development of using exosomes as a potential multiprotein marker phenotyping tool that could be useful in breast cancer diagnosis and monitoring disease progression.

  9. Patterns in longitudinal growth of refraction in Southern Chinese children: cluster and principal component analysis.

    PubMed

    Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang

    2016-11-22

    In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in the Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7-15 years) were available in the analysis. Clusters 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: "Average refraction", "Acceleration" and the combination of "Myopia stabilization" and "Late onset of refraction progress". In regression models, younger children with more severe myopia were associated with larger "Acceleration". The risk factors of "Acceleration" included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with "Stabilization", and increased outdoor time was related to "Late onset of refraction progress". We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression.

  10. Patterns in longitudinal growth of refraction in Southern Chinese children: cluster and principal component analysis

    PubMed Central

    Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang

    2016-01-01

    In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in the Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7–15 years) were available in the analysis. Clusters 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: “Average refraction”, “Acceleration” and the combination of “Myopia stabilization” and “Late onset of refraction progress”. In regression models, younger children with more severe myopia were associated with larger “Acceleration”. The risk factors of “Acceleration” included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with “Stabilization”, and increased outdoor time was related to “Late onset of refraction progress”. We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression. PMID:27874105

  11. Portraying Persons Who Inject Drugs Recently Infected with Hepatitis C Accessing Antiviral Treatment: A Cluster Analysis

    PubMed Central

    Bamvita, Jean-Marie; Roy, Elise; Levesque, Annie; Bruneau, Julie

    2014-01-01

    Objectives. To empirically determine a categorization of people who inject drugs (PWIDs) recently infected with hepatitis C virus (HCV), in order to identify profiles most likely associated with early HCV treatment uptake. Methods. The study population was composed of HIV-negative PWIDs with a documented recent HCV infection. Eligibility criteria included being 18 years old or over, and having injected drugs in the 6 months preceding the estimated date of HCV exposure. Participant classification was carried out using a TwoStep cluster analysis. Results. From September 2007 to December 2011, 76 participants were included in the study. 60 participants were eligible for HCV treatment. Twenty-one participants initiated HCV treatment. The cluster analysis yielded 4 classes: class 1: lukewarm health seekers dismissing HCV treatment offer; class 2: multisubstance users willing to shake off the hell; class 3: PWIDs unlinked to health service use; class 4: health seeker PWIDs willing to reverse the fate. Conclusion. Profiles generated by our analysis suggest that prior health care utilization, a key element for treatment uptake, differs between older and younger PWIDs. Such profiles could inform the development of targeted strategies to improve health outcomes and reduce HCV infection among PWIDs. PMID:25349730

  12. Investigating properties of a set of variable AGN with cluster analysis

    NASA Astrophysics Data System (ADS)

    Nair, A. D.

    1997-05-01

    Optical and gamma-ray properties of a sample of active galactic nuclei monitored at the Rosemary Hill Observatory are analysed using cluster analysis. Cluster analysis can be used to analyse large amounts of data with many variables and investigate linear or non-linear relationships in the data. It is found that the time-scale of variation is not related to the amplitude of variability. For BL Lacs and optically violent variable (OVV) quasars the variability is proportional to the redshift and absolute magnitude, but this is not true for quasars in this sample. The analysis shows that gamma-ray-loud AGN tend to be associated with superluminal sources with OVV-like characteristics. The gamma-ray fluxes, for both OVV quasars and BL Lacs, are proportional to the apparent transverse velocity, and this may point to beaming as the dominant cause for the gamma-ray flux. A large majority of the OVV quasars that display a large amplitude of variability are gamma-ray-loud, but this is not true for BL Lacs.

  13. Modelling childhood caries using parametric competing risks survival analysis methods for clustered data.

    PubMed

    Stephenson, J; Chadwick, B L; Playle, R A; Treasure, E T

    2010-01-01

    Caries in primary teeth is an ongoing issue in children's dental health. Its quantification is affected by clustering of data within children and the concurrent risk of exfoliation of primary teeth. This analysis of caries data of 103,776 primary molar tooth surfaces from a cohort study of 2,654 British children aged 4-5 years at baseline applied multilevel competing risks survival analysis methodology to identify factors significantly associated with caries occurrence in primary tooth surfaces in the presence of the concurrent risk of exfoliation, and assessed the effect of exfoliation on caries development. Multivariate multilevel parametric survival models were applied at surface level to the analysis of the sound-carious and sound-exfoliation transitions to which primary tooth surfaces are subject. Socio-economic class, fluoridation status and surface type were found to be the strongest predictors of primary caries, with the highest rates of occurrence and lowest median survival times associated with occlusal surfaces of children from poor socio-economic class living in non-fluoridated areas. The concurrent risk of exfoliation was shown to reduce the distinction in survival experience between different types of surfaces, and between surfaces of teeth from children of different socio-economic class or fluoridation status. Clustering of data had little effect on inferences of parameter significance.

  14. Reducing monitoring costs in industrially contaminated rivers: cluster and regression analysis approach.

    PubMed

    Ruman, M; Olkowska, E; Kozioł, K; Absalon, D; Matysik, M; Polkowska, Ż

    2014-03-01

    Monitoring contamination in river water is an expensive procedure, particularly for developing countries where pollution is a significant problem. This study was conducted to provide a pollution monitoring strategy that reduces the cost of laboratory analysis. The new monitoring strategy was designed as a result of cluster and regression analysis on field data collected from an industrially influenced river. Pollution sources in the study site were coal mining, metallurgy, chemical industry, and metropolitan sewage. This river resembles those in other areas of the world, including developing countries where environmental monitoring is financially constrained. Data were collected on variability of contaminant concentrations during four seasons at the same points on tributaries of the river. The variables described in the study are pH, electrical conductivity, inorganic ions, trace elements, and selected organic pollutants. These variables were divided into groups using cluster analysis. These groups were then tested using regression models to identify how the behavior of one variable changes in relation to another. It was found that up to 86.8% of variability of one parameter could be determined by another in the dataset. We adopted 60, 65, and 70% determination levels for accepting a regression model. As a result, monitoring could be reduced by 15 (60% level) and 10 variables (65 and 70%) out of 43, which comprises 35 and 23% of the monitored variable total. Cost reduction would be most effective if trace elements or organic pollutants were excluded from monitoring because these are the constituents most expensive to analyze.
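
    The idea of accepting a regression model once one variable explains a set share of another's variability can be sketched as a pairwise screening step. The version below uses simple linear regression, for which the coefficient of determination equals the squared Pearson correlation. The function names, the variable names and the measurements are hypothetical, and the record does not say that the authors restricted themselves to simple linear models.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x
    (equal to the squared Pearson correlation). Returns 0.0 if either
    variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def predictable_pairs(data, threshold):
    """Return (name_a, name_b, r2) for every variable pair whose R^2 meets
    the threshold, i.e. pairs where one variable could stand in for the
    other in a reduced monitoring scheme."""
    names = sorted(data)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r2 = r_squared(data[a], data[b])
            if r2 >= threshold:
                pairs.append((a, b, r2))
    return pairs


# Made-up measurements for three hypothetical monitored variables.
data = {
    "conductivity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "chloride": [3.0, 5.0, 7.0, 9.0, 11.0],
    "zinc": [2.0, 1.0, 4.0, 3.0, 5.0],
}
for level in (0.60, 0.65, 0.70):
    print(level, predictable_pairs(data, level))
```

    Raising the threshold shrinks the list of replaceable variables, which mirrors the trade-off in the record between the 60% level and the stricter 65 and 70% levels.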

  15. Multiple large clusters of tuberculosis in London: a cross-sectional analysis of molecular and spatial data.

    PubMed

    Smith, Catherine M; Maguire, Helen; Anderson, Charlotte; Macdonald, Neil; Hayward, Andrew C

    2017-01-01

    Large outbreaks of tuberculosis (TB) represent a particular threat to disease control because they reflect multiple instances of active transmission. The extent to which long chains of transmission contribute to high TB incidence in London is unknown. We aimed to estimate the contribution of large clusters to the burden of TB in London and identify risk factors. We identified TB patients resident in London notified between 2010 and 2014, and used 24-locus mycobacterial interspersed repetitive units-variable number tandem repeat strain typing data to classify cases according to molecular cluster size. We used spatial scan statistics to test for spatial clustering and analysed risk factors through multinomial logistic regression. TB isolates from 7458 patients were included in the analysis. There were 20 large molecular clusters (with n>20 cases), comprising 795 (11%) of all cases; 18 (90%) large clusters exhibited significant spatial clustering. Cases in large clusters were more likely to be UK born (adjusted odds ratio 2.93, 95% CI 2.28-3.77), of black-Caribbean ethnicity (adjusted odds ratio 3.64, 95% CI 2.23-5.94) and have multiple social risk factors (adjusted odds ratio 3.75, 95% CI 1.96-7.16). Large clusters of cases contribute substantially to the burden of TB in London. Targeting interventions such as screening in deprived areas and social risk groups, including those of black ethnicities and born in the UK, should be a priority for reducing transmission.

  16. Multiple large clusters of tuberculosis in London: a cross-sectional analysis of molecular and spatial data

    PubMed Central

    Maguire, Helen; Anderson, Charlotte; Macdonald, Neil; Hayward, Andrew C.

    2017-01-01

    Large outbreaks of tuberculosis (TB) represent a particular threat to disease control because they reflect multiple instances of active transmission. The extent to which long chains of transmission contribute to high TB incidence in London is unknown. We aimed to estimate the contribution of large clusters to the burden of TB in London and identify risk factors. We identified TB patients resident in London notified between 2010 and 2014, and used 24-locus mycobacterial interspersed repetitive units–variable number tandem repeat strain typing data to classify cases according to molecular cluster size. We used spatial scan statistics to test for spatial clustering and analysed risk factors through multinomial logistic regression. TB isolates from 7458 patients were included in the analysis. There were 20 large molecular clusters (with n>20 cases), comprising 795 (11%) of all cases; 18 (90%) large clusters exhibited significant spatial clustering. Cases in large clusters were more likely to be UK born (adjusted odds ratio 2.93, 95% CI 2.28–3.77), of black-Caribbean ethnicity (adjusted odds ratio 3.64, 95% CI 2.23–5.94) and have multiple social risk factors (adjusted odds ratio 3.75, 95% CI 1.96–7.16). Large clusters of cases contribute substantially to the burden of TB in London. Targeting interventions such as screening in deprived areas and social risk groups, including those of black ethnicities and born in the UK, should be a priority for reducing transmission. PMID:28149918

  17. Improved initialisation of model-based clustering using Gaussian hierarchical partitions

    PubMed Central

    Scrucca, Luca; Raftery, Adrian E.

    2015-01-01

    Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and so to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mclust R package. This choice is computationally convenient and often yields good clustering partitions. However, in certain circumstances, poor initial partitions may cause the EM algorithm to converge to a local maximum of the likelihood function. We propose several simple and fast refinements based on data transformations and illustrate them through data examples. PMID:26949421

  18. Transcriptional Analysis of Essential Genes of the Escherichia coli Fatty Acid Biosynthesis Gene Cluster by Functional Replacement with the Analogous Salmonella typhimurium Gene Cluster

    PubMed Central

    Zhang, Yan; Cronan, John E.

    1998-01-01

    The genes encoding several key fatty acid biosynthetic enzymes (called the fab cluster) are clustered in the order plsX-fabH-fabD-fabG-acpP-fabF at min 24 of the Escherichia coli chromosome. A difficulty in analysis of the fab cluster by the polar allele duplication approach (Y. Zhang and J. E. Cronan, Jr., J. Bacteriol. 178:3614–3620, 1996) is that several of these genes are essential for the growth of E. coli. We overcame this complication by use of the fab gene cluster of Salmonella typhimurium, a close relative of E. coli, to provide functions necessary for growth. The S. typhimurium fab cluster was isolated by complementation of an E. coli fabD mutant and was found to encode proteins with >94% homology to those of E. coli. However, the S. typhimurium sequences cannot recombine with the E. coli sequences required to direct polar allele duplication via homologous recombination. Using this approach, we found that although approximately 60% of the plsX transcripts initiate at promoters located far upstream and include the upstream rpmF ribosomal protein gene, a promoter located upstream of the plsX coding sequence (probably within the upstream gene, rpmF) is sufficient for normal growth. We have also found that the fabG gene is obligatorily cotranscribed with upstream genes. Insertion of a transcription terminator cassette (Ω-Cm cassette) between the fabD and fabG genes of the E. coli chromosome abolished fabG transcription and blocked cell growth, thus providing the first indication that fabG is an essential gene. Insertion of the Ω-Cm cassette between fabH and fabD caused greatly decreased transcription of the fabD and fabG genes and slower cellular growth, indicating that fabD has only a weak promoter(s). PMID:9642179

  19. Heterogeneity of Severe Asthma in Childhood: Confirmation by Cluster Analysis of Children in the NIH/NHLBI Severe Asthma Research Program (SARP)

    PubMed Central

    Fitzpatrick, Anne M.; Teague, W. Gerald; Meyers, Deborah A.; Peters, Stephen P.; Li, Xingnan; Li, Huashi; Wenzel, Sally E.; Aujla, Shean; Castro, Mario; Bacharier, Leonard B.; Gaston, Benjamin M.; Bleecker, Eugene R.; Moore, Wendy C.

    2011-01-01

    Background Asthma in children is a heterogeneous disorder with many phenotypes. Although unsupervised cluster analysis is a useful tool for identifying phenotypes, it has not been applied to school-age children with persistent asthma across a wide range of severities. Objectives This study determined how children with severe asthma are distributed across a cluster analysis and how well these clusters conform to current definitions of asthma severity. Methods Cluster analysis was applied to 12 continuous and composite variables from 161 children at 5 centers enrolled in the Severe Asthma Research Program (SARP). Results Four clusters of asthma were identified. Children in Cluster 1 (n = 48) had relatively normal lung function and less atopy, while children in Cluster 2 (n = 52) had slightly lower lung function, more atopy, and increased symptoms and medication usage. Cluster 3 (n = 32) had greater co-morbidity, increased bronchial responsiveness and lower lung function. Cluster 4 (n = 29) had the lowest lung function and the greatest symptoms and medication usage. Predictors of cluster assignment were asthma duration, the number of asthma controller medications, and baseline lung function. Children with severe asthma were present in all clusters, and no cluster corresponded to definitions of asthma severity provided in asthma treatment guidelines. Conclusions Severe asthma in children is highly heterogeneous. Unique phenotypic clusters previously identified in adults can also be identified in children, but with important differences. Larger validation and longitudinal studies are needed to determine the baseline and predictive validity of these phenotypic clusters in the larger clinical setting. PMID:21195471

  20. Large-scale analysis of conserved rare codon clusters suggests an involvement in co-translational molecular recognition events

    PubMed Central

    Chartier, Matthieu; Gaudreault, Francis; Najmanovich, Rafael

    2012-01-01

    Motivation: An increasing amount of evidence from experimental and computational analysis suggests that rare codon clusters are functionally important for protein activity. Most of the studies on rare codon clusters were performed on a limited number of proteins or protein families. In the present study, we present the Sherlocc program and how it can be used for large scale protein family analysis of evolutionarily conserved rare codon clusters and their relation to protein function and structure. This large-scale analysis was performed using the whole Pfam database covering over 70% of the known protein sequence universe. Our program Sherlocc, detects statistically relevant conserved rare codon clusters and produces a user-friendly HTML output. Results: Statistically significant rare codon clusters were detected in a multitude of Pfam protein families. The most statistically significant rare codon clusters were predominantly identified in N-terminal Pfam families. Many of the longest rare codon clusters are found in membrane-related proteins which are required to interact with other proteins as part of their function, for example in targeting or insertion. We identified some cases where rare codon clusters can play a regulating role in the folding of catalytically important domains. Our results support the existence of a widespread functional role for rare codon clusters across species. Finally, we developed an online filter-based search interface that provides access to Sherlocc results for all Pfam families. Availability: The Sherlocc program and search interface are open access and are available at http://bcb.med.usherbrooke.ca Contact: rafael.najmanovich@usherbrooke.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22467916

  1. Longitudinal comparisons of dietary patterns derived by cluster analysis in 7- to 13-year-old children.

    PubMed

    Northstone, Kate; Smith, Andrew D A C; Newby, P K; Emmett, Pauline M

    2013-06-01

    Little is known about changes in dietary patterns over time. The present study aims to derive dietary patterns using cluster analysis at three ages in children and track these patterns over time. In all, 3 d diet diaries were completed for children from the Avon Longitudinal Study of Parents and Children at 7, 10 and 13 years. Children were grouped based on the similarities between average weight consumed (g/d) of sixty-two food groups using k-means cluster analysis. A total of four clusters were obtained at each age, with very similar patterns being described at each time point: Processed (high consumption of processed foods, chips and soft drinks), Healthy (high consumption of high-fibre bread, fruit, vegetables and water), Traditional (high consumption of meat, potatoes and vegetables) and Packed Lunch (high consumption of white bread, sandwich fillings and snacks). The number of children remaining in the same cluster at different ages was reasonably high: 50 and 43% of children in the Healthy and Processed clusters, respectively, at age 7 years were in the same clusters at age 13 years. Maternal education was the strongest predictor of remaining in the Healthy cluster at each time point – children whose mothers had the highest level of education were nine times more likely to remain in that cluster compared to those with the lowest. Cluster analysis provides a simple way of examining changes in dietary patterns over time, and similar underlying patterns of diet at two ages during late childhood, that persisted through to early adolescence.

  2. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel-based "mouse pup syllable classification calculator".

    PubMed

    Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J

    2012-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  3. Front Crawl Sprint Performance: A Cluster Analysis of Biomechanics, Energetics, Coordinative, and Anthropometric Determinants in Young Swimmers.

    PubMed

    Figueiredo, Pedro; Silva, Ana; Sampaio, António; Vilas-Boas, João Paulo; Fernandes, Ricardo J

    2016-07-01

    The aim of this study was to evaluate the determinants of front crawl sprint performance of young swimmers using a cluster analysis. A total of 103 swimmers, aged 11 to 13 years, performed 25-m front crawl swimming at 50-m pace, recorded by two underwater cameras. The analysis of the swimmers included biomechanical, energetic, coordinative, and anthropometric characteristics. Organizing the subjects into meaningful clusters yielded three groups (1.52 ± 0.16, 1.47 ± 0.17 and 1.40 ± 0.15 m/s, for Clusters 1, 2 and 3, respectively), with differences in velocity between Clusters 1 and 2 compared with Cluster 3 (p = .003). Anthropometric variables were the strongest determinants of the cluster solution. Stroke length and stroke index were also considered relevant. In addition, differences between Cluster 1 and the others were also found for critical velocity, stroke rate and intracycle velocity variation (p < .05). It can be concluded that anthropometrics, technique and energetics (swimming efficiency) are determinant domains of young swimmers' sprint performance.

  4. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method.

    PubMed

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-07-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous studies. We also compared pairwise distances (between geographically separated samples) with those obtained using the AMOVA method and found good agreement. Further analyses that are impossible with AMOVA were made using the discrete Laplace method: analysis of the homogeneity in two different ways and calculating marginal STR distributions. We found that the Y-STR haplotypes from e.g. Finland were relatively homogeneous as opposed to the relatively heterogeneous Y-STR haplotypes from e.g. Lublin, Eastern Poland and Berlin, Germany. We demonstrated that the observed distributions of alleles at each locus were similar to the expected ones. We also compared pairwise distances between geographically separated samples from Africa with those obtained using the AMOVA method and found good agreement.

  5. Multi-level bootstrap analysis of stable clusters in resting-state fMRI.

    PubMed

    Bellec, Pierre; Rosa-Neto, Pedro; Lyttelton, Oliver C; Benali, Habib; Evans, Alan C

    2010-07-01

    A variety of methods have been developed to identify brain networks with spontaneous, coherent activity in resting-state functional magnetic resonance imaging (fMRI). We propose here a generic statistical framework to quantify the stability of such resting-state networks (RSNs), which was implemented with k-means clustering. The core of the method consists in bootstrapping the available datasets to replicate the clustering process a large number of times and quantify the stable features across all replications. This bootstrap analysis of stable clusters (BASC) has several benefits: (1) it can be implemented in a multi-level fashion to investigate stable RSNs at the level of individual subjects and at the level of a group; (2) it provides a principled measure of RSN stability; and (3) the maximization of the stability measure can be used as a natural criterion to select the number of RSNs. A simulation study validated the good performance of the multi-level BASC on purely synthetic data. Stable networks were also derived from a real resting-state study for 43 subjects. At the group level, seven RSNs were identified which exhibited a good agreement with the previous findings from the literature. The comparison between the individual and group-level stability maps demonstrated the capacity of BASC to establish successful correspondences between these two levels of analysis and at the same time retain some interesting subject-specific characteristics, e.g. the specific involvement of subcortical regions in the visual and fronto-parietal networks for some subjects.
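
    The core bootstrap step described in this abstract (replicate the clustering many times and measure how often two regions land in the same cluster) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the BASC implementation: it resamples time points independently rather than in blocks, uses a plain k-means with farthest-point initialisation, and runs on synthetic series. All function names are hypothetical.

```python
import random

def kmeans_labels(points, k, iters=20):
    """Lloyd's k-means with farthest-point initialisation; returns one label per point."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def stability_matrix(series, k, n_boot=50, seed=0):
    """Fraction of bootstrap replications in which regions i and j share a cluster."""
    rng = random.Random(seed)
    n, t = len(series), len(series[0])
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_boot):
        cols = [rng.randrange(t) for _ in range(t)]  # resample time points with replacement
        resampled = [[row[c] for c in cols] for row in series]
        labels = kmeans_labels(resampled, k)
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / n_boot for c in row] for row in counts]

# Synthetic data: six "regions" in two well-separated groups (illustration only).
rng = random.Random(1)
base = [rng.gauss(0, 1) for _ in range(30)]
series = [[b + offset + rng.gauss(0, 0.1) for b in base]
          for offset in (0, 0, 0, 10, 10, 10)]
stab = stability_matrix(series, k=2)
print(stab[0][1], stab[0][3])
```

    Entries near 1 mark pairs of regions that are clustered together in almost every replication, and entries near 0 mark pairs that are almost never grouped. The abstract's criterion for choosing the number of networks would repeat this for several values of k and compare a summary of the stability values.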

  6. Cluster analysis on the bulk elemental compositions of Antarctic stony meteorites

    NASA Astrophysics Data System (ADS)

    Miyamoto, Hideaki; Niihara, Takafumi; Kuritani, Takeshi; Hong, Peng K.; Dohm, James M.; Sugita, Seiji

    2016-05-01

    Remote sensing observations by recent successful missions to small bodies have revealed the difficulty in classifying the materials which cover their surfaces into a conventional classification of meteorites. Although reflectance spectroscopy is a powerful tool for this purpose, it is influenced by many factors, such as space weathering, lighting conditions, and surface physical conditions (e.g., particle size and style of mixing). Thus, complementary information, such as elemental compositions, which can be obtained by X-ray fluorescence (XRF) and gamma-ray spectrometers (GRS), have been considered very important. However, classifying planetary materials solely based on elemental compositions has not been investigated extensively. In this study, we perform principal component and cluster analyses on 12 major and minor elements of the bulk compositions of 500 meteorites reported in the National Institute of Polar Research (NIPR), Japan database. Our unique approach, which includes using hierarchical cluster analysis, indicates that meteorites can be classified into about 10 groups purely by their bulk elemental compositions. We suggest that Si, Fe, Mg, Ca, and Na are the optimal set of elements, as this set has been used successfully to classify meteorites of the NIPR database with more than 94% accuracy. Principal components analysis indicates that elemental compositions of meteorites form eight clusters in the three-dimensional space of the components. The three major principal components (PC1, PC2, and PC3) are interpreted as (1) degree of differentiations of the source body (i.e., primitive versus differentiated), (2) degree of thermal effects, and (3) degree of chemical fractionation, respectively.

  7. Unsupervised change detection in satellite images using fuzzy c-means clustering and principal component analysis

    NASA Astrophysics Data System (ADS)

    Kesikoğlu, M. H.; Atasever, Ü. H.; Özkan, C.

    2013-10-01

    Change detection analysis is the process of identifying changes in nature or in the state of objects from observations made at different times, or of quantifying temporal effects using multitemporal data sets. Many change detection techniques appear in the literature, and they can be grouped under two main headings: supervised and unsupervised change detection. This study aims to identify land cover changes in a specific area of Kayseri with unsupervised change detection techniques, using remotely sensed Landsat satellite images from different years. First, image-to-image registration is performed so that the images are referenced to each other. After image enhancement, the image differencing method is applied to the images. The resulting gray scale difference image is partitioned into 3 × 3 nonoverlapping blocks. Principal Component Analysis is then applied to obtain the eigenvector space, from which the principal components are derived. Finally, the feature vector space consisting of the principal components is partitioned into two clusters (changed and unchanged areas) using Fuzzy C-Means Clustering, which completes the change detection process.
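
    A reduced Python sketch of the pipeline in this abstract follows. It keeps the image differencing, the 3 × 3 nonoverlapping blocks, and the two-cluster fuzzy c-means step, but it replaces the PCA feature vectors with a single feature per block (the mean absolute difference), so it is a simplification and not the authors' method. The function names and the toy images are hypothetical, and the images are assumed to be already co-registered.

```python
def block_features(img_a, img_b, size=3):
    """Mean absolute difference of two co-registered images over size x size blocks."""
    rows, cols = len(img_a), len(img_a[0])
    feats = []
    for r in range(0, rows - rows % size, size):
        for c in range(0, cols - cols % size, size):
            total = sum(abs(img_a[i][j] - img_b[i][j])
                        for i in range(r, r + size) for j in range(c, c + size))
            feats.append(total / (size * size))
    return feats

def fuzzy_c_means(values, c=2, m=2.0, iters=50):
    """Fuzzy c-means on scalar features (c >= 2); returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]  # spread initial centers
    u = []
    for _ in range(iters):
        u = []
        for x in values:
            d = [abs(x - ck) for ck in centers]
            if 0.0 in d:  # point sits exactly on a center: full membership there
                hits = d.count(0.0)
                u.append([1.0 / hits if dk == 0.0 else 0.0 for dk in d])
            else:
                inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
                s = sum(inv)
                u.append([w / s for w in inv])
        centers = [
            sum(row[k] ** m * x for row, x in zip(u, values))
            / sum(row[k] ** m for row in u)
            for k in range(c)
        ]
    return centers, u

# Toy 6 x 6 images: one strongly changed block and one faintly changed block.
before = [[10] * 6 for _ in range(6)]
after = [row[:] for row in before]
for i in range(3):
    for j in range(3):
        after[i][j] += 50      # strong change in the top-left block
        after[i][j + 3] += 1   # faint change in the top-right block
feats = block_features(before, after)
centers, u = fuzzy_c_means(feats)
changed_cluster = centers.index(max(centers))  # cluster with the larger differences
changed = [row.index(max(row)) == changed_cluster for row in u]
print(changed)
```

    Each block receives a membership degree in both clusters, and the final change map assigns it to the cluster with the larger membership. With the PCA step restored, each block would contribute a vector of principal-component scores instead of a single number.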

  8. Similarity and Cluster Analysis of Intermediate Deep Events in the Southeastern Aegean

    NASA Astrophysics Data System (ADS)

    Ruscic, M.; Meier, T. M.; Becker, D.; Brüstle, A.

    2015-12-01

    In order to gain a better understanding of geodynamic processes in the Hellenic subduction zone (HSZ), in particular in the eastern part of the HSZ, we analyze a cluster of intermediate deep events in the region of Nisyros volcano. The events were recorded by the temporary seismic network EGELADOS deployed from September 2005 to March 2007. The network covered the entire Hellenic subduction zone and it consisted of 23 offshore and 56 onshore broadband stations complemented by 19 permanent stations from NOA, GEOFON and MedNet. The cluster of intermediate deep seismicity consists of 159 events with local magnitudes ranging from magnitude 0.2 to magnitude 4.1 at depths from 80 to 200 km. The events occur close to the top of the slab in a zone about 30 km thick. The spatio-temporal clustering is studied using three component similarity analysis. Single event locations obtained using the nonlinear location tool NonLinLoc are compared to relative relocations calculated using the double-difference earthquake relocation software HypoDD. The relocation is performed with both manual readings of onset times as well as with differential traveltimes obtained by separate cross-correlation of P- and S-waveforms. The three-component waveform cross-correlation was performed for all the events using data from 45 stations. The results of the similarity analysis are shown as a function of frequency for individual stations and averaged over the network. Average similarities between waveforms of all event pairs reveal a low number of highly similar events but a large number of moderate similarities. Interestingly, the single station similarities between the event pairs show (1) in general decreasing similarity with increasing epicentral distance, (2) reduced similarities for paths crossing boundaries of slab segments, and (3) the influence of strong local heterogeneity leading to a considerable reduction of waveform similarities e.g. in the center of the Santorini volcano.
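
    Waveform similarity of the kind mentioned in this abstract is commonly measured as the maximum of the normalised cross-correlation over a range of time lags. The Python sketch below shows that measure for a single pair of single-component traces. It is an illustration under assumptions, not the processing used in the study: there is no filtering, no three-component averaging, and the function name and traces are hypothetical.

```python
import math

def max_normalized_xcorr(a, b, max_lag):
    """Largest normalised cross-correlation of two equal-length traces and its lag."""
    def demean(x):
        m = sum(x) / len(x)
        return [v - m for v in x]
    a, b = demean(a), demean(b)
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0:
        return 0.0, 0  # at least one trace is constant
    n = len(a)
    best, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over samples where both traces are defined at this lag.
        s = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        if s / norm > best:
            best, best_lag = s / norm, lag
    return best, best_lag

# Two toy traces: the same pulse, delayed by two samples in the second trace.
trace_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
trace_b = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]
similarity, lag = max_normalized_xcorr(trace_a, trace_b, max_lag=3)
print(round(similarity, 3), lag)
```

    The lag at the maximum is the differential travel time in samples, which is the kind of quantity a double-difference relocation uses, and the maximum itself is the similarity value that a cluster analysis of events would be built on.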

  9. High resolution spectroscopic analysis of seven giants in the bulge globular cluster NGC 6723

    NASA Astrophysics Data System (ADS)

    Rojas-Arriagada, A.; Zoccali, M.; Vásquez, S.; Ripepi, V.; Musella, I.; Marconi, M.; Grado, A.; Limatola, L.

    2016-03-01

    Context. Globular clusters associated with the Galactic bulge are important tracers of stellar populations in the inner Galaxy. High resolution analysis of stars in these clusters allows us to characterize them in terms of kinematics, metallicity, and individual abundances, and to compare these fingerprints with those characterizing field populations. Aims: We present iron and element ratios for seven red giant stars in the globular cluster NGC 6723, based on high resolution spectroscopy. Methods: High resolution spectra (R ~ 48 000) of seven K giants belonging to NGC 6723 were obtained with the FEROS spectrograph at the MPG/ESO 2.2 m telescope. Photospheric parameters were derived from ~130 Fe I and Fe II transitions. Abundance ratios were obtained from line-to-line spectrum synthesis calculations on clean selected features. Results: An intermediate metallicity of [Fe/H] = -0.98 ± 0.08 dex and a heliocentric radial velocity of v_hel = -96.6 ± 1.3 km s^-1 were found for NGC 6723. Alpha-element abundances present enhancements of [O/Fe] = 0.29 ± 0.18 dex, [Mg/Fe] = 0.23 ± 0.10 dex, [Si/Fe] = 0.36 ± 0.05 dex, and [Ca/Fe] = 0.30 ± 0.07 dex. Similar overabundance is found for the iron-peak Ti with [Ti/Fe] = 0.24 ± 0.09 dex. Odd-Z elements Na and Al present abundances of [Na/Fe] = 0.00 ± 0.21 dex and [Al/Fe] = 0.31 ± 0.21 dex, respectively. Finally, the s-element Ba is also enhanced by [Ba/Fe] = 0.22 ± 0.21 dex. Conclusions: The enhancement levels of NGC 6723 are comparable to those of other metal-intermediate bulge globular clusters. In turn, these enhancement levels are compatible with the abundance profiles displayed by bulge field stars at that metallicity. This hints at a possible similar chemical evolution with globular clusters and the metal-poor of the bulge going through an early prompt chemical enrichment.

  10. Infrared analysis of clustering in the II-VI-VI compound CdSe(x)Te(1-x)

    NASA Astrophysics Data System (ADS)

    Perkowitz, S.; Kim, L. S.; Becla, P.

    1991-03-01

    Infrared reflectivity spectra at 82 K for Bridgman-grown CdSe(x)Te(1-x) crystals (x=0.05-0.35) show the two expected transverse-optical phonon modes and an unexpected third mode. Analysis of the data, using the cluster model of Verleur and Barker, shows that these spectra represent substantial nonrandom clustering of the anions around the cations. The magnitude and x dependence of the clustering is similar to that seen in the related compound CdSe(x)S(1-x) grown at the same temperature, although by a different growth method.

  11. CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters.

    PubMed

    Weber, T; Rausch, C; Lopez, P; Hoof, I; Gaykova, V; Huson, D H; Wohlleben, W

    2009-03-10

    Bacterial secondary metabolites are an important source of antimicrobial and cytostatic drugs. These molecules are often synthesized in a stepwise fashion by multimodular megaenzymes that are encoded in clusters of genes encoding enzymes for precursor supply and modification. In this work, we present an open source software pipeline, CLUSEAN (CLUster SEquence ANalyzer) that helps to annotate and analyze such gene clusters. CLUSEAN integrates standard analysis tools, like BLAST and HMMer, with specific tools for the identification of the functional domains and motifs in nonribosomal peptide synthetases (NRPS)/type I polyketide synthases (PKS) and the prediction of specificities of NRPS.

  12. The contribution of cluster and discriminant analysis to the classification of complex aquifer systems.

    PubMed

    Panagopoulos, G P; Angelopoulou, D; Tzirtzilakis, E E; Giannoulopoulos, P

    2016-10-01

    This paper presents an innovative method for the discrimination of groundwater samples into common groups representing the hydrogeological units from which they have been pumped. This method proved very efficient even in areas with complex hydrogeological regimes. The proposed method requires chemical analyses of water samples only for major ions, meaning that it is applicable in most cases worldwide. Another benefit of the method is that it gives further insight into the aquifer hydrogeochemistry, as it identifies the ions that are responsible for the discrimination of the groups. The procedure begins with cluster analysis of the dataset in order to classify the samples into the corresponding hydrogeological units. The feasibility of the method is shown by the fact that the samples of volcanic origin were separated into two different clusters, namely the lava units and the pyroclastic-ignimbritic aquifer. The second step is the discriminant analysis of the data, which provides the functions that distinguish the groups from each other and the most significant variables that define the hydrochemical composition of the aquifer. The whole procedure was highly successful, as 94.7% of the samples were classified to the correct aquifer system. Finally, the resulting functions can be safely used to categorize samples of either unknown or doubtful origin, thus improving the quality and the size of existing hydrochemical databases.

  13. Cluster analysis for description and characterization of boundary-layer eddies

    NASA Astrophysics Data System (ADS)

    Lennartz-Sassinek, Sabine; Liu, Shaofeng; Shao, Yaping

    2014-05-01

    Boundary-layer eddies play an important role in atmosphere and land-surface interactions. It is already known that near the surface many small eddies exist, which merge to form large eddies in the upper layers of the atmospheric boundary layer. However, it is not yet fully understood how these eddies organize themselves and how the organization can be best described. In this study, we present a new cluster analysis for description and characterization of boundary-layer eddies. Using this method, we group physical parameters in each air layer according to certain rules to connected clusters, each of which represents an eddy. The method is applied to the simulations of a coupled large-eddy atmosphere and land-surface model (LES-ALM). We present how the eddies are determined based on the different physical parameters for different air layers, the size distribution of the eddies within an air layer and how the results depend on the chosen physical parameter such as temperature, vertical temperature flux, moisture, vertical moisture flux and vertical wind. This analysis will help to better understand how information is transported through the different air layers and thus the role eddies play in the land-surface atmosphere interactions.
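
    The abstract does not state the grouping rules, so the Python sketch below assumes one simple possibility for illustration: within a single horizontal layer, grid cells whose value exceeds a threshold are grouped into 4-connected clusters, and each cluster is treated as one eddy. The threshold rule, the function name, and the toy field are all assumptions, and the periodic boundaries that are typical of large-eddy simulations are not handled.

```python
from collections import deque

def label_clusters(field, threshold):
    """Label 4-connected groups of cells above threshold; returns (labels, sizes)."""
    rows, cols = len(field), len(field[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not part of any cluster"
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if field[r][c] <= threshold or labels[r][c]:
                continue
            sizes.append(0)
            labels[r][c] = len(sizes)  # cluster ids start at 1
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one cluster
                i, j = queue.popleft()
                sizes[-1] += 1
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and field[ni][nj] > threshold and not labels[ni][nj]):
                        labels[ni][nj] = len(sizes)
                        queue.append((ni, nj))
    return labels, sizes

# Toy vertical-wind field for one layer (illustration only).
w = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.5],
    [0.0, 0.4, 0.0, 0.0],
]
labels, sizes = label_clusters(w, threshold=0.3)
print(sizes)
```

    The list of cluster sizes is the raw material for the size distribution of eddies within a layer that the abstract mentions. Running the same labelling on temperature, moisture, or flux fields would show how the result depends on the chosen physical parameter.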

  14. Analysis of the gene cluster encoding toluene/o-xylene monooxygenase from Pseudomonas stutzeri OX1

    SciTech Connect

    Bertoni, G.; Martino, M.; Galli, E.; Barbieri, P.

    1998-10-01

    The toluene/o-xylene monooxygenase cloned from Pseudomonas stutzeri OX1 displays a very broad range of substrates and a very peculiar regioselectivity, because it is able to hydroxylate more than one position on the aromatic ring of several hydrocarbons and phenols. The nucleotide sequence of the gene cluster coding for this enzymatic system has been determined. The sequence analysis revealed the presence of six open reading frames (ORFs) homologous to other genes clustered in operons coding for multicomponent monooxygenases found in benzene- and toluene-degradative pathways cloned from Pseudomonas strains. Significant similarities were also found with multicomponent monooxygenase systems for phenol, methane, alkene, and dimethyl sulfide cloned from different bacterial strains. The knockout of each ORF and complementation with the wild-type allele indicated that all six ORFs are essential for the full activity of the toluene/o-xylene monooxygenase in Escherichia coli. This analysis also shows that despite its activity on both hydrocarbons and phenols, toluene/o-xylene monooxygenase belongs to a toluene multicomponent monooxygenase subfamily rather than to the monooxygenases active on phenols.

  15. Joint Analysis of Galaxy-Galaxy Lensing and Galaxy Clustering: Methodology and Forecasts for DES

    SciTech Connect

    Park, Y.

    2015-07-19

    The joint analysis of galaxy-galaxy lensing and galaxy clustering is a promising method for inferring the growth function of large scale structure. Our analysis will be carried out on data from the Dark Energy Survey (DES), with its measurements of both the distribution of galaxies and the tangential shears of background galaxies induced by these foreground lenses. We develop a practical approach to modeling the assumptions and systematic effects affecting small scale lensing, which provides halo masses, and large scale galaxy clustering. Introducing parameters that characterize the halo occupation distribution (HOD), photometric redshift uncertainties, and shear measurement errors, we study how external priors on different subsets of these parameters affect our growth constraints. Degeneracies within the HOD model, as well as between the HOD and the growth function, are identified as the dominant source of complication, with other systematic effects sub-dominant. The impact of HOD parameters and their degeneracies necessitate the detailed joint modeling of the galaxy sample that we employ. Finally, we conclude that DES data will provide powerful constraints on the evolution of structure growth in the universe, conservatively/optimistically constraining the growth function to 7.9%/4.8% with its first-year data that covered over 1000 square degrees, and to 3.9%/2.3% with its full five-year data that will survey 5000 square degrees, including both statistical and systematic uncertainties.

  16. Eating or Meeting? Cluster Analysis Reveals Intricacies of White Shark (Carcharodon carcharias) Migration and Offshore Behavior

    PubMed Central

    Jorgensen, Salvador J.; Arnoldi, Natalie S.; Estess, Ethan E.; Chapple, Taylor K.; Rückert, Martin; Anderson, Scot D.; Block, Barbara A.

    2012-01-01

    Elucidating how mobile ocean predators utilize the pelagic environment is vital to understanding the dynamics of oceanic species and ecosystems. Pop-up archival transmitting (PAT) tags have emerged as an important tool to describe animal migrations in oceanic environments where direct observation is not feasible. Available PAT tag data, however, are for the most part limited to geographic position, swimming depth and environmental temperature, making effective behavioral observation challenging. However, novel analysis approaches have the potential to extend the interpretive power of these limited observations. Here we developed an approach based on clustering analysis of PAT daily time-at-depth histogram records to distinguish behavioral modes in white sharks (Carcharodon carcharias). We found four dominant and distinctive behavioral clusters matching previously described behavioral patterns, including two distinctive offshore diving modes. Once validated, we mapped behavior mode occurrence in space and time. Our results demonstrate spatial, temporal and sex-based structure in the diving behavior of white sharks in the northeastern Pacific previously unrecognized including behavioral and migratory patterns resembling those of species with lek mating systems. We discuss our findings, in combination with available life history and environmental data, and propose specific testable hypotheses to distinguish between mating and foraging in northeastern Pacific white sharks that can provide a framework for future work. Our methodology can be applied to similar datasets from other species to further define behaviors during unobservable phases. PMID:23144707

  17. Dietary Patterns Derived by Cluster Analysis are Associated with Cognitive Function among Korean Older Adults.

    PubMed

    Kim, Jihye; Yu, Areum; Choi, Bo Youl; Nam, Jung Hyun; Kim, Mi Kyung; Oh, Dong Hoon; Yang, Yoon Jung

    2015-05-29

    The objective of this study was to investigate major dietary patterns among older Korean adults through cluster analysis and to determine an association between dietary patterns and cognitive function. This is a cross-sectional study. Data from the Korean Multi-Rural Communities Cohort Study were used. The sample comprised 765 participants aged 60 years and over. A quantitative food frequency questionnaire with 106 items was used to investigate dietary intake. The MMSE-KC (Mini-Mental Status Examination-Korean version) was used to assess cognitive function. Two major dietary patterns were identified using K-means cluster analysis. The "MFDF" dietary pattern indicated high consumption of Multigrain rice, Fish, Dairy products, Fruits and fruit juices, while the "WNC" dietary pattern referred to higher intakes of White rice, Noodles, and Coffee. Means of the total MMSE-KC and orientation score of the participants in the MFDF dietary pattern were higher than those of the WNC dietary pattern. Compared with the WNC dietary pattern, the MFDF dietary pattern showed a lower risk of cognitive impairment after adjusting for covariates (OR 0.64, 95% CI 0.44-0.94). The MFDF dietary pattern, with high consumption of multigrain rice, fish, dairy products, and fruits, may be related to better cognition among Korean older adults.

  18. Enhancing the diversity of a corporate database using chemical database clustering and analysis

    NASA Astrophysics Data System (ADS)

    Shemetulskis, Norah E.; Dunbar, James B., Jr.; Dunbar, Bonnie W.; Moreland, David W.; Humblet, Christine

    1995-10-01

    The contribution that the Chemical Abstracts structural database (CAST-3D) and the Maybridge database (MAY) would make to diversifying the structural information and property space spanned by our corporate database (CBI) is assessed. A subset of the CAST-3D database has been selected to augment the structural diversity of various electronic databases used in computer-assisted drug design projects. The analysis of the MAY database directly offers the potential to expand the CBI compound library, but also provides a source for structural diversity in a format suitable for computer-assisted database searching and molecular design. The analysis performed is twofold. First, a nonhierarchical clustering technique available in the Daylight clustering package is applied to evaluate the structural differences between databases. The comparison is then extended to analyze various structure-derived property spaces calculated from molecular descriptors such as the logarithm of the octanol-water partition coefficient (CLOGP), the molar refractivity (CMR) and the electronic dipole moment (CDM). The diversity contribution of each database to these property spaces is quantified in relation to our corporate database.

  19. Dietary Patterns Derived by Cluster Analysis are Associated with Cognitive Function among Korean Older Adults

    PubMed Central

    Kim, Jihye; Yu, Areum; Choi, Bo Youl; Nam, Jung Hyun; Kim, Mi Kyung; Oh, Dong Hoon; Yang, Yoon Jung

    2015-01-01

    The objective of this study was to investigate major dietary patterns among older Korean adults through cluster analysis and to determine an association between dietary patterns and cognitive function. This is a cross-sectional study. Data from the Korean Multi-Rural Communities Cohort Study were used. The sample comprised 765 participants aged 60 years and over. A quantitative food frequency questionnaire with 106 items was used to investigate dietary intake. The MMSE-KC (Mini-Mental Status Examination–Korean version) was used to assess cognitive function. Two major dietary patterns were identified using K-means cluster analysis. The “MFDF” dietary pattern indicated high consumption of Multigrain rice, Fish, Dairy products, Fruits and fruit juices, while the “WNC” dietary pattern referred to higher intakes of White rice, Noodles, and Coffee. Means of the total MMSE-KC and orientation score of the participants in the MFDF dietary pattern were higher than those of the WNC dietary pattern. Compared with the WNC dietary pattern, the MFDF dietary pattern showed a lower risk of cognitive impairment after adjusting for covariates (OR 0.64, 95% CI 0.44–0.94). The MFDF dietary pattern, with high consumption of multigrain rice, fish, dairy products, and fruits, may be related to better cognition among Korean older adults. PMID:26035243

  20. Spectral clustering applied for dynamic contrast-enhanced MR analysis of time-intensity curves.

    PubMed

    Tartare, Guillaume; Hamad, Denis; Azahaf, Mustapha; Puech, Philippe; Betrouni, Nacim

    2014-12-01

    Dynamic contrast-enhanced (DCE)-magnetic resonance imaging (MRI) represents an emerging method for the prediction of biomarker responses in cancer. However, DCE images remain difficult to analyze and interpret. Although pharmacokinetic approaches, which involve multi-step processes, can provide a general framework for the interpretation of these data, they are still too complex for robust and accurate implementation. Therefore, statistical data analysis techniques were recently suggested as another valid interpretation strategy for DCE-MRI. In this context, we propose a spectral clustering approach for the analysis of DCE-MRI time-intensity signals. This graph theory-based method allows for the grouping of signals after spatial transformation. Subsequently, these data clusters can be labeled following comparison to arterial signals. Here, we have performed experiments with simulated (i.e., generated via pharmacokinetic modeling) and clinical (i.e., obtained from patients scanned during prostate cancer diagnosis) data sets in order to demonstrate the feasibility and applicability of this kind of unsupervised and non-parametric approach.

  1. THE CLUSTER AGES EXPERIMENT (CASE). VII. ANALYSIS OF TWO ECLIPSING BINARIES IN THE GLOBULAR CLUSTER NGC 6362

    SciTech Connect

    Kaluzny, J.; Rozyczka, M.; Schwarzenberg-Czerny, A.; Mazur, B.; Thompson, I. B.; Dotter, A.; Burley, G. S.; Rucinski, S. M.

    2015-11-15

    We use photometric and spectroscopic observations of the detached eclipsing binaries V40 and V41 in the globular cluster NGC 6362 to derive masses, radii, and luminosities of the component stars. The orbital periods of these systems are 5.30 and 17.89 days, respectively. The measured masses of the primary and secondary components (M_p, M_s) are (0.8337 ± 0.0063, 0.7947 ± 0.0048) M_⊙ for V40 and (0.8215 ± 0.0058, 0.7280 ± 0.0047) M_⊙ for V41. The measured radii (R_p, R_s) are (1.3253 ± 0.0075, 0.997 ± 0.013) R_⊙ for V40 and (1.0739 ± 0.0048, 0.7307 ± 0.0046) R_⊙ for V41. Based on the derived luminosities, we find that the distance modulus of the cluster is 14.74 ± 0.04 mag, in good agreement with 14.72 mag obtained from color–magnitude diagram (CMD) fitting. We compare the absolute parameters of component stars with theoretical isochrones in mass–radius and mass–luminosity diagrams. For assumed abundances [Fe/H] = −1.07, [α/Fe] = 0.4, and Y = 0.25 we find the most probable age of V40 to be 11.7 ± 0.2 Gyr, compatible with the age of the cluster derived from CMD fitting (12.5 ± 0.5 Gyr). V41 seems to be markedly younger than V40. If independently confirmed, this result will suggest that V41 belongs to the younger of the two stellar populations recently discovered in NGC 6362. The orbits of both systems are eccentric. Given the orbital period and age of V40, its orbit should have been tidally circularized some ∼7 Gyr ago. The observed eccentricity is most likely the result of a relatively recent close stellar encounter.

  2. An application of cluster analysis for determining homogeneous subregions: The agroclimatological point of view. [Rio Grande do Sul, Brazil]

    NASA Technical Reports Server (NTRS)

    Parada, N. D. J. (Principal Investigator); Cappelletti, C. A.

    1982-01-01

    A stratification oriented to crop area and yield estimation problems was performed using an algorithm of clustering. The variables used were a set of agroclimatological characteristics measured in each one of the 232 municipalities of the State of Rio Grande do Sul, Brazil. A nonhierarchical cluster analysis was used and the pseudo F-statistics criterion was implemented for determining the "cut point" in the number of strata.
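
    As an illustration of how a pseudo F criterion can pick the number of strata, the sketch below assumes that "pseudo F" means the Calinski-Harabasz ratio of between-cluster to within-cluster dispersion; the abstract does not give the formula. The function name, the toy points, and the labelings are hypothetical and are not taken from the record.

```python
def pseudo_f(points, labels):
    """Pseudo F (assumed Calinski-Harabasz form): (B / (k - 1)) / (W / (n - k)).

    points: list of equal-length numeric tuples
    labels: cluster label for each point
    """
    n = len(points)
    dims = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("pseudo F needs 2 <= k < n")

    grand = [sum(p[d] for p in points) / n for d in range(dims)]
    between = 0.0  # B: size-weighted squared distance of centroids to the grand mean
    within = 0.0   # W: squared distance of points to their own centroid
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[d] for p in members) / size for d in range(dims)]
        between += size * sum((c - g) ** 2 for c, g in zip(centroid, grand))
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_strata = [0, 0, 0, 1, 1, 1]
three_strata = [0, 0, 1, 2, 2, 2]
f_two = pseudo_f(toy_points, two_strata)
f_three = pseudo_f(toy_points, three_strata)
```

    Under this assumed definition, the candidate number of strata with the largest pseudo F would be taken as the "cut point"; on the toy data the two-stratum labeling scores higher than the three-stratum one.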

  3. Dengue fever occurrence and vector detection by larval survey, ovitrap and MosquiTRAP: a space-time clusters analysis.

    PubMed

    de Melo, Diogo Portella Ornelas; Scherrer, Luciano Rios; Eiras, Álvaro Eduardo

    2012-01-01

    The use of vector surveillance tools for preventing dengue disease requires fine assessment of risk, in order to improve vector control activities. Nevertheless, the thresholds between vector detection and dengue fever occurrence are currently not well established. In Belo Horizonte (Minas Gerais, Brazil), dengue has been endemic for several years. From January 2007 to June 2008, the dengue vector Aedes (Stegomyia) aegypti was monitored by ovitrap, the sticky-trap MosquiTRAP™ and larval surveys in a study area in Belo Horizonte. Using a space-time scan for cluster detection implemented in SaTScan software, the vector presence recorded by the different monitoring methods was evaluated. Clusters of vectors and dengue fever were detected. It was verified that ovitrap and MosquiTRAP vector detection methods predicted dengue occurrence better than larval survey, both spatially and temporally. MosquiTRAP and ovitrap presented similar results of space-time intersections to dengue fever clusters. Nevertheless, ovitrap clusters presented longer duration periods than MosquiTRAP ones, less accurately signaling the dengue risk areas, since the detection of vector clusters during most of the study period was not necessarily correlated to dengue fever occurrence. It was verified that ovitrap clusters occurred more than 200 days (values ranged from 97.0±35.35 to 283.0±168.4 days) before dengue fever clusters, whereas MosquiTRAP clusters preceded dengue fever clusters by approximately 80 days (values ranged from 65.5±58.7 to 94.0±14.3 days), the latter proving to be more temporally precise. Thus, in the present cluster analysis study MosquiTRAP presented superior results for signaling dengue transmission risks both geographically and temporally. Since early detection is crucial for planning and deploying effective preventions, MosquiTRAP proved to be a reliable tool and this method provides groundwork for the development of even more precise tools.

  4. Dengue Fever Occurrence and Vector Detection by Larval Survey, Ovitrap and MosquiTRAP: A Space-Time Clusters Analysis

    PubMed Central

    de Melo, Diogo Portella Ornelas; Scherrer, Luciano Rios; Eiras, Álvaro Eduardo

    2012-01-01

    The use of vector surveillance tools for preventing dengue disease requires fine assessment of risk, in order to improve vector control activities. Nevertheless, the thresholds between vector detection and dengue fever occurrence are currently not well established. In Belo Horizonte (Minas Gerais, Brazil), dengue has been endemic for several years. From January 2007 to June 2008, the dengue vector Aedes (Stegomyia) aegypti was monitored by ovitrap, the sticky-trap MosquiTRAP™ and larval surveys in a study area in Belo Horizonte. Using a space-time scan for cluster detection implemented in SaTScan software, the vector presence recorded by the different monitoring methods was evaluated. Clusters of vectors and dengue fever were detected. It was verified that ovitrap and MosquiTRAP vector detection methods predicted dengue occurrence better than larval survey, both spatially and temporally. MosquiTRAP and ovitrap presented similar results of space-time intersections to dengue fever clusters. Nevertheless, ovitrap clusters presented longer duration periods than MosquiTRAP ones, less accurately signaling the dengue risk areas, since the detection of vector clusters during most of the study period was not necessarily correlated to dengue fever occurrence. It was verified that ovitrap clusters occurred more than 200 days (values ranged from 97.0±35.35 to 283.0±168.4 days) before dengue fever clusters, whereas MosquiTRAP clusters preceded dengue fever clusters by approximately 80 days (values ranged from 65.5±58.7 to 94.0±14.3 days), the latter proving to be more temporally precise. Thus, in the present cluster analysis study MosquiTRAP presented superior results for signaling dengue transmission risks both geographically and temporally. Since early detection is crucial for planning and deploying effective preventions, MosquiTRAP proved to be a reliable tool and this method provides groundwork for the development of even more precise tools. PMID:22848729

  5. Morphometry and Cluster Analysis of Low Shield Volcanoes on Earth and Mars

    NASA Astrophysics Data System (ADS)

    Henderson, A.; Christiansen, E. H.; Radebaugh, J.

    2015-12-01

    Volcanoes are common on all terrestrial planets and their morphology is influenced by eruption mechanisms, volumes, and compositions and temperatures of the magmas; these are in turn influenced by the tectonic setting. In an attempt to better understand the relationship between morphometry and volcanic processes, we compared low-shield volcanoes on Syria Planum (SP), Mars, with basaltic shields of the eastern Snake River Plain (eSRP). We used 133 volcanoes on Syria Planum that are covered by MOLA and HRSC elevation data and 246 eSRP shields covered by the NED. Shields on Syria Planum average 191 ± 88 m tall, 12 ± 6 km in diameter, 16 ± 28 km³ in volume, and have flank slopes of 1.7° ± 0.8°. eSRP shields average 83 ± 44 m tall, 4 ± 3 km in diameter, 0.8 ± 2 km³ in volume, and have flank slopes of 2.5° ± 1°. Bivariate plots of morphometric characteristics show that Syria Planum and eSRP low shields form the extremes of the same morphospace, which is shared with some Icelandic olivine tholeiite shields but is generally distinct from other terrestrial volcanoes. Cluster analysis of SP and eSRP shields together with other terrestrial volcanoes places the SP and eSRP shields in one cluster, and most of them in a single sub-cluster, distinct from other terrestrial volcanoes. Principal component and cluster analysis of Syria Planum and eSRP shields using height, area, volume, slope, and eccentricity shows that Syria Planum and eSRP low shields are similar in shape (slope and eccentricity). Apparently, these low shields formed by similar processes involving Hawaiian-type eruptions of low-viscosity (mafic) lavas with fissure-controlled eruptions, narrowing to central vents. Initially high eruption rates and long, tube-fed lava flows shifted to the development of small lava lakes that repeatedly overflowed, and on some shields to late fountaining that formed steeper spatter ramparts. However, Syria Planum shields are systematically larger than those on the eastern Snake River Plain.
The

  6. [Analysis on principle of treatment of cough of Yan Zhenghua based on Apriori and clustering algorithm].

    PubMed

    Wu, Jia-Rui; Guo, Wei-Xian; Zhang, Xiao-Meng; Yang, Bing; Zhang, Bing

    2014-02-01

    Using the data mining methods of association rules and clustering, 188 prescriptions for cough formulated by Yan Zhenghua were collected and analyzed to determine the frequency of drug usage and the relationships between drugs, and from these to summarize Yan Zhenghua's experience in treating cough. The analysis extracted 20 core combinations, such as Bambusae Caulis in Taenias-Almond-Sactmarsh Aster, and identified 10 new prescriptions, such as Sactmarsh Aster-Scutellariae Radix-Album Viscum-Bambusae Caulis in Taenian-Eriobotryae Folium. The results indicate that Yan Zhenghua treated cough chiefly with traditional Chinese medicines that dispel wind and heat from the body and clear heat from the lung to relieve cough.

  7. Potential emission flux to aerosol pollutants over Bengal Gangetic plain through combined trajectory clustering and aerosol source fields analysis

    NASA Astrophysics Data System (ADS)

    Kumar, D. Bharath; Verma, S.

    2016-09-01

    A hybrid source-receptor analysis was carried out to evaluate the potential emission flux to winter monsoon (WinMon) aerosols over Bengal Gangetic plain urban (Kolkata, Kol) and semi-urban (Kharagpur, Kgp) atmospheres. This was done through application of fuzzy c-means clustering to back-trajectory data combined with emission flux and residence time weighted aerosols analysis. WinMon mean aerosol optical depth (AOD) and Ångström exponent (AE) at Kol (AOD: 0.77; AE: 1.17) were respectively slightly higher than and nearly equal to those at Kgp (AOD: 0.71; AE: 1.18). Out of six source region clusters over the Indian subcontinent and two over the Indian oceanic region, the cluster mean AOD was the highest when associated with the mean path of air mass originating from the Bay of Bengal and the Arabian Sea clusters at Kol and that from the Indo-Gangetic plain (IGP) cluster at Kgp. Spatial distribution of weighted AOD fields showed the highest potential source of aerosols over the IGP, primarily over upper IGP (e.g. Punjab, Haryana), lower IGP (e.g. Uttar Pradesh) and eastern region (e.g. West Bengal, Bihar, northeast India) clusters. The emission flux contribution potential (EFCP) of fossil fuel (FF) emissions at surface (SL) of Kol/Kgp, elevated layer (EL) of Kol, and of biomass burning (BB) emissions at SL of Kol were primarily from upper, lower, upper/lower IGP clusters respectively. The EFCP of FF/BB emissions at Kgp-EL/SL, and that of BB at EL of Kol/Kgp were mainly from eastern region and Africa (AFR) clusters respectively. Though the AFR cluster had a significantly high emission flux source potential for dust emissions, the EFCP of dust from northwest India (NWI) was comparable to that from AFR at Kol SL/EL.
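
    For readers unfamiliar with fuzzy c-means, the sketch below shows the standard membership update for a single point given fixed cluster centres. It is a generic textbook form, not the authors' implementation: the fuzzifier m = 2, the Euclidean distance, and the toy 2-D coordinates are all assumptions, since the abstract does not state them.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i the Euclidean
    distance from the point to centre i. The fuzzifier m must exceed 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical toy example: a point one unit from the first centre
# and two units from the second.
toy_memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

    Unlike hard clustering, each back trajectory would receive a graded membership in every source-region cluster, and the memberships of one point sum to one.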

  8. Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

    PubMed

    Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

    2017-02-15

    The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with the C4.5 decision tree to help us interpret the clustering results. We found that two clusters best described the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures took place. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples.

  9. Multi-frequency analysis of neutralino dark matter annihilations in the Coma cluster

    NASA Astrophysics Data System (ADS)

    Colafrancesco, S.; Profumo, S.; Ullio, P.

    2006-08-01

    We study the astrophysical implications of neutralino dark matter annihilations in galaxy clusters, with a specific application to the Coma cluster. We first address the determination of the dark halo models for Coma, starting from structure formation models and observational data, and we discuss in detail the role of sub-halos. We then perform a thorough analysis of the transport and diffusion properties of neutralino annihilation products, and investigate the resulting multi-frequency signals, from radio to gamma-ray frequencies. We also study other relevant astrophysical effects of neutralino annihilations, like the DM-induced Sunyaev-Zel'dovich effect and the intracluster gas heating. As for the particle physics setup, we adopt a two-fold approach, resorting both to model-independent bottom-up scenarios and to benchmark, GUT-motivated frameworks. We show that the Coma radio-halo data (the spectrum and the surface brightness) can be nicely fitted by the neutralino-induced signal for peculiar particle physics models and for magnetic field values, which we outline in detail. Fitting the radio data and moving to higher frequencies, we find that the multi-frequency spectral energy distributions are typically dim at EUV and X-ray frequencies (with respect to the data), but show a non-negligible gamma-ray emission, depending on the amplitude of the Coma magnetic field. A simultaneous fit to the radio, EUV and HXR data is not possible without violating the gamma-ray EGRET upper limit. The best-fit particle physics models yield substantial heating of the intracluster gas, but not sufficient energy injection to explain the quenching of cooling flows in the innermost region of clusters. Due to the specific multi-frequency features of the DM-induced spectral energy distribution in Coma, we find that supersymmetric models can be significantly and optimally constrained either in the gamma-rays or at radio and microwave frequencies.

  10. pong: fast analysis and visualization of latent clusters in population genetic data

    PubMed Central

    Behr, Aaron A.; Liu, Katherine Z.; Liu-Fang, Gracie; Nakka, Priyanka; Ramachandran, Sohini

    2016-01-01

    Motivation: A series of methods in population genetics use multilocus genotype data to assign individuals membership in latent clusters. These methods belong to a broad class of mixed-membership models, such as latent Dirichlet allocation used to analyze text corpora. Inference from mixed-membership models can produce different output matrices when repeatedly applied to the same inputs, and the number of latent clusters is a parameter that is often varied in the analysis pipeline. For these reasons, quantifying, visualizing, and annotating the output from mixed-membership models are bottlenecks for investigators across multiple disciplines from ecology to text data mining. Results: We introduce pong, a network-graphical approach for analyzing and visualizing membership in latent clusters with a native interactive D3.js visualization. pong leverages efficient algorithms for solving the Assignment Problem to dramatically reduce runtime while increasing accuracy compared with other methods that process output from mixed-membership models. We apply pong to 225 705 unlinked genome-wide single-nucleotide variants from 2426 unrelated individuals in the 1000 Genomes Project, and identify previously overlooked aspects of global human population structure. We show that pong outpaces current solutions by more than an order of magnitude in runtime while providing a customizable and interactive visualization of population structure that is more accurate than those produced by current tools. Availability and Implementation: pong is freely available and can be installed using the Python package management system pip. pong’s source code is available at https://github.com/abehr/pong. Contact: aaron_behr@alumni.brown.edu or sramachandran@brown.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:27283948
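
    The label-matching step mentioned in the abstract can be illustrated with a small sketch. This is not pong's code or API: it aligns the cluster columns of two hypothetical membership matrices by brute-force search over permutations, which is only practical for a handful of clusters, whereas the abstract states that pong relies on efficient Assignment Problem algorithms. The function name and the toy matrices are invented for illustration.

```python
from itertools import permutations


def align_clusters(q_a, q_b):
    """Match cluster columns of q_b to those of q_a.

    q_a, q_b: membership matrices (rows = individuals, columns = clusters)
    from two runs on the same individuals. Returns a tuple perm where
    perm[j] is the column of q_b matched to column j of q_a, chosen to
    minimise the summed squared difference in membership.
    Brute force over all k! permutations: an illustrative stand-in for a
    proper assignment solver.
    """
    k = len(q_a[0])

    def cost(perm):
        return sum(
            (row_a[j] - row_b[perm[j]]) ** 2
            for row_a, row_b in zip(q_a, q_b)
            for j in range(k)
        )

    return min(permutations(range(k)), key=cost)


# Hypothetical runs: run_b is run_a with its cluster columns relabelled.
run_a = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
run_b = [[0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.8, 0.0]]
best_match = align_clusters(run_a, run_b)
```

    Once columns are matched, runs that differ only by label switching can be recognised as the same solution before they are visualised.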

  11. XMM-Newton analysis of a newly discovered, extremely X-ray luminous galaxy cluster at high redshift

    NASA Astrophysics Data System (ADS)

    Thoelken, S.; Schrabback, T.

    2016-06-01

    Galaxy clusters, the largest virialized structures in the universe, provide an excellent method to test cosmology on large scales. The galaxy cluster mass function as a function of redshift is a key tool to determine the fundamental cosmological parameters and especially measurements at high redshifts can e.g. provide constraints on dark energy. The fgas test as a direct cosmological probe is of special importance. Therefore, relaxed galaxy clusters at high redshifts are needed but these objects are considered to be extremely rare in current structure formation models. Here we present first results from an XMM-Newton analysis of an extremely X-ray luminous, newly discovered and potentially cool core cluster at a redshift of z=0.9. We carefully account for background emission and PSF effects and model the cluster emission in three radial bins. Our preliminary results suggest that this cluster is indeed a good candidate for a cool core cluster and thus potentially of extreme value for cosmology.

  12. Analyzing Patients' Values by Applying Cluster Analysis and LRFM Model in a Pediatric Dental Clinic in Taiwan

    PubMed Central

    Lin, Shih-Yen; Liu, Chih-Wei

    2014-01-01

    This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs. PMID:25045741
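
    The loyal/lost rule described above can be sketched as a comparison of each cluster's mean L, R, and F against the overall averages. The cluster names, the numbers, and the "other" label below are hypothetical placeholders, and the sketch assumes, as the abstract's wording suggests, that a higher value is better on all three variables.

```python
def label_clusters(cluster_means, overall_means):
    """Label clusters by comparing their (L, R, F) means with overall means.

    cluster_means: dict mapping cluster name -> (L, R, F) cluster averages
    overall_means: (L, R, F) averages over all patients
    "loyal": all three values above average; "lost": none above average;
    "other": anything in between (placeholder label, not from the study).
    """
    labels = {}
    for name, values in cluster_means.items():
        above = sum(v > avg for v, avg in zip(values, overall_means))
        if above == 3:
            labels[name] = "loyal"
        elif above == 0:
            labels[name] = "lost"
        else:
            labels[name] = "other"
    return labels


# Hypothetical cluster averages, invented for illustration only.
overall = (10.0, 5.0, 4.0)
clusters = {
    "c1": (15.0, 8.0, 6.0),
    "c2": (5.0, 2.0, 1.0),
    "c3": (12.0, 3.0, 5.0),
}
segment_labels = label_clusters(clusters, overall)
```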

  13. A cluster analysis of tic symptoms in children and adults with Tourette syndrome: clinical correlates and treatment outcome.

    PubMed

    McGuire, Joseph F; Nyirabahizi, Epiphanie; Kircanski, Katharina; Piacentini, John; Peterson, Alan L; Woods, Douglas W; Wilhelm, Sabine; Walkup, John T; Scahill, Lawrence

    2013-12-30

    Cluster analytic methods have examined the symptom presentation of chronic tic disorders (CTDs), with limited agreement across studies. The present study investigated patterns, clinical correlates, and treatment outcome of tic symptoms. 239 youth and adults with CTDs completed a battery of assessments at baseline to determine diagnoses, tic severity, and clinical characteristics. Participants were randomly assigned to receive either a comprehensive behavioral intervention for tics (CBIT) or psychoeducation and supportive therapy (PST). A cluster analysis was conducted on the baseline Yale Global Tic Severity Scale (YGTSS) symptom checklist to identify the constellations of tic symptoms. Four tic clusters were identified: Impulse Control and Complex Phonic Tics; Complex Motor Tics; Simple Head Motor/Vocal Tics; and Primarily Simple Motor Tics. Frequencies of tic symptoms showed few differences across youth and adults. Tic clusters had small associations with clinical characteristics and showed no associations with the presence of coexisting psychiatric conditions. Cluster membership scores did not predict treatment response to CBIT or tic severity reductions. Tic symptoms distinctly cluster with little difference across youth and adults, or coexisting conditions. This study, which is the first to examine tic clusters and response to treatment, suggested that tic symptom profiles respond equally well to CBIT. ClinicalTrials.gov identifiers: NCT00218777; NCT00231985.

  14. Analyzing patients' values by applying cluster analysis and LRFM model in a pediatric dental clinic in Taiwan.

    PubMed

    Wu, Hsin-Hung; Lin, Shih-Yen; Liu, Chih-Wei

    2014-01-01

    This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs.

  15. Task Analysis for Health Occupations. Cluster: Nursing. Occupation: Professional Nurse (Associate Degree). Education for Employment Task Lists.

    ERIC Educational Resources Information Center

    Lake County Area Vocational Center, Grayslake, IL.

    This document contains a task analysis for health occupations (professional nurse) in the nursing cluster. For each task listed, occupation, duty area, performance standard, steps, knowledge, attitudes, safety, equipment/supplies, source of analysis, and Illinois state goals for learning are listed. For the duty area of "providing therapeutic…

  16. THE CLUSTER AGES EXPERIMENT (CASE). V. ANALYSIS OF THREE ECLIPSING BINARIES IN THE GLOBULAR CLUSTER M4

    SciTech Connect

    Kaluzny, J.; Rozyczka, M.; Krzeminski, W.; Pych, W.; Thompson, I. B.; Burley, G. S.; Shectman, S. A.; Dotter, A.; Rucinski, S. M.

    2013-02-01

    We use photometric and spectroscopic observations of the eclipsing binaries V65, V66, and V69 in the field of the globular cluster M4 to derive masses, radii, and luminosities of their components. The orbital periods of these systems are 2.29, 8.11, and 48.19 days, respectively. The measured masses of the primary and secondary components (M_p and M_s) are 0.8035 ± 0.0086 and 0.6050 ± 0.0044 M⊙ for V65, 0.7842 ± 0.0045 and 0.7443 ± 0.0042 M⊙ for V66, and 0.7665 ± 0.0053 and 0.7278 ± 0.0048 M⊙ for V69. The measured radii (R_p and R_s) are 1.147 ± 0.010 and 0.6110 ± 0.0092 R⊙ for V65, 0.9347 ± 0.0048 and 0.8298 ± 0.0053 R⊙ for V66, and 0.8655 ± 0.0097 and 0.8074 ± 0.0080 R⊙ for V69. The orbits of V65 and V66 are circular, whereas that of V69 has an eccentricity of 0.38. Based on systemic velocities and relative proper motions, we show that all three systems are members of the cluster. We find that the distance to M4 is 1.82 ± 0.04 kpc, in good agreement with recent estimates based on entirely different methods. We compare the absolute parameters of V66 and V69 with two sets of theoretical isochrones in mass-radius and mass-luminosity diagrams, and for assumed [Fe/H] = -1.20, [α/Fe] = 0.4, and Y = 0.25 we find the most probable age of M4 to be between 11.2 and 11.3 Gyr. Color-magnitude diagram (CMD) fitting with the same parameters yields an age close to, or slightly in excess of, 12 Gyr. However, considering the sources of uncertainty involved in CMD fitting, these two methods of age determination are not discrepant. Age and distance determinations can be further improved when infrared eclipse photometry is obtained.

  17. Identifying and Tracking Individual Updraft Cores using Cluster Analysis: A TWP-ICE case study

    NASA Astrophysics Data System (ADS)

    Li, X.; Tao, W.; Collis, S. M.; Varble, A.

    2013-12-01

    Cumulus parameterizations in GCMs depend strongly on the vertical velocity structures of convective updraft cores, or plumes. There has not been an accurate way of identifying these cores. The majority of previous studies treat the updraft as a single grid column entity, thus missing many intrinsic characteristics, e.g., the size, strength and spatial orientation of an individual core, its life cycle, and the time variations of the entrainment/detrainment rates associated with its life cycle. In this study, we attempt to apply an innovative algorithm based on centroid-based k-means cluster analysis to improve our understanding of convection and its associated updraft cores. Both 3-D Doppler radar retrievals and cloud-resolving model (CRM) simulations of a TWP-ICE campaign case during the monsoon period will be used to test and improve this algorithm. This will allow more in-depth comparisons between CRM simulations and observations than were previously possible using the traditional piecewise analysis of each updraft column. The first step is to identify the strongest cores (maximum velocity >10 m/s), since they are well defined and produce definite answers when the cluster analysis algorithm is applied. The preliminary results show that the radar-retrieved updraft cores are smaller in size, with the maximum velocity located uniformly at higher levels, compared with model simulations. Overall, the model simulations produce much stronger cores than the radar retrievals. Within the model simulations, the bulk microphysical scheme simulation produces stronger cores than the spectral bin microphysical scheme. Planned research includes using high temporal-resolution simulations to further track the life cycle of individual updraft cores and study their characteristics.
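
    The first step described above (keeping only grid points with vertical velocity above 10 m/s and grouping them into cores with centroid-based k-means) might be sketched as below. This is an illustrative sketch only: the abstract does not give the authors' algorithm, so the grid values, the number of cores k, and the deterministic farthest-point initialisation are all assumptions.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on tuples of coordinates."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    groups = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

# Hypothetical grid points: ((x, y, z) index, vertical velocity in m/s).
grid = [
    ((0, 0, 5), 12.0), ((1, 0, 5), 11.0), ((0, 1, 6), 15.0),
    ((20, 20, 8), 13.0), ((21, 20, 8), 10.5), ((10, 10, 3), 4.0),
]

# Keep only strong updraft points (w > 10 m/s), then cluster them.
cores = [pos for pos, w in grid if w > 10.0]
centroids, groups = kmeans(cores, k=2)
```

    In practice k is not known in advance, so a real implementation would need some way to choose or adapt the number of cores.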

  18. CN and CH Abundance Analysis in a Sample of Eight Galactic Globular Clusters

    NASA Astrophysics Data System (ADS)

    Smolinski, Jason P.; Lee, Y.; Beers, T. C.; Martell, S. L.; An, D.; Sivarani, T.

    2011-01-01

    Galactic globular clusters exhibit star-to-star variations in their light element abundances that are not predicted by formation and evolution models involving single stellar generations. Recently it has been suggested that internal pollution from early supernovae and AGB winds may have played important roles in forming a second generation of enriched stars. We present updated results of a CN and CH abundance analysis of stars from the base to the tip of the red giant branch, and in some cases down onto the main sequence, for eight globular clusters with available photometric and spectroscopic data from SDSS-I and SDSS-II/SEGUE. These results include a discussion of the radial distribution of CN enrichment and how this may impact the current paradigm. Funding for SDSS-I and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is http://www.sdss.org/. This work was supported in part by grants PHY 02-16783 and PHY 08-22648: Physics Frontiers Center/Joint Institute for Nuclear Astrophysics (JINA), awarded by the U.S. National Science Foundation.

  19. In silico genomic analysis of the human and murine guanylate-binding protein (GBP) gene clusters.

    PubMed

    Olszewski, Maureen A; Gray, John; Vestal, Deborah J

    2006-05-01

    The guanylate-binding proteins (GBPs) were among the first interferon (IFN)-stimulated genes (ISGs) discovered, but until recently, little was known about their functions and even less about the composition of the gene family. Analysis of the promoter of human GBP-1 contributed significantly toward the understanding of Jak-Stat signaling and the delineation of the IFN-gamma activation site (GAS) and IFN-stimulated response element (ISRE) promoter elements. In this study, we have examined the genomic arrangement and composition of the GBPs in both mouse and humans. There are seven GBP paralogs in humans and at least one pseudogene, all of which are located in a cluster of genes on chromosome 1. Five of the six MuGBPs and a GBP pseudogene are clustered in a syntenic region on chromosome 3. The sixth MuGBP, MuGBP-4, and three GBP pseudogenes are located on chromosome 5. As might be expected, the GBPs share similar genomic organizations of introns and exons. Five of the MuGBPs had previously been shown to be coordinately induced by IFNs, and as expected, all of the MuGBPs have GAS and ISRE elements in their promoters. Interestingly, not all of the HuGBPs have GAS and ISRE elements, suggesting that not all GBPs are IFN responsive in humans.

  20. Modelling the signature of clustered airguns and analysis on the directivity of an airgun array

    NASA Astrophysics Data System (ADS)

    Li, Guofa; Cao, Mingqiang; Chen, Haolin; Ni, Chengzhou

    2011-03-01

    Based on the Giles-Johnston approximation and an analysis of the mutual interaction between two bubbles at different distances, a model is established to simulate the signature of clustered airguns in offshore seismic exploration, accounting for practical factors that affect bubble oscillation, such as gun-port throttling and heat conduction across the bubble wall. Using this model, the signatures of clustered airguns at different distances are calculated for both equal and unequal airgun volumes; the energy distribution of the airgun array signature in three-dimensional (3D) space is also analysed. The simulation results indicate that (1) the pressure wave emitted from bubble oscillation and the ghost wave reflected from the sea surface change the wave field around other bubbles, which makes them the primary source of the mutual interaction between bubbles; (2) as the distance decreases, the mutual interaction between bubbles becomes stronger, the bubble period increases, the primary peak amplitude decreases, and the primary/bubble peak amplitude ratio first increases and then decreases; (3) as the distance between bubbles is reduced, bubbles with unequal volumes oscillate with the same period, and the frequency-locking phenomenon occurs, accompanied by violent bubble oscillation; and (4) the energy of the pressure wave is strongest directly below the midpoint of the airgun array and relatively weak in other directions; this can be used for true amplitude recovery in seismic data processing.

  1. A Systematic Computational Analysis of Biosynthetic Gene Cluster Evolution: Lessons for Engineering Biosynthesis

    PubMed Central

    Sali, Andrej; Takano, Eriko; Fischbach, Michael A.

    2014-01-01

    Bacterial secondary metabolites are widely used as antibiotics, anticancer drugs, insecticides and food additives. Attempts to engineer their biosynthetic gene clusters (BGCs) to produce unnatural metabolites with improved properties are often frustrated by the unpredictability and complexity of the enzymes that synthesize these molecules, suggesting that genetic changes within BGCs are limited by specific constraints. Here, by performing a systematic computational analysis of BGC evolution, we derive evidence for three findings that shed light on the ways in which, despite these constraints, nature successfully invents new molecules: 1) BGCs for complex molecules often evolve through the successive merger of smaller sub-clusters, which function as independent evolutionary entities. 2) An important subset of polyketide synthases and nonribosomal peptide synthetases evolve by concerted evolution, which generates sets of sequence-homogenized domains that may hold promise for engineering efforts since they exhibit a high degree of functional interoperability. 3) Individual BGC families evolve in distinct ways, suggesting that design strategies should take into account family-specific functional constraints. These findings suggest novel strategies for using synthetic biology to rationally engineer biosynthetic pathways. PMID:25474254

  2. Outcome of patients with autoimmune diseases in the intensive care unit: a mixed cluster analysis

    PubMed Central

    Bernal-Macías, Santiago; Reyes-Beltrán, Benjamín; Molano-González, Nicolás; Augusto Vega, Daniel; Bichernall, Claudia; Díaz, Luis Aurelio; Rojas-Villarraga, Adriana; Anaya, Juan-Manuel

    2015-01-01

    Objectives Interest in autoimmune diseases (ADs) and their outcome at the intensive care unit (ICU) has increased due to the clinical challenge for diagnosis and management as well as for prognosis. The current work presents a-year experience on these topics in a tertiary hospital. Methods The mixed-cluster methodology based on multivariate descriptive methods such as principal component analysis and multiple correspondence analyses was performed to summarize sets of related variables with strong associations and common clinical context. Results Fifty adult patients with ADs with a mean age of 46.7±17.55 years were assessed. The two most common diagnoses were systemic lupus erythematosus and systemic sclerosis, registered in 45% and 20% of patients, respectively. The main causes of admission to ICU were infection and AD flare up, observed in 36% and 24%, respectively. Mortality during ICU stay was 24%. The length of hospital stay before ICU admission, shock, vasopressors, mechanical ventilation, abdominal sepsis, Glasgow score and plasmapheresis were all factors associated with mortality. Two new clinical cluster variables (NCVs) were defined: Time ICU and ICU Support Profile, which were associated with survivor and non-survivor variables. Conclusions Identification of single factors and groups of factors from NCVs will allow implementation of early and aggressive therapies in patients with ADs at the ICU in order to avoid fatal outcomes. PMID:26688741

  3. Cluster analysis of Pinus taiwanensis for its ex situ conservation in China.

    PubMed

    Gao, X; Shi, L; Wu, Z

    2015-06-01

    Pinus taiwanensis Hayata is one of the most famous sights in the Huangshan Scenic Resort, China, because of its strong adaptability and ability to survive; however, this endemic species is currently under threat in China. Relationships between different P. taiwanensis populations have been well-documented; however, few studies have been conducted on how to protect this rare pine. In the present study, we propose the ex situ conservation of this species using geographical information system (GIS) cluster and genetic diversity analyses. The GIS cluster method was conducted as a preliminary analysis for establishing a sampling site category based on climatic factors. Genetic diversity was analyzed using morphological and genetic traits. By combining geographical information with genetic data, we demonstrate that growing conditions, morphological traits, and the genetic make-up of the population in the Huangshan Scenic Resort were most similar to conditions on Tianmu Mountain. Therefore, we suggest that Tianmu Mountain is the best choice for the ex situ conservation of P. taiwanensis. Our results provide a molecular basis for the sustainable management, utilization, and conservation of this species in Huangshan Scenic Resort.

  4. Addressing preference heterogeneity in public health policy by combining Cluster Analysis and Multi-Criteria Decision Analysis: Proof of Method.

    PubMed

    Kaltoft, Mette Kjer; Turner, Robin; Cunich, Michelle; Salkeld, Glenn; Nielsen, Jesper Bo; Dowie, Jack

    2015-01-01

    The use of subgroups based on biological-clinical and socio-demographic variables to deal with population heterogeneity is well-established in public policy. The use of subgroups based on preferences is rare, except when religion based, and controversial. If it were decided to treat subgroup preferences as valid determinants of public policy, a transparent analytical procedure would be needed. In this proof of method study we show how public preferences could be incorporated into policy decisions in a way that respects both the multi-criterial nature of those decisions, and the heterogeneity of the population in relation to the importance assigned to relevant criteria. It involves combining Cluster Analysis (CA), to generate the subgroup sets of preferences, with Multi-Criteria Decision Analysis (MCDA), to provide the policy framework into which the clustered preferences are entered. We employ three techniques of CA to demonstrate that not only do different techniques produce different clusters, but that choosing among techniques (as well as developing the MCDA structure) is an important task to be undertaken in implementing the approach outlined in any specific policy context. Data for the illustrative, not substantive, application are from a Randomized Controlled Trial of online decision aids for Australian men aged 40-69 years considering Prostate-specific Antigen testing for prostate cancer. We show that such analyses can provide policy-makers with insights into the criterion-specific needs of different subgroups. Implementing CA and MCDA in combination to assist in the development of policies on important health and community issues such as drug coverage, reimbursement, and screening programs poses major challenges (conceptual, methodological, ethical-political, and practical), but most are exposed by the techniques, not created by them.

  5. Proteomic and bioinformatic analysis of epithelial tight junction reveals an unexpected cluster of synaptic molecules

    PubMed Central

    Tang, Vivian W

    2006-01-01

    Background Zonula occludens, also known as the tight junction, is a specialized cell-cell interaction characterized by membrane "kisses" between epithelial cells. A cytoplasmic plaque of ~100 nm corresponding to a meshwork of densely packed proteins underlies the tight junction membrane domain. Due to its enormous size and difficulties in obtaining a biochemically pure fraction, the molecular composition of the tight junction remains largely unknown. Results A novel biochemical purification protocol has been developed to isolate tight junction protein complexes from cultured human epithelial cells. After identification of proteins by mass spectroscopy and fingerprint analysis, candidate proteins are scored and assessed individually. A simple algorithm has been devised to incorporate transmembrane domains and protein modification sites for scoring membrane proteins. Using this new scoring system, a total of 912 proteins have been identified. These 912 hits are analyzed using a bioinformatics approach to bin the hits in 4 categories: configuration, molecular function, cellular function, and specialized process. Prominent clusters of proteins related to the cytoskeleton, cell adhesion, and vesicular traffic have been identified. Weaker clusters of proteins associated with cell growth, cell migration, translation, and transcription are also found. However, the strongest clusters belong to synaptic proteins and signaling molecules. Localization studies of key components of synaptic transmission have confirmed the presence of both presynaptic and postsynaptic proteins at the tight junction domain. To correlate proteomics data with structure, the tight junction has been examined using electron microscopy. This has revealed many novel structures including end-on cytoskeletal attachments, vesicles fusing/budding at the tight junction membrane domain, secreted substances encased between the tight junction kisses, endocytosis of tight junction double membranes, satellite

  6. Three-dimensional Multi-probe Analysis of the Galaxy Cluster A1689

    NASA Astrophysics Data System (ADS)

    Umetsu, Keiichi; Sereno, Mauro; Medezinski, Elinor; Nonino, Mario; Mroczkowski, Tony; Diego, Jose M.; Ettori, Stefano; Okabe, Nobuhiro; Broadhurst, Tom; Lemze, Doron

    2015-06-01

    We perform a three-dimensional multi-probe analysis of the rich galaxy cluster A1689, one of the most powerful known lenses on the sky, by combining improved weak-lensing data from new wide-field $BVR_{\rm C}i'z'$ Subaru/Suprime-Cam observations with strong-lensing, X-ray, and Sunyaev-Zel’dovich effect (SZE) data sets. We reconstruct the projected matter distribution from a joint weak-lensing analysis of two-dimensional shear and azimuthally integrated magnification constraints, the combination of which allows us to break the mass-sheet degeneracy. The resulting mass distribution reveals elongation with an axis ratio of ~0.7 in projection, aligned well with the distributions of cluster galaxies and intracluster gas. When assuming a spherical halo, our full weak-lensing analysis yields a projected halo concentration of $c_{200c}^{2D} = 8.9 \pm 1.1$ ($c_{\rm vir}^{2D} \sim 11$), consistent with and improved from earlier weak-lensing work. We find excellent consistency between independent weak and strong lensing in the region of overlap. In a parametric triaxial framework, we constrain the intrinsic structure and geometry of the matter and gas distributions, by combining weak/strong lensing and X-ray/SZE data with minimal geometric assumptions. We show that the data favor a triaxial geometry with minor-major axis ratio 0.39 ± 0.15 and major axis closely aligned with the line of sight (22° ± 10°). We obtain a halo mass $M_{200c} = (1.2 \pm 0.2) \times 10^{15}\, M_\odot\, h^{-1}$ and a halo concentration $c_{200c} = 8.4 \pm 1.3$, which overlaps with the ≳ 1σ tail of the predicted distribution. The shape of the gas is rounder than the underlying matter but quite elongated with minor-major axis ratio 0.60 ± 0.14. The gas mass fraction within 0.9 Mpc is $10^{+3}_{-2}$%, a typical value for high-mass clusters. The thermal gas pressure contributes to ~60% of the equilibrium pressure, indicating a significant level of non-thermal pressure support. When compared to Planck's hydrostatic mass estimate, our

  7. Comparative analysis of a conserved zinc finger gene cluster on human chromosome 19q and mouse chromosome 7.

    PubMed

    Shannon, M; Ashworth, L K; Mucenski, M L; Lamerdin, J E; Branscomb, E; Stubbs, L

    1996-04-01

    Several lines of evidence now suggest that many of the zinc-finger-containing (ZNF) genes in the human genome are arranged in clusters. However, little is known about the structure or function of the clusters or about their conservation throughout evolution. Here, we report the analysis of a conserved ZNF gene cluster located in human chromosome 19q13.2 and mouse chromosome 7. Our results indicate that the human cluster consists of at least 10 related Kruppel-associated box (KRAB)-containing ZNF genes organized in tandem over a distance of 350-450 kb. Two cDNA clones representing genes in the murine cluster have been studied in detail. The KRAB A domains of these genes are nearly identical and are highly similar to human 19q13.2-derived KRAB sequences, but DNA-binding ZNF domains and other portions of the genes differ considerably. The two murine genes display distinct expression patterns, but are coexpressed in some adult tissues. These studies pave the way for a systematic analysis of the evolution of structure and function of genes within the numerous clustered ZNF families located on human chromosome 19 and elsewhere in the human and mouse genomes.

  8. A Constrained-Clustering Approach to the Analysis of Remote Sensing Data.

    DTIC Science & Technology

    1983-01-01

    One old and two new clustering methods were applied to the constrained-clustering problem of separating different agricultural fields based on multispectral remote sensing satellite data. (Constrained-clustering involves double clustering in multispectral measurement similarity and geographical location.) The results of applying the three methods are provided along with a discussion of their relative strengths and weaknesses and a detailed description of their algorithms.
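
    The idea of clustering jointly on measurement similarity and geographical location can be illustrated with a spatially constrained merge: pixels join the same cluster only if they are neighbours on the grid and their measurements are close. The sketch below is not one of the three methods of the report, which the abstract does not describe; the single-band values, the 4-neighbour adjacency, and the threshold are all hypothetical.

```python
def constrained_clusters(grid, threshold):
    """Merge adjacent pixels whose values differ by at most `threshold`.

    grid is a list of rows of single-band measurements (an assumed
    simplification of multispectral data). Returns a grid of cluster
    labels; equal labels mean the same cluster.
    """
    rows, cols = len(grid), len(grid[0])
    parent = list(range(rows * cols))  # union-find forest over pixels

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for r in range(rows):
        for c in range(cols):
            # Geographic constraint: only right and down neighbours.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    # Measurement constraint: values must be similar.
                    if abs(grid[r][c] - grid[rr][cc]) <= threshold:
                        parent[find(r * cols + c)] = find(rr * cols + cc)

    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]

# Hypothetical 3x3 scene containing three "fields".
scene = [
    [10, 11, 50],
    [10, 12, 51],
    [30, 31, 52],
]
labels = constrained_clusters(scene, threshold=3)
n_fields = len({label for row in labels for label in row})
```

    With real multispectral data the absolute difference would be replaced by a distance between band vectors.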

  9. Clusters in the biopharmaceutical industry: toward a new method of analysis.

    PubMed

    Erden, Zeynep; von Krogh, Georg

    2011-05-01

    Clusters are groups of co-located and interconnected firms and institutions linked by commonalities in their strategies and complementarities in their activities and resources. There are several reasons for the geographical clustering of firms in the biopharmaceutical industry. This review unpacks some advantages and disadvantages of cluster participation, and proposes a new method to enable managers and researchers to identify clusters in the biopharmaceutical industry.

  10. VizieR Online Data Catalog: Slug analysis of star clusters in NGC 628 & 7793 (Krumholz+, 2015)

    NASA Astrophysics Data System (ADS)

    Krumholz, M. R.; Adamo, A.; Fumagalli, M.; Wofford, A.; Calzetti, D.; Lee, J. C.; Whitmore, B. C.; Bright, S. N.; Grasha, K.; Gouliermis, D. A.; Kim, H.; Nair, P.; Ryon, J. E.; Smith, L. J.; Thilker, D.; Ubeda, L.; Zackrisson, E.

    2016-02-01

    In this paper we use slug, the Stochastically Lighting Up Galaxies code (da Silva et al. 2012ApJ...745..145D, 2014MNRAS.444.3275D; Krumholz et al. 2015MNRAS.452.1447K), and its post-processing tool for analysis of star cluster properties, cluster_slug, to analyze an initial sample of clusters from the LEGUS (Calzetti et al. 2015AJ....149...51C). A description of the steps required to produce final cluster catalogs of the Legacy Extragalactic UV Survey (LEGUS) targets can be found in Calzetti et al. (2015AJ....149...51C), and in A. Adamo et al. (2015, in preparation). LEGUS is an HST Cycle 21 Treasury program that is imaging 50 nearby galaxies in five broadbands with the WFC3/UVIS, from the NUV to the I band. (1 data file).

  11. A multiple imputation approach to the analysis of clustered interval-censored failure time data with the additive hazards model

    PubMed Central

    Chen, Ling; Sun, Jianguo; Xiong, Chengjie

    2016-01-01

    Clustered interval-censored failure time data can occur when the failure time of interest is collected from several clusters and known only within certain time intervals. Regression analysis of clustered interval-censored failure time data is discussed assuming that the data arise from the semiparametric additive hazards model. A multiple imputation approach is proposed for inference. A major advantage of the approach is its simplicity because it avoids estimating the correlation within clusters by implementing a resampling-based method. The presented approach can be easily implemented by using the existing software packages for right-censored failure time data. Extensive simulation studies are conducted, indicating that the proposed imputation approach performs well for practical situations. The proposed approach also performs well compared to the existing methods and can be more conveniently applied to various types of data representation. The proposed methodology is further demonstrated by applying it to a lymphatic filariasis study. PMID:27773956

  12. Data Mining of University Philanthropic Giving: Cluster-Discriminant Analysis and Pareto Effects

    ERIC Educational Resources Information Center

    Le Blanc, Louis A.; Rucks, Conway T.

    2009-01-01

    A large sample of 33,000 university alumni records were cluster-analyzed to generate six groups relatively unique in their respective attribute values. The attributes used to cluster the former students included average gift to the university's foundation and to the alumni association for the same institution. Cluster detection is useful in this…

  13. Spherical Harmonic Analysis of Particle Velocity Distribution Function: Comparison of Moments and Anisotropies using Cluster Data

    NASA Technical Reports Server (NTRS)

    Gurgiolo, Chris; Vinas, Adolfo F.

    2009-01-01

    This paper presents a spherical harmonic analysis of the plasma velocity distribution function using high-angular, energy, and time resolution Cluster data obtained from the PEACE spectrometer instrument to demonstrate how this analysis models the particle distribution function and its moments and anisotropies. The results show that spherical harmonic analysis produced a robust physical representation model of the velocity distribution function, resolving the main features of the measured distributions. From the spherical harmonic analysis, a minimum set of nine spectral coefficients was obtained from which the moment (up to the heat flux), anisotropy, and asymmetry calculations of the velocity distribution function were obtained. The spherical harmonic method provides a potentially effective "compression" technique that can be easily carried out onboard a spacecraft to determine the moments and anisotropies of the particle velocity distribution function for any species. These calculations were implemented using three different approaches, namely, the standard traditional integration, the spherical harmonic (SPH) spectral coefficients integration, and the singular value decomposition (SVD) on the spherical harmonic methods. A comparison among the various methods shows that both SPH and SVD approaches provide remarkable agreement with the standard moment integration method.

  14. EMPCA and Cluster Analysis of Quasar Spectra: Sample Preparation and Validation

    NASA Astrophysics Data System (ADS)

    Wagner, Cassidy; Leighly, Karen; Macinnis, Francis; Marrs, Adam; Richards, Gordon T.

    2017-01-01

    All quasars are fundamentally similar, powered by accretion of matter onto a supermassive black hole. However, patterns of differences can be identified through the emission lines. Quasar broad absorption lines have been postulated to be responsible for feedback in galaxy evolution. Principal component analysis (PCA) quantifies trends in emission lines of quasars that can be used to predict and reconstruct the underlying continuum in broad absorption line quasars. Richards et al. 2011 hypothesized that emission-line variance across the rest-UV spectrum is correlated with C IV blueshift and equivalent width. We fit their composite spectra, constructed based on these properties, to identify trends for the purpose of creating simulated spectra to test the weighted Expectation Maximization PCA (EMPCA; Bailey 2012) and cluster analysis method discussed in the adjacent poster by Marrs et al. More than 800 SDSS spectra from Allen et al. 2011, with a redshift range of z = 2.2 - 2.3, were selected for analysis, particularly spectra with high signal-to-noise ratios, without broad absorption lines, and without numerous narrow absorption lines. Interstellar and intergalactic absorption lines add variance that contaminates the principal components. To remove these lines, we smoothed the spectra using a Fourier transform and a low-pass filter. We then used a line-finding and -removal program to remove or flag narrow absorption lines. From the principal components that resulted from the PCA we were able to reconstruct the continua of a small sample of BAL QSOs.
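
    The smoothing step mentioned above (a Fourier transform followed by a low-pass filter) can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the naive discrete Fourier transform, the cutoff `keep`, and the synthetic "spectrum" are assumptions chosen so the example runs with the standard library alone.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out

def low_pass(flux, keep):
    """Zero every Fourier component with frequency index above `keep`."""
    n = len(flux)
    spectrum = dft(flux)
    # Indices f and n - f are a conjugate pair, so both are tested.
    filtered = [c if min(f, n - f) <= keep else 0 for f, c in enumerate(spectrum)]
    return [v.real for v in dft(filtered, inverse=True)]

# Synthetic flux: a slow trend plus a fast ripple standing in for noise.
n = 32
slow = [1 + 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
flux = [s + 0.2 * math.cos(2 * math.pi * 12 * t / n) for t, s in enumerate(slow)]
smoothed = low_pass(flux, keep=3)
max_error = max(abs(a - b) for a, b in zip(smoothed, slow))
```

    Real absorption lines are not purely high-frequency, which is presumably why the abstract pairs the filter with a separate line-finding and -removal program.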

  15. ASTM clustering for improving coal analysis by near-infrared spectroscopy.

    PubMed

    Andrés, J M; Bona, M T

    2006-11-15

    Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the error determination compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, that sample must first be assigned to its respective group. Thus, the discrimination and classification ability of coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples, but not enough to be satisfactory for every group considered.

  16. Structure and substructure analysis of DAFT/FADA galaxy clusters in the [0.4-0.9] redshift range

    NASA Astrophysics Data System (ADS)

    Guennou, L.; Adami, C.; Durret, F.; Lima Neto, G. B.; Ulmer, M. P.; Clowe, D.; LeBrun, V.; Martinet, N.; Allam, S.; Annis, J.; Basa, S.; Benoist, C.; Biviano, A.; Cappi, A.; Cypriano, E. S.; Gavazzi, R.; Halliday, C.; Ilbert, O.; Jullo, E.; Just, D.; Limousin, M.; Márquez, I.; Mazure, A.; Murphy, K. J.; Plana, H.; Rostagni, F.; Russeil, D.; Schirmer, M.; Slezak, E.; Tucker, D.; Zaritsky, D.; Ziegler, B.

    2014-01-01

    Context. The DAFT/FADA survey is based on the study of ~90 rich (masses found in the literature >2 × 10^14 M⊙) and moderately distant clusters (redshifts 0.4 < z < 0.9), all with HST imaging data available. This survey has two main objectives: to constrain dark energy (DE) using weak lensing tomography on galaxy clusters and to build a database (deep multi-band imaging allowing photometric redshift estimates, spectroscopic data, X-ray data) of rich distant clusters to study their properties. Aims: We anal