Sample records for cluster sampling procedure

  1. Sampling procedures for inventory of commercial volume tree species in Amazon Forest.

    PubMed

    Netto, Sylvio P; Pelissari, Allan L; Cysneiros, Vinicius C; Bonazza, Marcelo; Sanquetta, Carlos R

    2017-01-01

    The spatial distribution of tropical tree species can affect the consistency of the estimators in commercial forest inventories; therefore, appropriate sampling procedures are required to survey species with different spatial patterns in the Amazon Forest. The present study evaluates conventional sampling procedures and introduces adaptive cluster sampling for volumetric inventories of Amazonian tree species, considering the hypotheses that density, spatial distribution and zero-plots affect the consistency of the estimators, and that adaptive cluster sampling yields more accurate volumetric estimates. We use data from a census carried out in the Jamari National Forest, Brazil, where trees with diameters equal to or greater than 40 cm were measured in 1,355 plots. Species with different spatial patterns were selected and sampled with simple random sampling, systematic sampling, linear cluster sampling and adaptive cluster sampling, and the accuracy of the volumetric estimation and the presence of zero-plots were evaluated. The sampling procedures applied to these species were affected by the low density of trees and the large number of zero-plots; the adaptive clusters allowed the sampling effort to be concentrated in plots containing trees and thus yielded more representative samples for estimating the commercial volume.
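
    A minimal sketch of the adaptive-cluster-sampling idea described above: an initial simple random sample of plots is expanded into the neighbourhoods of any plot meeting a condition (here, containing at least one commercial-volume tree). The grid, volumes and condition are illustrative assumptions, not the authors' field protocol.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 30 x 30 grid of plots; most plots hold no commercial-volume trees.
    grid = rng.poisson(lam=0.15, size=(30, 30)) * rng.uniform(1.0, 8.0, size=(30, 30))

    def neighbours(i, j, shape):
        """4-neighbourhood of plot (i, j), clipped to the grid."""
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < shape[0] and 0 <= nj < shape[1]:
                yield ni, nj

    def adaptive_cluster_sample(grid, n_initial, condition=lambda v: v > 0, rng=rng):
        """Start from a simple random sample of plots and adaptively add the
        neighbours of every sampled plot that satisfies the condition."""
        flat = [(i, j) for i in range(grid.shape[0]) for j in range(grid.shape[1])]
        idx = rng.choice(len(flat), size=n_initial, replace=False)
        to_visit = [flat[k] for k in idx]
        sampled = set()
        while to_visit:
            plot = to_visit.pop()
            if plot in sampled:
                continue
            sampled.add(plot)
            if condition(grid[plot]):
                to_visit.extend(neighbours(*plot, grid.shape))
        return sampled

    sample = adaptive_cluster_sample(grid, n_initial=40)
    occupied = [p for p in sample if grid[p] > 0]
    print(f"plots visited: {len(sample)}, plots with trees: {len(occupied)}")
    ```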

  2. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
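
    The top-down splitting described above can be sketched as follows; this illustration substitutes an EM-fitted two-component Gaussian mixture (scikit-learn) for the Bayesian finite mixture model estimated with Hamiltonian Monte Carlo in the paper, and runs on synthetic data.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)
    # Synthetic stand-in for multivariate geochemical measurements (n samples x p elements).
    X = np.vstack([rng.normal(loc, 0.5, size=(200, 3)) for loc in (0.0, 2.0, 4.0, 6.0)])

    def split_in_two(X):
        """Partition the rows of X into two clusters with a two-component mixture model."""
        gm = GaussianMixture(n_components=2, random_state=0).fit(X)
        return gm.predict(X)

    def hierarchical_mixture_clustering(X, depth):
        """Recursively split the data in two, as in the top-down procedure described above."""
        labels = np.zeros(len(X), dtype=int)
        for level in range(depth):
            new_labels = labels.copy()
            for cluster in np.unique(labels):
                mask = labels == cluster
                if mask.sum() < 20:          # do not split very small clusters
                    continue
                sub = split_in_two(X[mask])
                new_labels[mask] = labels[mask] * 2 + sub
            labels = new_labels
        return labels

    labels = hierarchical_mixture_clustering(X, depth=2)
    print(np.bincount(labels))
    ```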

  3. On evaluating clustering procedures for use in classification

    NASA Technical Reports Server (NTRS)

    Pore, M. D.; Moritz, T. E.; Register, D. T.; Yao, S. S.; Eppler, W. G. (Principal Investigator)

    1979-01-01

    The problem of evaluating clustering algorithms and their respective computer programs for use in a preprocessing step for classification is addressed. In clustering for classification the probability of correct classification is suggested as the ultimate measure of accuracy on training data. A means of implementing this criterion and a measure of cluster purity are discussed. Examples are given. A procedure for cluster labeling that is based on cluster purity and sample size is presented.
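
    A cluster-purity measure of the kind referred to above can be computed as the share of members in each cluster that carry the cluster's majority class label; the toy cluster assignments and labels below are illustrative.

    ```python
    import numpy as np

    def cluster_purity(cluster_ids, class_labels):
        """Per-cluster purity: share of members carrying the cluster's majority class."""
        purities = {}
        for c in np.unique(cluster_ids):
            members = class_labels[cluster_ids == c]
            counts = np.bincount(members)
            purities[int(c)] = counts.max() / members.size
        return purities

    # Toy example: 3 clusters over 10 labelled training samples.
    clusters = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
    classes  = np.array([0, 0, 1, 1, 1, 1, 0, 2, 2, 2])
    print(cluster_purity(clusters, classes))   # approx {0: 0.67, 1: 0.75, 2: 1.0}
    ```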

  4. Evaluation of the procedure 1A component of the 1980 US/Canada wheat and barley exploratory experiment

    NASA Technical Reports Server (NTRS)

    Chapman, G. M. (Principal Investigator); Carnes, J. G.

    1981-01-01

    Several techniques which use clusters generated by a new clustering algorithm, CLASSY, are proposed as alternatives to random sampling to obtain greater precision in crop proportion estimation: (1) the Proportional Allocation/Relative Count Estimator (PA/RCE) uses proportional allocation of dots to clusters on the basis of cluster size and a relative count cluster-level estimate; (2) the Proportional Allocation/Bayes Estimator (PA/BE) uses proportional allocation of dots to clusters and a Bayesian cluster-level estimate; and (3) the Bayes Sequential Allocation/Bayesian Estimator (BSA/BE) uses sequential allocation of dots to clusters and a Bayesian cluster-level estimate. Clustering is an effective method for making proportion estimates. It is estimated that, to obtain the same precision with random sampling as obtained by the proportional sampling of 50 dots with an unbiased estimator, samples of 85 or 166 would need to be taken if dot sets with AI labels (integrated procedure) or ground truth labels, respectively, were input. Dot reallocation provides dot sets that are unbiased. It is recommended that these proportion estimation techniques be maintained, particularly the PA/BE because it provides the greatest precision.
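
    The proportional-allocation idea behind the PA/RCE estimator can be sketched as a stratified proportion estimate in which labelling dots are allocated to clusters in proportion to cluster size; the cluster sizes, crop fractions and dot budget below are hypothetical, and the Bayesian variants are not shown.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Hypothetical clusters: number of pixels per cluster and the (unknown) true crop fraction.
    cluster_sizes = np.array([5000, 3000, 1500, 500])
    true_fractions = np.array([0.80, 0.40, 0.10, 0.65])

    def proportional_allocation_estimate(cluster_sizes, true_fractions, n_dots=50, rng=rng):
        """Allocate labelling dots to clusters in proportion to cluster size and
        combine the per-cluster labelled proportions into a scene-level estimate."""
        weights = cluster_sizes / cluster_sizes.sum()
        dots = np.maximum(1, np.round(weights * n_dots)).astype(int)
        estimate = 0.0
        for w, n, p in zip(weights, dots, true_fractions):
            labelled_crop = rng.binomial(n, p)      # an analyst labels n dots in this cluster
            estimate += w * labelled_crop / n
        return estimate

    print(f"estimated crop proportion: {proportional_allocation_estimate(cluster_sizes, true_fractions):.3f}")
    print(f"true crop proportion:      {np.average(true_fractions, weights=cluster_sizes):.3f}")
    ```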

  5. Extending cluster Lot Quality Assurance Sampling designs for surveillance programs

    PubMed Central

    Hund, Lauren; Pagano, Marcello

    2014-01-01

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance based on the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible non-parametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. PMID:24633656
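
    The sample-size inflation described above is often approximated with the standard design effect DEFF = 1 + (m - 1) * ICC for two-stage cluster sampling; the sketch below uses that textbook approximation rather than the authors' exact non-parametric procedure, and the numbers are illustrative.

    ```python
    import math

    def inflate_lqas_sample_size(n_srs, cluster_size, icc):
        """Inflate a simple-random-sampling LQAS sample size by the usual design
        effect DEFF = 1 + (m - 1) * ICC for two-stage cluster sampling."""
        deff = 1.0 + (cluster_size - 1) * icc
        return math.ceil(n_srs * deff), deff

    n_cluster, deff = inflate_lqas_sample_size(n_srs=192, cluster_size=10, icc=0.05)
    print(f"design effect: {deff:.2f}, cluster-sample size: {n_cluster}")
    # With 10 children per village and ICC = 0.05, 192 inflates to 279 children.
    ```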

  6. Extending cluster lot quality assurance sampling designs for surveillance programs.

    PubMed

    Hund, Lauren; Pagano, Marcello

    2014-07-20

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance on the basis of the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible nonparametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. Copyright © 2014 John Wiley & Sons, Ltd.

  7. A modified procedure for mixture-model clustering of regional geochemical data

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David B.; Horton, John D.

    2014-01-01

    A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.
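
    A rough outline of the transformation-then-clustering pipeline, with ordinary PCA standing in for the robust principal component transformation and synthetic data in place of the Colorado soil survey; the isometric log-ratio transform is built from a Helmert basis.

    ```python
    import numpy as np
    from scipy.linalg import helmert
    from sklearn.decomposition import PCA
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)
    # Synthetic stand-in for element concentrations: 300 samples x 10 "elements", strictly positive.
    X = rng.lognormal(mean=0.0, sigma=0.4, size=(300, 10))
    X[:150, :3] *= 5.0                        # create two compositional groups

    def ilr(X):
        """Isometric log-ratio transform via a Helmert (orthonormal) basis."""
        clr = np.log(X) - np.log(X).mean(axis=1, keepdims=True)
        return clr @ helmert(X.shape[1]).T    # (n, D) -> (n, D-1)

    Z = ilr(X)
    scores = PCA(n_components=4).fit_transform(Z)   # ordinary PCA stands in for robust PCA
    labels = GaussianMixture(n_components=2, random_state=0).fit_predict(scores)
    print(np.bincount(labels))
    ```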

  8. Profiling Local Optima in K-Means Clustering: Developing a Diagnostic Technique

    ERIC Educational Resources Information Center

    Steinley, Douglas

    2006-01-01

    Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying sample sizes, clusters, and dimensions; (d) different multivariate…
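
    Local optima of K-means can be profiled by running the algorithm many times from single random starts and tabulating the distinct final sums of squared errors; this sketch is an illustration of that diagnostic idea, not Steinley's procedure itself.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=500, centers=4, cluster_std=2.5, random_state=0)

    # Run K-means many times from different random starts and record the local optima found.
    inertias = [
        KMeans(n_clusters=4, n_init=1, random_state=seed).fit(X).inertia_
        for seed in range(200)
    ]
    distinct = np.unique(np.round(inertias, 3))
    print(f"{len(distinct)} distinct local optima; "
          f"best SSE = {min(inertias):.1f}, worst = {max(inertias):.1f}")
    ```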

  9. Application of adaptive cluster sampling to low-density populations of freshwater mussels

    USGS Publications Warehouse

    Smith, D.R.; Villella, R.F.; Lemarie, D.P.

    2003-01-01

    Freshwater mussels appear to be promising candidates for adaptive cluster sampling because they are benthic macroinvertebrates that cluster spatially and are frequently found at low densities. We applied adaptive cluster sampling to estimate density of freshwater mussels at 24 sites along the Cacapon River, WV, where a preliminary timed search indicated that mussels were present at low density. Adaptive cluster sampling increased yield of individual mussels and detection of uncommon species; however, it did not improve precision of density estimates. Because finding uncommon species, collecting individuals of those species, and estimating their densities are important conservation activities, additional research is warranted on application of adaptive cluster sampling to freshwater mussels. However, at this time we do not recommend routine application of adaptive cluster sampling to freshwater mussel populations. The ultimate, and currently unanswered, question is how to tell when adaptive cluster sampling should be used, i.e., when is a population sufficiently rare and clustered for adaptive cluster sampling to be efficient and practical? A cost-effective procedure needs to be developed to identify biological populations for which adaptive cluster sampling is appropriate.

  10. Procedures to handle inventory cluster plots that straddle two or more conditions

    Treesearch

    Jerold T. Hahn; Colin D. MacLean; Stanford L. Arner; William A. Bechtold

    1995-01-01

    We review the relative merits and field procedures for four basic plot designs to handle forest inventory plots that straddle two or more conditions, given that subplots will not be moved. A cluster design is recommended that combines fixed-area subplots and variable-radius plot (VRP) sampling. Each subplot in a cluster consists of a large fixed-area subplot for...

  11. Testing the accuracy of clustering redshifts with simulations

    NASA Astrophysics Data System (ADS)

    Scottez, V.; Benoit-Lévy, A.; Coupon, J.; Ilbert, O.; Mellier, Y.

    2018-03-01

    We explore the accuracy of clustering-based redshift inference within the MICE2 simulation. This method uses the spatial clustering of galaxies between a spectroscopic reference sample and an unknown sample. This study gives an estimate of the reachable accuracy of this method. First, we discuss the requirements for the number of objects in the two samples, confirming that this method does not require a representative spectroscopic sample for calibration. In the context of the next generation of cosmological surveys, we estimated that the density of the Quasi Stellar Objects in BOSS allows us to reach 0.2 per cent accuracy in the mean redshift. Second, we estimate individual redshifts for galaxies in the densest regions of colour space (~30 per cent of the galaxies) without using the photometric redshifts procedure. The advantage of this procedure is threefold. It allows: (i) the use of cluster-zs for any field in astronomy, (ii) the possibility to combine photo-zs and cluster-zs to get an improved redshift estimation, (iii) the use of cluster-z to define tomographic bins for weak lensing. Finally, we explore this last option and build five cluster-z selected tomographic bins from redshift 0.2 to 1. We found a bias on the mean redshift estimate of 0.002 per bin. We conclude that cluster-z could be used as a primary redshift estimator by the next generation of cosmological surveys.

  12. Homogeneity tests of clustered diagnostic markers with applications to the BioCycle Study

    PubMed Central

    Tang, Liansheng Larry; Liu, Aiyi; Schisterman, Enrique F.; Zhou, Xiao-Hua; Liu, Catherine Chun-ling

    2014-01-01

    Diagnostic trials often require the use of a homogeneity test among several markers. Such a test may be necessary to determine the power both during the design phase and in the initial analysis stage. However, no formal method is available for the power and sample size calculation when the number of markers is greater than two and marker measurements are clustered in subjects. This article presents two procedures for testing the accuracy among clustered diagnostic markers. The first procedure is a test of homogeneity among continuous markers based on a global null hypothesis of the same accuracy. The result under the alternative provides the explicit distribution for the power and sample size calculation. The second procedure is a simultaneous pairwise comparison test based on weighted areas under the receiver operating characteristic curves. This test is particularly useful if a global difference among markers is found by the homogeneity test. We apply our procedures to the BioCycle Study designed to assess and compare the accuracy of hormone and oxidative stress markers in distinguishing women with ovulatory menstrual cycles from those without. PMID:22733707

  13. The Gap Procedure: for the identification of phylogenetic clusters in HIV-1 sequence data.

    PubMed

    Vrbik, Irene; Stephens, David A; Roger, Michel; Brenner, Bluma G

    2015-11-04

    In the context of infectious disease, sequence clustering can be used to provide important insights into the dynamics of transmission. Cluster analysis is usually performed using a phylogenetic approach whereby clusters are assigned on the basis of sufficiently small genetic distances and high bootstrap support (or posterior probabilities). The computational burden involved in this phylogenetic threshold approach is a major drawback, especially when a large number of sequences are being considered. In addition, this method requires a skilled user to specify the appropriate threshold values, which may vary widely depending on the application. This paper presents the Gap Procedure, a distance-based clustering algorithm for the classification of DNA sequences sampled from individuals infected with the human immunodeficiency virus type 1 (HIV-1). Our heuristic algorithm bypasses the need for phylogenetic reconstruction, thereby supporting the quick analysis of large genetic data sets. Moreover, this fully automated procedure relies on data-driven gaps in sorted pairwise distances to infer clusters, so no user-specified threshold values are required. The clustering results obtained by the Gap Procedure on both real and simulated data closely agree with those found using the threshold approach, while requiring only a fraction of the time to complete the analysis. Apart from the dramatic gains in computational time, the Gap Procedure is highly effective in finding distinct groups of genetically similar sequences and obviates the need for subjective user-specified values. The clusters of genetically similar sequences returned by this procedure can be used to detect patterns in HIV-1 transmission and thereby aid in the prevention, treatment and containment of the disease.
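
    The core idea, cutting at a data-driven gap in the sorted pairwise distances, can be illustrated as below on synthetic points; this is a simplified stand-in for the published Gap Procedure, with single-linkage clustering used to apply the inferred threshold.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(4)
    # Synthetic stand-in for pairwise-comparable sequence data: three well-separated groups.
    centres = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.5]])
    X = np.vstack([rng.normal(c, 0.3, size=(40, 2)) for c in centres])

    d = np.sort(pdist(X))                    # all pairwise distances, sorted
    threshold = d[np.argmax(np.diff(d))]     # distance just below the largest gap

    labels = fcluster(linkage(X, method="single"), t=threshold, criterion="distance")
    print(f"threshold = {threshold:.2f}, clusters found = {labels.max()}")
    ```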

  14. Data processing 1: Advancements in machine analysis of multispectral data

    NASA Technical Reports Server (NTRS)

    Swain, P. H.

    1972-01-01

    Multispectral data processing procedures are outlined, beginning with the data display process used to accomplish data editing and proceeding through clustering, a feature selection criterion for error probability estimation, and sample clustering and sample classification. Formulating a three-stage sampling model for evaluating crop acreage estimates allows large quantities of remote sensing data to be used effectively and represents an improvement in determining the cost-benefit relationship associated with remote sensing technology.

  15. Adaptive sampling in research on risk-related behaviors.

    PubMed

    Thompson, Steven K; Collins, Linda M

    2002-11-01

    This article introduces adaptive sampling designs to substance use researchers. Adaptive sampling is particularly useful when the population of interest is rare, unevenly distributed, hidden, or hard to reach. Examples of such populations are injection drug users, individuals at high risk for HIV/AIDS, and young adolescents who are nicotine dependent. In conventional sampling, the sampling design is based entirely on a priori information, and is fixed before the study begins. By contrast, in adaptive sampling, the sampling design adapts based on observations made during the survey; for example, drug users may be asked to refer other drug users to the researcher. In the present article several adaptive sampling designs are discussed. Link-tracing designs such as snowball sampling, random walk methods, and network sampling are described, along with adaptive allocation and adaptive cluster sampling. It is stressed that special estimation procedures taking the sampling design into account are needed when adaptive sampling has been used. These procedures yield estimates that are considerably better than conventional estimates. For rare and clustered populations adaptive designs can give substantial gains in efficiency over conventional designs, and for hidden populations link-tracing and other adaptive procedures may provide the only practical way to obtain a sample large enough for the study objectives.

  16. The Mass Function of Abell Clusters

    NASA Astrophysics Data System (ADS)

    Chen, J.; Huchra, J. P.; McNamara, B. R.; Mader, J.

    1998-12-01

    The velocity dispersion and mass functions for rich clusters of galaxies provide important constraints on models of the formation of Large-Scale Structure (e.g., Frenk et al. 1990). However, prior estimates of the velocity dispersion or mass function for galaxy clusters have been based on either very small samples of clusters (Bahcall and Cen 1993; Zabludoff et al. 1994) or large but incomplete samples (e.g., the Girardi et al. (1998) determination from a sample of clusters with more than 30 measured galaxy redshifts). In contrast, we approach the problem by constructing a volume-limited sample of Abell clusters. We collected individual galaxy redshifts for our sample from two major galaxy velocity databases, the NASA Extragalactic Database, NED, maintained at IPAC, and ZCAT, maintained at SAO. We assembled a database with velocity information for possible cluster members and then selected cluster members based on both spatial and velocity data. Cluster velocity dispersions and masses were calculated following the procedures of Danese, De Zotti, and di Tullio (1980) and Heisler, Tremaine, and Bahcall (1985), respectively. The final velocity dispersion and mass functions were analyzed in order to constrain cosmological parameters by comparison to the results of N-body simulations. Our data for the cluster sample as a whole and for the individual clusters (spatial maps and velocity histograms) in our sample are available on-line at http://cfa-www.harvard.edu/~huchra/clusters. This website will be updated as more data becomes available in the master redshift compilations, and will be expanded to include more clusters and large groups of galaxies.

  17. A comparison of unsupervised classification procedures on LANDSAT MSS data for an area of complex surface conditions in Basilicata, Southern Italy

    NASA Technical Reports Server (NTRS)

    Justice, C.; Townshend, J. (Principal Investigator)

    1981-01-01

    Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. In one procedure the entire area is clustered; in the other, a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation are provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data, and the monocluster block approach gave higher accuracies than clustering the entire area. The monocluster block approach was additionally the most economical in terms of computing time.

  18. Generating Random Samples of a Given Size Using Social Security Numbers.

    ERIC Educational Resources Information Center

    Erickson, Richard C.; Brauchle, Paul E.

    1984-01-01

    The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)

  19. CCD photometry of NGC 6101 - Another globular cluster with blue straggler stars

    NASA Technical Reports Server (NTRS)

    Sarajedini, Ata; Da Costa, G. S.

    1991-01-01

    Results are presented on CCD photometric observations of a large sample of stars in the southern globular cluster NGC 6101, and the procedures used to derive the color-magnitude (C-M) diagram of the cluster are described. No indication was found of any difference in age, at the less than 2 Gyr level, between the NGC 6101 cluster and other clusters of similar abundance, such as M92. The C-M diagram revealed a significant blue straggler population. It was found that, in NGC 6101, these stars are more centrally concentrated than the cluster subgiants of similar magnitude, indicating that the blue stragglers have larger masses. Results on the magnitude and luminosity function of the sample are consistent with the binary mass transfer or merger hypotheses for the origin of blue straggler stars.

  20. Topology in two dimensions. II - The Abell and ACO cluster catalogues

    NASA Astrophysics Data System (ADS)

    Plionis, Manolis; Valdarnini, Riccardo; Coles, Peter

    1992-09-01

    We apply a method for quantifying the topology of projected galaxy clustering to the Abell and ACO catalogues of rich clusters. We use numerical simulations to quantify the statistical bias involved in using high peaks to define the large-scale structure, and we use the results obtained to correct our observational determinations for this known selection effect and also for possible errors introduced by boundary effects. We find that the Abell cluster sample is consistent with clusters being identified with high peaks of a Gaussian random field, but that the ACO shows a slight meatball shift away from the Gaussian behavior over and above that expected purely from the high-peak selection. The most conservative explanation of this effect is that it is caused by some artefact of the procedure used to select the clusters in the two samples.

  21. The Atacama Cosmology Telescope: Physical Properties and Purity of a Galaxy Cluster Sample Selected Via the Sunyaev-Zel'Dovich Effect

    NASA Technical Reports Server (NTRS)

    Menanteau, Felipe; Gonzalez, Jorge; Juin, Jean-Baptiste; Marriage, Tobias; Reese, Erik D.; Acquaviva, Viviana; Aguirre, Paula; Appel, John Willam; Baker, Andrew J.; Barrientos, L. Felipe; et al.

    2010-01-01

    We present optical and X-ray properties for the first confirmed galaxy cluster sample selected by the Sunyaev-Zel'dovich Effect from 148 GHz maps over 455 square degrees of sky made with the Atacama Cosmology Telescope. These maps, coupled with multi-band imaging on 4-meter-class optical telescopes, have yielded a sample of 23 galaxy clusters with redshifts between 0.118 and 1.066. Of these 23 clusters, 10 are newly discovered. The selection of this sample is approximately mass limited and essentially independent of redshift. We provide optical positions, images, redshifts and X-ray fluxes and luminosities for the full sample, and X-ray temperatures of an important subset. The mass limit of the full sample is around 8.0 x 10^14 solar masses, with a number distribution that peaks around a redshift of 0.4. For the 10 highest significance SZE-selected cluster candidates, all of which are optically confirmed, the mass threshold is 1 x 10^15 solar masses and the redshift range is 0.167 to 1.066. Archival observations from Chandra, XMM-Newton, and ROSAT provide X-ray luminosities and temperatures that are broadly consistent with this mass threshold. Our optical follow-up procedure also allowed us to assess the purity of the ACT cluster sample. Eighty (one hundred) percent of the 148 GHz candidates with signal-to-noise ratios greater than 5.1 (5.7) are confirmed as massive clusters. The reported sample represents one of the largest SZE-selected samples of massive clusters over all redshifts within a cosmologically-significant survey volume, which will enable cosmological studies as well as future studies on the evolution, morphology, and stellar populations in the most massive clusters in the Universe.

  22. The Hubble Space Telescope Medium Deep Survey Cluster Sample: Methodology and Data

    NASA Astrophysics Data System (ADS)

    Ostrander, E. J.; Nichol, R. C.; Ratnatunga, K. U.; Griffiths, R. E.

    1998-12-01

    We present a new, objectively selected, sample of galaxy overdensities detected in the Hubble Space Telescope Medium Deep Survey (MDS). These clusters/groups were found using an automated procedure that involved searching for statistically significant galaxy overdensities. The contrast of the clusters against the field galaxy population is increased when morphological data are used to search around bulge-dominated galaxies. In total, we present 92 overdensities above a probability threshold of 99.5%. We show, via extensive Monte Carlo simulations, that at least 60% of these overdensities are likely to be real clusters and groups and not random line-of-sight superpositions of galaxies. For each overdensity in the MDS cluster sample, we provide a richness and the average of the bulge-to-total ratio of galaxies within each system. This MDS cluster sample potentially contains some of the most distant clusters/groups ever detected, with about 25% of the overdensities having estimated redshifts z > ~0.9. We have made this sample publicly available to facilitate spectroscopic confirmation of these clusters and help more detailed studies of cluster and galaxy evolution. We also report the serendipitous discovery of a new cluster close on the sky to the rich optical cluster Cl 0016+16 at z = 0.546. This new overdensity, HST 001831+16208, may be coincident with both an X-ray source and a radio source. HST 001831+16208 is the third cluster/group discovered near to Cl 0016+16 and appears to strengthen the claims of Connolly et al. of superclustering at high redshift.

  23. Career Decision Statuses among Portuguese Secondary School Students: A Cluster Analytical Approach

    ERIC Educational Resources Information Center

    Santos, Paulo Jorge; Ferreira, Joaquim Armando

    2012-01-01

    Career indecision is a complex phenomenon and an increasing number of authors have proposed that undecided individuals do not form a group with homogeneous characteristics. This study examines career decision statuses among a sample of 362 12th-grade Portuguese students. A cluster-analytical procedure, based on a battery of instruments designed to…

  24. The contribution of cluster and discriminant analysis to the classification of complex aquifer systems.

    PubMed

    Panagopoulos, G P; Angelopoulou, D; Tzirtzilakis, E E; Giannoulopoulos, P

    2016-10-01

    This paper presents an innovative method for the discrimination of groundwater samples into common groups representing the hydrogeological units from which they have been pumped. The method proved very efficient even in areas with complex hydrogeological regimes. It requires chemical analyses of water samples only for major ions, meaning that it is applicable to most cases worldwide. Another benefit of the method is that it gives further insight into the aquifer hydrogeochemistry, as it identifies the ions that are responsible for the discrimination of each group. The procedure begins with cluster analysis of the dataset in order to classify the samples into the corresponding hydrogeological units. The feasibility of the method is demonstrated by the fact that the samples of volcanic origin were separated into two different clusters, namely the lava units and the pyroclastic-ignimbritic aquifer. The second step is discriminant analysis of the data, which provides the functions that distinguish the groups from each other and the most significant variables that define the hydrochemical composition of each aquifer. The whole procedure was highly successful, as 94.7% of the samples were classified into the correct aquifer system. Finally, the resulting functions can be safely used to categorize samples of unknown or doubtful origin, thus improving the quality and the size of existing hydrochemical databases.
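
    The two-step procedure (cluster analysis to define groups, discriminant analysis to obtain classification functions) can be sketched on synthetic major-ion data; the ion list, group structure and Ward/LDA choices here are illustrative assumptions, not the paper's dataset or settings.

    ```python
    import numpy as np
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(5)
    ions = ["Ca", "Mg", "Na", "K", "Cl", "SO4", "HCO3"]
    # Synthetic major-ion concentrations for three hypothetical aquifer units.
    X = np.vstack([rng.normal(loc, 0.4, size=(60, len(ions))) for loc in (1.0, 2.5, 4.0)])

    # Step 1: cluster the samples into candidate hydrogeological units (Ward linkage).
    groups = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

    # Step 2: discriminant functions separating the groups, usable to classify new samples.
    lda = LinearDiscriminantAnalysis().fit(X, groups)
    print(f"resubstitution accuracy: {lda.score(X, groups):.1%}")

    new_sample = rng.normal(2.4, 0.4, size=(1, len(ions)))   # a sample of unknown origin
    print("assigned group:", lda.predict(new_sample)[0])
    ```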

  25. Sample size determination for GEE analyses of stepped wedge cluster randomized trials.

    PubMed

    Li, Fan; Turner, Elizabeth L; Preisser, John S

    2018-06-19

    In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.
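
    The block exchangeable within-cluster correlation structure named above can be assembled directly from the three correlation parameters; the sketch below builds it for a hypothetical closed-cohort design and reports its distinct eigenvalues, which the abstract notes drive the power calculation for continuous responses.

    ```python
    import numpy as np

    def block_exchangeable(n_periods, n_per_period, within_period, inter_period, within_individual):
        """Correlation matrix for a closed cohort: n_periods x n_per_period observations,
        ordered period by period, with the three correlation types from the abstract."""
        size = n_periods * n_per_period
        R = np.full((size, size), inter_period)          # different individual, different period
        for t in range(n_periods):
            block = slice(t * n_per_period, (t + 1) * n_per_period)
            R[block, block] = within_period              # different individual, same period
        for i in range(n_per_period):
            idx = np.arange(i, size, n_per_period)       # same individual across periods
            R[np.ix_(idx, idx)] = within_individual
        np.fill_diagonal(R, 1.0)
        return R

    # Hypothetical correlation values for a 4-period design with 10 subjects per cluster-period.
    R = block_exchangeable(n_periods=4, n_per_period=10,
                           within_period=0.05, inter_period=0.025, within_individual=0.4)
    print(np.unique(np.round(np.linalg.eigvalsh(R), 3)))
    ```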

  26. Two-stage sequential sampling: A neighborhood-free adaptive sampling procedure

    USGS Publications Warehouse

    Salehi, M.; Smith, D.R.

    2005-01-01

    Designing an efficient sampling scheme for a rare and clustered population is a challenging area of research. Adaptive cluster sampling, which has been shown to be viable for such a population, is based on sampling a neighborhood of units around a unit that meets a specified condition. However, the edge units produced by sampling neighborhoods have proven to limit the efficiency and applicability of adaptive cluster sampling. We propose a sampling design that is adaptive in the sense that the final sample depends on observed values, but it avoids the use of neighborhoods and the sampling of edge units. Unbiased estimators of population total and its variance are derived using Murthy's estimator. The modified two-stage sampling design is easy to implement and can be applied to a wider range of populations than adaptive cluster sampling. We evaluate the proposed sampling design by simulating sampling of two real biological populations and an artificial population for which the variable of interest took the value either 0 or 1 (e.g., indicating presence and absence of a rare event). We show that the proposed sampling design is more efficient than conventional sampling in nearly all cases. The approach used to derive estimators (Murthy's estimator) opens the door for unbiased estimators to be found for similar sequential sampling designs. © 2005 American Statistical Association and the International Biometric Society.

  27. Sample size estimation for alternating logistic regressions analysis of multilevel randomized community trials of under-age drinking.

    PubMed

    Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark

    2012-07-01

    Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure.

  28. Optimizing disinfection by-product monitoring points in a distribution system using cluster analysis.

    PubMed

    Delpla, Ianis; Florea, Mihai; Pelletier, Geneviève; Rodriguez, Manuel J

    2018-06-04

    Trihalomethanes (THMs) and haloacetic acids (HAAs) are the main groups of disinfection by-products (DBPs) detected in drinking water and are consequently strictly regulated. However, the increasing quantity of DBP data produced by research projects and regulatory programs remains largely unexploited, despite its great potential for optimizing drinking water quality monitoring to meet specific objectives. In this work, we developed a procedure to optimize locations and periods for DBP monitoring based on a set of monitoring scenarios using the cluster analysis technique. The optimization procedure used a robust set of spatio-temporal monitoring results on DBPs (THMs and HAAs) generated from intensive sampling campaigns conducted in a residential sector of a water distribution system. Results show that cluster analysis allows water quality to be classified into different groups of THMs and HAAs according to their similarities, and locations presenting water quality concerns to be identified. By using cluster analysis with different monitoring objectives, this work provides a set of monitoring solutions and a comparison between various monitoring scenarios for decision-making purposes. Finally, it was demonstrated that data from intensive monitoring of free chlorine residual and water temperature as DBP proxy parameters, when processed using cluster analysis, could also help identify the optimal sampling points and periods for regulatory THM and HAA monitoring. Copyright © 2018 Elsevier Ltd. All rights reserved.

  29. HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

    NASA Astrophysics Data System (ADS)

    Schellenberger, G.; Reiprich, T. H.

    2017-08-01

    The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias relative to our hydrostatic masses; possible biases in this mass-mass comparison are discussed, including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from the literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).

  30. [Study on procedure of seed quality testing and seed grading scale of Phellodendron amurense].

    PubMed

    Liu, Yanlu; Zhang, Zhao; Dai, Lingchao; Zhang, Bengang; Zhang, Xiaoling; Wang, Han

    2011-12-01

    The objective was to establish a seed quality testing procedure and a seed grading scale for Phellodendron amurense. Seed quality testing methods were developed, covering sampling, seed purity, weight per 1,000 seeds, seed moisture, seed viability and germination rate. Data from 62 seed specimens of P. amurense were analyzed by cluster analysis. A seed quality test procedure was developed, and a seed quality grading scale was formulated.

  31. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

    PubMed

    Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

    2014-01-01

    Meta-analysis of previous studies evaluating associations between the content of the elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples came from 45 Greenlandic Inuit, median age 60 years, and from 71 Danes, median age 61 years; cirrhotic liver samples came from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis was applied, creating a dual dendrogram: one dendrogram clusters the samples according to calculated similarities in element content, the other clusters the elements according to the correlation coefficients between element contents, both using Euclidean distance and Ward's procedure. The first dendrogram separated the subjects into 7 clusters showing no differences in ethnicity, gender or age, and the analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram grouped the elements into four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; and Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse element contents separately, one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.

  32. Gene expression pattern recognition algorithm inferences to classify samples exposed to chemical agents

    NASA Astrophysics Data System (ADS)

    Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia

    2002-06-01

    We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples following exposure for 24 hours with phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes clustered specific gene expression profiles that separated samples based on exposure to a particular class of compound.

  33. Sexual Abuse among Female High School Students in Istanbul, Turkey

    ERIC Educational Resources Information Center

    Alikasifoglu, Mujgan; Erginoz, Ethem; Ercan, Oya; Albayrak-Kaymak, Deniz; Uysal, Omer; Ilter, Ozdemir

    2006-01-01

    Objective: The objective of the study was to determine the prevalence of sexual abuse in female adolescents in Istanbul, Turkey from data collected as part of a school-based population study on health and health behaviors. Method: A stratified cluster sampling procedure was used for this cross-sectional study. The study sample included 1,955…

  34. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    PubMed Central

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results: Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  35. Adaptive sampling in behavioral surveys.

    PubMed

    Thompson, S K

    1997-01-01

    Studies of populations such as drug users encounter difficulties because the members of the populations are rare, hidden, or hard to reach. Conventionally designed large-scale surveys detect relatively few members of the populations so that estimates of population characteristics have high uncertainty. Ethnographic studies, on the other hand, reach suitable numbers of individuals only through the use of link-tracing, chain referral, or snowball sampling procedures that often leave the investigators unable to make inferences from their sample to the hidden population as a whole. In adaptive sampling, the procedure for selecting people or other units to be in the sample depends on variables of interest observed during the survey, so the design adapts to the population as encountered. For example, when self-reported drug use is found among members of the sample, sampling effort may be increased in nearby areas. Types of adaptive sampling designs include ordinary sequential sampling, adaptive allocation in stratified sampling, adaptive cluster sampling, and optimal model-based designs. Graph sampling refers to situations with nodes (for example, people) connected by edges (such as social links or geographic proximity). An initial sample of nodes or edges is selected and edges are subsequently followed to bring other nodes into the sample. Graph sampling designs include network sampling, snowball sampling, link-tracing, chain referral, and adaptive cluster sampling. A graph sampling design is adaptive if the decision to include linked nodes depends on variables of interest observed on nodes already in the sample. Adjustment methods for nonsampling errors such as imperfect detection of drug users in the sample apply to adaptive as well as conventional designs.

  36. A sampling design framework for monitoring secretive marshbirds

    USGS Publications Warehouse

    Johnson, D.H.; Gibbs, J.P.; Herzog, M.; Lor, S.; Niemuth, N.D.; Ribic, C.A.; Seamans, M.; Shaffer, T.L.; Shriver, W.G.; Stehman, S.V.; Thompson, W.L.

    2009-01-01

    A framework for a sampling plan for monitoring marshbird populations in the contiguous 48 states is proposed here. The sampling universe is the breeding habitat (i.e. wetlands) potentially used by marshbirds. Selection protocols would be implemented within each of large geographical strata, such as Bird Conservation Regions. Site selection will be done using a two-stage cluster sample. Primary sampling units (PSUs) would be land areas, such as legal townships, and would be selected by a procedure such as systematic sampling. Secondary sampling units (SSUs) will be wetlands or portions of wetlands in the PSUs. SSUs will be selected by a randomized spatially balanced procedure. For analysis, the use of a variety of methods as a means of increasing confidence in conclusions that may be reached is encouraged. Additional effort will be required to work out details and implement the plan.
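
    A simplified sketch of the proposed two-stage design: a systematic sample of PSUs (townships) followed by a random draw of SSUs (wetlands) within each selected PSU. The frame sizes are hypothetical and a plain random draw stands in for the randomized spatially balanced SSU selection.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    # Hypothetical frame: 400 townships (PSUs), each containing a varying number of wetlands (SSUs).
    n_psu = 400
    wetlands_per_psu = rng.integers(2, 30, size=n_psu)

    # Stage 1: systematic sample of PSUs with a random start.
    step = 20
    start = rng.integers(step)
    psu_sample = np.arange(start, n_psu, step)

    # Stage 2: within each selected PSU, draw up to 3 wetlands at random
    # (a simple stand-in for a randomized spatially balanced selection).
    ssu_sample = {
        int(p): sorted(rng.choice(wetlands_per_psu[p], size=min(3, wetlands_per_psu[p]), replace=False))
        for p in psu_sample
    }
    print(f"{len(psu_sample)} townships selected; "
          f"{sum(len(v) for v in ssu_sample.values())} wetlands to be surveyed")
    ```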

  37. Cluster analysis of molecular simulation trajectories for systems where both conformation and orientation of the sampled states are important.

    PubMed

    Abramyan, Tigran M; Snyder, James A; Thyparambil, Aby A; Stuart, Steven J; Latour, Robert A

    2016-08-05

    Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
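
    Objective selection of a clustering configuration can be illustrated by scanning several agglomerative linkages and cluster counts and keeping the combination with the best validation score; the sketch below uses the silhouette index on synthetic descriptors and is not the authors' specific validation procedure or their orientation/conformation feature construction.

    ```python
    import numpy as np
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    # Synthetic stand-in for per-frame descriptors (e.g., conformation + orientation features).
    X, _ = make_blobs(n_samples=600, centers=5, cluster_std=1.0, random_state=0)

    best = None
    for linkage in ("ward", "average", "complete"):
        for k in range(2, 9):
            labels = AgglomerativeClustering(n_clusters=k, linkage=linkage).fit_predict(X)
            score = silhouette_score(X, labels)
            if best is None or score > best[0]:
                best = (score, linkage, k)

    score, linkage, k = best
    print(f"best combination: {linkage} linkage, k = {k} (silhouette = {score:.2f})")
    ```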

  38. Clustering analysis of proteins from microbial genomes at multiple levels of resolution.

    PubMed

    Zaslavsky, Leonid; Ciufo, Stacy; Fedorov, Boris; Tatusova, Tatiana

    2016-08-31

    Microbial genomes at the National Center for Biotechnology Information (NCBI) represent a large collection of more than 35,000 assemblies. There are several complexities associated with the data: a great variation in sampling density, since human pathogens are densely sampled while other bacteria are less represented; different protein families occur in annotations with different frequencies; and the quality of genome annotation varies greatly. In order to extract useful information from these sophisticated data, the analysis needs to be performed at multiple levels of phylogenomic resolution and protein similarity, with an adequate sampling strategy. Protein clustering is used to construct meaningful and stable groups of similar proteins to be used for analysis and functional annotation. Our approach is to create protein clusters at three levels. First, tight clusters in groups of closely-related genomes (species-level clades) are constructed using a combined approach that takes into account both sequence similarity and genome context. Second, clustroids of conservative in-clade clusters are organized into seed global clusters. Finally, global protein clusters are built around the seed clusters. We propose filtering strategies that limit the protein set included in global clustering. The in-clade clustering procedure, the subsequent selection of clustroids and the organization into seed global clusters provide a robust representation and a high rate of compression. Seed protein clusters are further extended by adding related proteins. Extended seed clusters include a significant part of the data and represent all major known cell machinery. The remaining part, coming from either non-conservative (unique) or rapidly evolving proteins, from rare genomes, or resulting from low-quality annotation, does not group together well. Processing these proteins requires significant computational resources and results in a large number of questionable clusters. The developed filtering strategies allow such peripheral proteins to be identified and excluded, limiting the protein dataset used in global clustering. Overall, the proposed methodology allows relevant data to be obtained at different levels of detail and data redundancy to be eliminated, while keeping biologically interesting variation.

  39. Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

    PubMed

    Austin, Peter C

    2010-04-22

    Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.

  40. Comparative study of two protocols for quantitative image-analysis of serotonin transporter clustering in lymphocytes, a putative biomarker of therapeutic efficacy in major depression.

    PubMed

    Romay-Tallon, Raquel; Rivera-Baltanas, Tania; Allen, Josh; Olivares, Jose M; Kalynchuk, Lisa E; Caruncho, Hector J

    2017-01-01

    The pattern of serotonin transporter clustering on the plasma membrane of lymphocytes extracted from human whole blood samples has been identified as a putative biomarker of therapeutic efficacy in major depression. Here we evaluated the possibility of performing a similar analysis using blood smears obtained from rats, and from control human subjects and depression patients. We hypothesized that we could optimize a protocol to make the analysis of serotonin protein clustering in blood smears comparable to the analysis of serotonin protein clustering using isolated lymphocytes. Our data indicate that blood smears require a longer fixation time and longer times of incubation with primary and secondary antibodies. In addition, one needs to optimize the image analysis settings for the analysis of smears. When these steps are followed, the quantitative analysis of both the number and size of serotonin transporter clusters on the plasma membrane of lymphocytes is similar using both blood smears and isolated lymphocytes. The development of this novel protocol will greatly facilitate the collection of appropriate samples by eliminating the necessity and cost of specialized personnel for drawing blood samples, and by being a less invasive procedure. Therefore, this protocol will help us advance the validation of membrane protein clustering in lymphocytes as a biomarker of therapeutic efficacy in major depression, and bring it closer to its clinical application.

  1. Properties of star clusters - I. Automatic distance and extinction estimates

    NASA Astrophysics Data System (ADS)

    Buckner, Anne S. M.; Froebrich, Dirk

    2013-12-01

    Determining star cluster distances is essential to analyse their properties and distribution in the Galaxy. In particular, it is desirable to have a reliable, purely photometric distance estimation method for large samples of newly discovered cluster candidates e.g. from the Two Micron All Sky Survey, the UK Infrared Deep Sky Survey Galactic Plane Survey and VVV. Here, we establish an automatic method to estimate distances and reddening from near-infrared photometry alone, without the use of isochrone fitting. We employ a decontamination procedure of JHK photometry to determine the density of stars foreground to clusters and a galactic model to estimate distances. We then calibrate the method using clusters with known properties. This allows us to establish distance estimates with better than 40 per cent accuracy. We apply our method to determine the extinction and distance values to 378 known open clusters and 397 cluster candidates from the list of Froebrich, Scholz & Raftery. We find that the sample is biased towards clusters of a distance of approximately 3 kpc, with typical distances between 2 and 6 kpc. Using the cluster distances and extinction values, we investigate how the average extinction per kiloparsec distance changes as a function of the Galactic longitude. We find a systematic dependence that can be approximated by AH(l) [mag kpc-1] = 0.10 + 0.001 × |l - 180°|/° for regions more than 60° from the Galactic Centre.
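
    The quoted longitude dependence of the extinction can be evaluated directly; a small Python helper (the function name is mine) makes the relation concrete.

    ```python
    def extinction_per_kpc(l_deg):
        """A_H per kpc [mag/kpc] as a function of Galactic longitude (degrees),
        using the relation quoted in the abstract, stated to hold for regions
        more than 60 degrees from the Galactic Centre (|l - 180| <= 120)."""
        return 0.10 + 0.001 * abs(l_deg - 180.0)

    print(extinction_per_kpc(90.0))   # 0.19 mag/kpc
    print(extinction_per_kpc(180.0))  # 0.10 mag/kpc towards the anticentre
    ```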

  2. WAIS-III index score profiles in the Canadian standardization sample.

    PubMed

    Lange, Rael T

    2007-01-01

    Representative index score profiles were examined in the Canadian standardization sample of the Wechsler Adult Intelligence Scale-Third Edition (WAIS-III). The identification of profile patterns was based on the methodology proposed by Lange, Iverson, Senior, and Chelune (2002) that aims to maximize the influence of profile shape and minimize the influence of profile magnitude on the cluster solution. A two-step cluster analysis procedure was used (i.e., hierarchical and k-means analyses). Cluster analysis of the four index scores (i.e., Verbal Comprehension [VCI], Perceptual Organization [POI], Working Memory [WMI], Processing Speed [PSI]) identified six profiles in this sample. Profiles were differentiated by pattern of performance and were primarily characterized as (a) high VCI/POI, low WMI/PSI, (b) low VCI/POI, high WMI/PSI, (c) high PSI, (d) low PSI, (e) high VCI/WMI, low POI/PSI, and (f) low VCI, high POI. These profiles are potentially useful for determining whether a patient's WAIS-III performance is unusual in a normal population.
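
    A generic two-step procedure of this kind (hierarchical clustering to seed k-means) can be sketched in a few lines of Python. The example below uses synthetic index scores and row-centres each profile as a simple stand-in for the shape-weighting approach of Lange et al.; it is an illustration, not the study's actual analysis.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    scores = rng.normal(100, 15, size=(400, 4))      # synthetic VCI, POI, WMI, PSI scores

    # Row-centre each profile so clustering reflects shape rather than overall level.
    shape = scores - scores.mean(axis=1, keepdims=True)

    # Step 1: Ward hierarchical clustering gives provisional groups and centroids.
    k = 6
    labels_hier = fcluster(linkage(shape, method="ward"), t=k, criterion="maxclust")
    centroids = np.vstack([shape[labels_hier == g].mean(axis=0) for g in range(1, k + 1)])

    # Step 2: k-means seeded with the hierarchical centroids refines the partition.
    km = KMeans(n_clusters=k, init=centroids, n_init=1, random_state=0).fit(shape)
    print(np.bincount(km.labels_))
    ```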

  3. The “UV-route” to Search for Blue Straggler Stars in Globular Clusters: First Results from the HST UV Legacy Survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Raso, S.; Ferraro, F. R.; Lanzoni, B.

    We used data from the Hubble Space Telescope UV Legacy Survey of Galactic Globular Clusters to select the Blue Straggler Star (BSS) population in four intermediate/high density systems (namely NGC 2808, NGC 6388, NGC 6541, and NGC 7078) through a “UV-guided search.” This procedure consists of using the F275W images in each cluster to construct the master list of detected sources, which is then forced onto the images acquired in the other filters. Such an approach optimizes the detection of relatively hot stars and allows the detection of a complete sample of BSSs even in the central region of high-density clusters, because the light from the bright cool giants, which dominates the optical emission in old stellar systems, is considerably reduced at UV wavelengths. Our UV-guided selections of BSSs have been compared to the samples obtained in previous, optical-driven surveys, clearly demonstrating the efficiency of the UV approach. In each cluster we also measured the parameter A+, defined as the area enclosed between the cumulative radial distribution of BSSs and that of a reference population, which traces the level of BSS central segregation and the level of dynamical evolution suffered by the system. The values measured for the four clusters studied in this paper nicely fall along the dynamical sequence recently presented for a sample of 25 clusters.
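
    A simplified version of the A+ statistic can be computed as the area between two empirical cumulative radial distributions. The Python sketch below uses toy radial samples and integrates in log radius; the grid, radial range, and normalisation are simplifying assumptions rather than the authors' exact definition.

    ```python
    import numpy as np

    def a_plus(r_bss, r_ref, r_max):
        """Area between the cumulative radial distributions of BSSs and a
        reference population, integrated in log10(r / r_max); a simplified
        version of the segregation parameter described above."""
        x = np.linspace(-2.0, 0.0, 500)              # log10(r / r_max) grid
        r_grid = r_max * 10.0 ** x
        cdf = lambda r, grid: np.searchsorted(np.sort(r), grid, side="right") / r.size
        dx = x[1] - x[0]
        return np.sum(cdf(r_bss, r_grid) - cdf(r_ref, r_grid)) * dx

    rng = np.random.default_rng(2)
    r_ref = rng.uniform(0, 1, 2000) ** 0.5           # toy reference population
    r_bss = rng.uniform(0, 1, 200) ** 0.8            # toy, more centrally concentrated BSSs
    print(a_plus(r_bss, r_ref, r_max=1.0))           # positive => BSSs more segregated
    ```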

  4. Tailoring magnetic properties of Co nanocluster assembled films using hydrogen

    NASA Astrophysics Data System (ADS)

    Romero, C. P.; Volodin, A.; Paddubrouskaya, H.; Van Bael, M. J.; Van Haesendonck, C.; Lievens, P.

    2018-07-01

    Tailoring magnetic properties in nanocluster assembled cobalt (Co) thin films was achieved by admitting a small percentage of H2 gas (∼2%) into the Co gas phase cluster formation chamber prior to deposition. The oxygen content in the films is considerably reduced by the presence of hydrogen during the cluster formation, leading to enhanced magnetic interactions between clusters. Two sets of Co samples were fabricated, one without hydrogen gas and one with hydrogen gas. Magnetic properties of the non-hydrogenated and the hydrogen-treated Co nanocluster assembled films are comparatively studied using magnetic force microscopy and vibrating sample magnetometry. When comparing the two sets of samples the considerably larger coercive field of the H2-treated Co nanocluster film and the extended micrometer-sized magnetic domain structure confirm the enhancement of magnetic interactions between clusters. The thickness of the antiferromagnetic CoO layer is controlled with this procedure and modifies the exchange bias effect in these films. The exchange bias shift is lower for the H2-treated Co nanocluster film, which indicates that a thinner antiferromagnetic CoO reduces the coupling with the ferromagnetic Co. The hydrogen-treatment method can be used to tailor the oxidation levels thus controlling the magnetic properties of ferromagnetic cluster-assembled films.

  5. A Comparison of Heuristic Procedures for Minimum within-Cluster Sums of Squares Partitioning

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Steinley, Douglas

    2007-01-01

    Perhaps the most common criterion for partitioning a data set is the minimization of the within-cluster sums of squared deviation from cluster centroids. Although optimal solution procedures for within-cluster sums of squares (WCSS) partitioning are computationally feasible for small data sets, heuristic procedures are required for most practical…
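
    The WCSS criterion itself is straightforward to express in code; the Python sketch below evaluates it for an arbitrary partition of hypothetical data (heuristics such as k-means or exchange methods then try to drive this quantity down).

    ```python
    import numpy as np

    def wcss(X, labels):
        """Within-cluster sum of squared deviations from cluster centroids."""
        total = 0.0
        for g in np.unique(labels):
            pts = X[labels == g]
            total += ((pts - pts.mean(axis=0)) ** 2).sum()
        return total

    rng = np.random.default_rng(3)
    X = rng.normal(size=(100, 2))
    labels = rng.integers(0, 3, size=100)            # an arbitrary 3-group partition
    print(wcss(X, labels))                           # heuristics try to minimize this
    ```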

  6. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR)

    PubMed Central

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-01-01

    Background Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. Conclusion This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100
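
    The idea of deliberately choosing a set of clusters whose pooled characteristics track the whole population can be illustrated with a simple greedy rule. The Python sketch below is only an illustration under assumed cluster counts for one two-category determinant; it is not the selection procedure used in the Vientiane survey.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    # Hypothetical clusters: rows = clusters, columns = counts in two categories
    # of a determinant (e.g., literate / not literate).
    counts = rng.integers(20, 200, size=(40, 2)).astype(float)
    target = counts.sum(axis=0) / counts.sum()       # population proportions

    def greedy_select(counts, target, m):
        """Greedily add the cluster that keeps the pooled category proportions
        closest (in L1 distance) to the population proportions."""
        chosen, pooled = [], np.zeros(counts.shape[1])
        for _ in range(m):
            best, best_d = None, np.inf
            for i in range(len(counts)):
                if i in chosen:
                    continue
                p = (pooled + counts[i]) / (pooled + counts[i]).sum()
                d = np.abs(p - target).sum()
                if d < best_d:
                    best, best_d = i, d
            chosen.append(best)
            pooled += counts[best]
        return chosen, pooled / pooled.sum()

    chosen, achieved = greedy_select(counts, target, m=8)
    print(chosen, achieved, target)
    ```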

  7. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR).

    PubMed

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-06-01

    Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy.

  8. Statistical Significance for Hierarchical Clustering

    PubMed Central

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990
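
    For a single two-way split, a Monte Carlo test in the spirit of this approach compares a 2-means cluster index on the data with its distribution under a single Gaussian null fitted to the data. The Python sketch below is a simplified illustration on synthetic data, not the authors' sequential procedure.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_index(X):
        """Ratio of 2-means within-cluster SS to total SS (smaller = stronger split)."""
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
        return km.inertia_ / ((X - X.mean(axis=0)) ** 2).sum()

    def split_pvalue(X, n_sim=200, seed=0):
        """Monte Carlo p-value for one split under a single multivariate normal null."""
        rng = np.random.default_rng(seed)
        obs = cluster_index(X)
        mean, cov = X.mean(axis=0), np.cov(X, rowvar=False)
        null = [cluster_index(rng.multivariate_normal(mean, cov, size=len(X)))
                for _ in range(n_sim)]
        return (1 + sum(ci <= obs for ci in null)) / (n_sim + 1)

    rng = np.random.default_rng(5)
    X = np.vstack([rng.normal(0, 1, (60, 5)), rng.normal(3, 1, (60, 5))])
    print(split_pvalue(X))   # small p-value: two-group structure unlikely under the null
    ```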

  9. A Unique Four-Hub Protein Cluster Associates to Glioblastoma Progression

    PubMed Central

    Simeone, Pasquale; Trerotola, Marco; Urbanella, Andrea; Lattanzio, Rossano; Ciavardelli, Domenico; Di Giuseppe, Fabrizio; Eleuterio, Enrica; Sulpizio, Marilisa; Eusebi, Vincenzo; Pession, Annalisa; Piantelli, Mauro; Alberti, Saverio

    2014-01-01

    Gliomas are the most frequent brain tumors. Among them, glioblastomas are malignant and largely resistant to available treatments. Histopathology is the gold standard for classification and grading of brain tumors. However, brain tumor heterogeneity is remarkable and histopathology procedures for glioma classification remain unsatisfactory for predicting disease course as well as response to treatment. Proteins that tightly associate with cancer differentiation and progression can bear important prognostic information. Here, we describe the identification of protein clusters differentially expressed in high-grade versus low-grade gliomas. Tissue samples from 25 high-grade tumors, 10 low-grade tumors and 5 normal brain cortices were analyzed by 2D-PAGE and proteomic profiling by mass spectrometry. This led to the identification of 48 differentially expressed protein markers between tumors and normal samples. Protein clustering by multivariate analyses (PCA and PLS-DA) provided discrimination between pathological samples to an unprecedented extent, and revealed a unique network of deranged proteins. We discovered a novel glioblastoma control module centered on four major network hubs: Huntingtin, HNF4α, c-Myc and 14-3-3ζ. Immunohistochemistry, western blotting and unbiased proteome-wide meta-analysis revealed altered expression of this glioblastoma control module in human glioma samples as compared with normal controls. Moreover, the four-hub network was found to cross-talk with both p53 and EGFR pathways. In summary, the findings of this study indicate the existence of a unifying signaling module controlling glioblastoma pathogenesis and malignant progression, and suggest novel targets for development of diagnostic and therapeutic procedures. PMID:25050814

  10. Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering.

    PubMed

    Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan

    2017-03-01

    Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2016, The International Biometric Society.
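
    A generic max-type statistic for comparing two sample covariance matrices standardizes each entrywise difference by an estimate of its variance and takes the maximum. The Python sketch below follows that general form; it is not necessarily the exact statistic implemented in HDtest, and the reference distribution (a Gumbel-type limit or permutations) is left out.

    ```python
    import numpy as np

    def cov_diff_max_stat(X, Y):
        """Max over entries of the standardized squared difference between the
        two sample covariance matrices (a generic max-type two-sample statistic)."""
        def cov_and_var(Z):
            n = Z.shape[0]
            Zc = Z - Z.mean(axis=0)
            S = Zc.T @ Zc / n                                 # sample covariance
            # theta[i, j]: empirical variance of the products Zc[:, i] * Zc[:, j]
            prod = Zc[:, :, None] * Zc[:, None, :]            # shape (n, p, p)
            theta = ((prod - S) ** 2).mean(axis=0)
            return S, theta / n
        S1, v1 = cov_and_var(X)
        S2, v2 = cov_and_var(Y)
        return np.max((S1 - S2) ** 2 / (v1 + v2))

    rng = np.random.default_rng(6)
    X, Y = rng.normal(size=(80, 30)), rng.normal(size=(90, 30))
    print(cov_diff_max_stat(X, Y))   # compare against a null distribution or permutations
    ```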

  11. Kappa statistic for clustered matched-pair data.

    PubMed

    Yang, Zhao; Zhou, Ming

    2014-07-10

    Kappa statistic is widely used to assess the agreement between two procedures in the independent matched-pair data. For matched-pair data collected in clusters, on the basis of the delta method and sampling techniques, we propose a nonparametric variance estimator for the kappa statistic without within-cluster correlation structure or distributional assumptions. The results of an extensive Monte Carlo simulation study demonstrate that the proposed kappa statistic provides consistent estimation and the proposed variance estimator behaves reasonably well for at least a moderately large number of clusters (e.g., K ≥50). Compared with the variance estimator ignoring dependence within a cluster, the proposed variance estimator performs better in maintaining the nominal coverage probability when the intra-cluster correlation is fair (ρ ≥0.3), with more pronounced improvement when ρ is further increased. To illustrate the practical application of the proposed estimator, we analyze two real data examples of clustered matched-pair data. Copyright © 2014 John Wiley & Sons, Ltd.
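
    The kappa statistic for binary matched pairs is simple to compute; the clustered part of the problem concerns its variance. The Python sketch below computes kappa on synthetic clustered data and, in place of the delta-method estimator proposed in the paper, uses a cluster bootstrap for the standard error as a stand-in.

    ```python
    import numpy as np

    def kappa(a, b):
        """Cohen's kappa for two binary ratings of the same subjects."""
        po = np.mean(a == b)
        pe = np.mean(a) * np.mean(b) + (1 - np.mean(a)) * (1 - np.mean(b))
        return (po - pe) / (1 - pe)

    def cluster_bootstrap_se(a, b, cluster, n_boot=500, seed=0):
        """Resample whole clusters with replacement to respect within-cluster correlation."""
        rng = np.random.default_rng(seed)
        ids = np.unique(cluster)
        stats = []
        for _ in range(n_boot):
            pick = rng.choice(ids, size=ids.size, replace=True)
            idx = np.concatenate([np.flatnonzero(cluster == c) for c in pick])
            stats.append(kappa(a[idx], b[idx]))
        return np.std(stats, ddof=1)

    rng = np.random.default_rng(7)
    cluster = np.repeat(np.arange(60), 5)                    # 60 clusters of 5 pairs
    latent = rng.normal(size=60)[cluster] + rng.normal(size=300)
    a = (latent > 0).astype(int)
    b = (latent + rng.normal(scale=0.8, size=300) > 0).astype(int)
    print(kappa(a, b), cluster_bootstrap_se(a, b, cluster))
    ```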

  12. VizieR Online Data Catalog: WINGS: Deep optical phot. of 77 nearby clusters (Varela+, 2009)

    NASA Astrophysics Data System (ADS)

    Varela, J.; D'Onofrio, M.; Marmo, C.; Fasano, G.; Bettoni, D.; Cava, A.; Couch, J. W.; Dressler, A.; Kjaergaard, P.; Moles, M.; Pignatelli, E.; Poggianti, M. B.; Valentinuzzi, T.

    2009-05-01

    This is the second paper of a series devoted to the WIde Field Nearby Galaxy-cluster Survey (WINGS). WINGS is a long-term project which is gathering wide-field, multi-band imaging and spectroscopy of galaxies in a complete sample of 77 X-ray selected, nearby clusters (0.04 < z < 0.07) located at high galactic latitudes (|b| > 20 deg). The main goal of this project is to establish a local reference for evolutionary studies of galaxies and galaxy clusters. This paper presents the optical (B,V) photometric catalogs of the WINGS sample and describes the procedures followed to construct them. We have paid special care to correctly treat the large extended galaxies (which include the brightest cluster galaxies) and to reduce the influence of the bright halos of very bright stars. We have constructed photometric catalogs based on wide-field images in the B and V bands using SExtractor. Photometry has been performed on images in which large galaxies and halos of bright stars were removed after modeling them with elliptical isophotes. We publish deep optical photometric catalogs (90% complete at V ≈ 21.7, which translates to ~ MV* + 6 at the mean redshift), giving positions, geometrical parameters, and several total and aperture magnitudes for all the objects detected. For each field we have produced three catalogs containing galaxies, stars and objects of "unknown" classification (~16%). From simulations we found that the uncertainty of our photometry is quite dependent on the light profile of the objects, with stars having the most robust photometry and de Vaucouleurs profiles showing higher uncertainties as well as an additional bias of about -0.2 mag. The star/galaxy classification of the bright objects (V<20) was checked visually, making the fraction of misclassified objects negligible. For fainter objects, we found that simulations do not provide reliable estimates of the possible misclassification, and we have therefore compared our data with deep galaxy counts and with star counts from models of our Galaxy. Both sets turned out to be consistent with our data within ~5% (in the ratio galaxies/total) up to V~24. Finally, we remark that the application of our special procedure to remove large halos improves the photometry of the large galaxies in our sample with respect to the use of blind automatic procedures and increases (~16%) the detection rate of objects projected onto them. (4 data files).

  13. Shape analysis of H II regions - I. Statistical clustering

    NASA Astrophysics Data System (ADS)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-07-01

    We present here our shape analysis method for a sample of 76 Galactic H II regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorize H II regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordinance technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionized by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilizing synthetic observations from numerical simulations of H II regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  14. Shape Analysis of HII Regions - I. Statistical Clustering

    NASA Astrophysics Data System (ADS)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordinance technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  15. Somatotyping using 3D anthropometry: a cluster analysis.

    PubMed

    Olds, Tim; Daniell, Nathan; Petkov, John; David Stewart, Arthur

    2013-01-01

    Somatotyping is the quantification of human body shape, independent of body size. Hitherto, somatotyping (including the most popular method, the Heath-Carter system) has been based on subjective visual ratings, sometimes supported by surface anthropometry. This study used data derived from three-dimensional (3D) whole-body scans as inputs for cluster analysis to objectively derive clusters of similar body shapes. Twenty-nine dimensions normalised for body size were measured on a purposive sample of 301 adults aged 17-56 years who had been scanned using a Vitus Smart laser scanner. K-means cluster analysis with v-fold cross-validation was used to determine shape clusters. Three male and three female clusters emerged, and were visualised using those scans closest to the cluster centroid and a caricature defined by doubling the difference between the average scan and the cluster centroid. The male clusters were decidedly endomorphic (high fatness), ectomorphic (high linearity), and endo-mesomorphic (a mixture of fatness and muscularity). The female clusters were clearly endomorphic, ectomorphic, and ecto-mesomorphic (a mixture of linearity and muscularity). An objective shape quantification procedure combining 3D scanning and cluster analysis yielded shape clusters strikingly similar to traditional somatotyping.

  16. VizieR Online Data Catalog: Star clusters distances and extinctions (Buckner+, 2013)

    NASA Astrophysics Data System (ADS)

    Buckner, A. S. M.; Froebrich, D.

    2014-10-01

    Determining star cluster distances is essential to analyse their properties and distribution in the Galaxy. In particular, it is desirable to have a reliable, purely photometric distance estimation method for large samples of newly discovered cluster candidates e.g. from the Two Micron All Sky Survey, the UK Infrared Deep Sky Survey Galactic Plane Survey and VVV. Here, we establish an automatic method to estimate distances and reddening from near-infrared photometry alone, without the use of isochrone fitting. We employ a decontamination procedure of JHK photometry to determine the density of stars foreground to clusters and a galactic model to estimate distances. We then calibrate the method using clusters with known properties. This allows us to establish distance estimates with better than 40 percent accuracy. We apply our method to determine the extinction and distance values to 378 known open clusters and 397 cluster candidates from the list of Froebrich, Scholz & Raftery (2007MNRAS.374..399F, Cat. J/MNRAS/374/399). We find that the sample is biased towards clusters of a distance of approximately 3 kpc, with typical distances between 2 and 6 kpc. Using the cluster distances and extinction values, we investigate how the average extinction per kiloparsec distance changes as a function of the Galactic longitude. We find a systematic dependence that can be approximated by AH(l) [mag/kpc] = 0.10 + 0.001 × |l - 180°|/° for regions more than 60° from the Galactic Centre. (1 data file).

  17. Urban hospital 'clusters' do shift high-risk procedures to key facilities, but more could be done.

    PubMed

    Luke, Roice D; Luke, Tyler; Muller, Nancy

    2011-09-01

    Since the 1990s, rapid consolidation in the hospital sector has resulted in the vast majority of hospitals joining systems that already had a considerable presence within their markets. We refer to these important local and regional systems as "clusters." To determine whether hospital clusters have taken measurable steps aimed at improving the quality of care-specifically, by concentrating low-volume, high-complexity services within selected "lead" facilities-this study examined within-cluster concentrations of high-risk cases for seven surgical procedures. We found that lead hospitals on average performed fairly high percentages of the procedures per cluster, ranging from 59 percent for esophagectomy to 87 percent for aortic valve replacement. The numbers indicate that hospitals might need to work with rival facilities outside their cluster to concentrate cases for the lowest-volume procedures, such as esophagectomies, whereas coordination among cluster members might be sufficient for higher-volume procedures. The results imply that policy makers should focus on clusters' potential for restructuring care and further coordinating services across hospitals in local areas.

  18. Numerical taxonomy and ecology of petroleum-degrading bacteria.

    PubMed Central

    Austin, B; Calomiris, J J; Walker, J D; Colwell, R R

    1977-01-01

    A total of 99 strains of petroleum-degrading bacteria isolated from Chesapeake Bay water and sediment were identified by using numerical taxonomy procedures. The isolates, together with 33 reference cultures, were examined for 48 biochemical, cultural, morphological, and physiological characters. The data were analyzed by computer, using both the simple matching and the Jaccard coefficients. Clustering was achieved by the unweighted average linkage method. From the sorted similarity matrix and dendrogram, 14 phenetic groups, comprising 85 of the petroleum-degrading bacteria, were defined at the 80 to 85% similarity level. These groups were identified as actinomycetes (mycelial forms, four clusters), coryneforms, Enterobacteriaceae, Klebsiella aerogenes, Micrococcus spp. (two clusters), Nocardia species (two clusters), Pseudomonas spp. (two clusters), and Sphaerotilus natans. It is concluded that the degradation of petroleum is accomplished by a diverse range of bacterial taxa, some of which were isolated only at given sampling stations and, more specifically, from sediment collected at a given station. PMID:889329
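
    The numerical-taxonomy workflow described here (binary characters, Jaccard coefficient, unweighted average linkage, cut at roughly the 80% similarity level) maps directly onto standard library calls. The Python sketch below uses random stand-in characters, so most "strains" end up as singletons; real character data would show the group structure.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(8)
    characters = rng.integers(0, 2, size=(99, 48)).astype(bool)   # strains x characters (toy)

    # Jaccard distance = 1 - Jaccard similarity coefficient.
    d = pdist(characters, metric="jaccard")

    # "Unweighted average linkage" corresponds to UPGMA ("average" in SciPy).
    Z = linkage(d, method="average")

    # Cut the tree at 80% similarity, i.e. a Jaccard distance of 0.20.
    groups = fcluster(Z, t=0.20, criterion="distance")
    print(np.bincount(groups)[1:])
    ```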

  19. Planck intermediate results. XLIII. Spectral energy distribution of dust in clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Planck Collaboration; Adam, R.; Ade, P. A. R.; Aghanim, N.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Battaner, E.; Benabed, K.; Benoit-Lévy, A.; Bersanelli, M.; Bielewicz, P.; Bikmaev, I.; Bonaldi, A.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Burenin, R.; Burigana, C.; Calabrese, E.; Cardoso, J.-F.; Catalano, A.; Chiang, H. C.; Christensen, P. R.; Churazov, E.; Colombo, L. P. L.; Combet, C.; Comis, B.; Couchot, F.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Elsner, F.; Enßlin, T. A.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Galeotta, S.; Ganga, K.; Génova-Santos, R. T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Harrison, D. L.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Hornstrup, A.; Hovest, W.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Keihänen, E.; Keskitalo, R.; Khamitov, I.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; López-Caniego, M.; Macías-Pérez, J. F.; Maffei, B.; Maggio, G.; Mandolesi, N.; Mangilli, A.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; Melchiorri, A.; Mennella, A.; Migliaccio, M.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Nørgaard-Nielsen, H. U.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Perdereau, O.; Perotto, L.; Pettorino, V.; Piacentini, F.; Piat, M.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Pratt, G. W.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Valenziano, L.; Valiviita, J.; Van Tent, F.; Vielva, P.; Villa, F.; Wade, L. A.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zonca, A.

    2016-12-01

    Although infrared (IR) overall dust emission from clusters of galaxies has been statistically detected using data from the Infrared Astronomical Satellite (IRAS), it has not been possible to sample the spectral energy distribution (SED) of this emission over its peak, and thus to break the degeneracy between dust temperature and mass. By complementing the IRAS spectral coverage with Planck satellite data from 100 to 857 GHz, we provide new constraints on the IR spectrum of thermal dust emission in clusters of galaxies. We achieve this by using a stacking approach for a sample of several hundred objects from the Planck cluster sample. This procedure averages out fluctuations from the IR sky, allowing us to reach a significant detection of the faint cluster contribution. We also use the large frequency range probed by Planck, together with component-separation techniques, to remove the contamination from both cosmic microwave background anisotropies and the thermal Sunyaev-Zeldovich effect (tSZ) signal, which dominate at ν ≤ 353 GHz. By excluding dominant spurious signals or systematic effects, averaged detections are reported at frequencies 353 GHz ≤ ν ≤ 5000 GHz. We confirm the presence of dust in clusters of galaxies at low and intermediate redshifts, yielding an SED with a shape similar to that of the Milky Way. Planck's resolution does not allow us to investigate the detailed spatial distribution of this emission (e.g. whether it comes from intergalactic dust or simply the dust content of the cluster galaxies), but the radial distribution of the emission appears to follow that of the stacked SZ signal, and thus the extent of the clusters. The recovered SED allows us to constrain the dust mass responsible for the signal and its temperature.
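
    The stacking idea can be illustrated with a toy example: average many cutouts of a noisy map at catalogue positions so that uncorrelated fluctuations average down while a faint common signal remains. The Python sketch below is purely illustrative and unrelated to the actual Planck pipeline; the map, positions, and signal level are invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(9)
    npix, half = 512, 8
    sky = rng.normal(0.0, 1.0, size=(npix, npix))              # fluctuating background map

    # Inject a faint common signal at 400 random "cluster" positions.
    ys = rng.integers(half, npix - half, 400)
    xs = rng.integers(half, npix - half, 400)
    for y, x in zip(ys, xs):
        sky[y - 2:y + 3, x - 2:x + 3] += 0.1                   # well below the noise level

    # Stack: average the cutouts centred on the catalogue positions.
    stack = np.mean([sky[y - half:y + half, x - half:x + half] for y, x in zip(ys, xs)],
                    axis=0)
    print(stack[half - 1:half + 2, half - 1:half + 2].mean())  # central signal emerges (~0.1)
    print(sky.std())                                           # single cutouts are noise-dominated
    ```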

  20. Planck intermediate results: XLIII. Spectral energy distribution of dust in clusters of galaxies

    DOE PAGES

    Adam, R.; Ade, P. A. R.; Aghanim, N.; ...

    2016-12-12

    Although infrared (IR) overall dust emission from clusters of galaxies has been statistically detected using data from the Infrared Astronomical Satellite (IRAS), it has not been possible to sample the spectral energy distribution (SED) of this emission over its peak, and thus to break the degeneracy between dust temperature and mass. By complementing the IRAS spectral coverage with Planck satellite data from 100 to 857 GHz, we provide in this paper new constraints on the IR spectrum of thermal dust emission in clusters of galaxies. We achieve this by using a stacking approach for a sample of several hundred objects from the Planck cluster sample. This procedure averages out fluctuations from the IR sky, allowing us to reach a significant detection of the faint cluster contribution. We also use the large frequency range probed by Planck, together with component-separation techniques, to remove the contamination from both cosmic microwave background anisotropies and the thermal Sunyaev-Zeldovich effect (tSZ) signal, which dominate at ν ≤ 353 GHz. By excluding dominant spurious signals or systematic effects, averaged detections are reported at frequencies 353 GHz ≤ ν ≤ 5000 GHz. We confirm the presence of dust in clusters of galaxies at low and intermediate redshifts, yielding an SED with a shape similar to that of the Milky Way. Planck’s resolution does not allow us to investigate the detailed spatial distribution of this emission (e.g. whether it comes from intergalactic dust or simply the dust content of the cluster galaxies), but the radial distribution of the emission appears to follow that of the stacked SZ signal, and thus the extent of the clusters. Finally, the recovered SED allows us to constrain the dust mass responsible for the signal and its temperature.

  1. Assessment of repeatability of composition of perfumed waters by high-performance liquid chromatography combined with numerical data analysis based on cluster analysis (HPLC UV/VIS - CA).

    PubMed

    Ruzik, L; Obarski, N; Papierz, A; Mojski, M

    2015-06-01

    High-performance liquid chromatography (HPLC) with UV/VIS spectrophotometric detection combined with the chemometric method of cluster analysis (CA) was used for the assessment of repeatability of composition of nine types of perfumed waters. In addition, the chromatographic method of separating components of the perfume waters under analysis was subjected to an optimization procedure. The chromatograms thus obtained were used as sources of data for the chemometric method of cluster analysis (CA). The result was a classification of a set comprising 39 perfumed water samples with a similar composition at a specified level of probability (level of agglomeration). A comparison of the classification with the manufacturer's declarations reveals a good degree of consistency and demonstrates similarity between samples in different classes. A combination of the chromatographic method with cluster analysis (HPLC UV/VIS - CA) makes it possible to quickly assess the repeatability of composition of perfumed waters at selected levels of probability. © 2014 Society of Cosmetic Scientists and the Société Française de Cosmétologie.

  2. The Minnesota Center for Twin and Family Research Genome-Wide Association Study

    PubMed Central

    Miller, Michael B.; Basu, Saonli; Cunningham, Julie; Eskin, Eleazar; Malone, Steven M.; Oetting, William S.; Schork, Nicholas; Sul, Jae Hoon; Iacono, William G.; Mcgue, Matt

    2012-01-01

    As part of the Genes, Environment and Development Initiative (GEDI), the Minnesota Center for Twin and Family Research (MCTFR) undertook a genome-wide association study (GWAS), which we describe here. A total of 8405 research participants, clustered in 4-member families, have been successfully genotyped on 527,829 single nucleotide polymorphism (SNP) markers using Illumina’s Human660W-Quad array. Quality control screening of samples and markers as well as SNP imputation procedures are described. We also describe methods for ancestry control and how the familial clustering of the MCTFR sample can be accounted for in the analysis using a Rapid Feasible Generalized Least Squares algorithm. The rich longitudinal MCTFR assessments provide numerous opportunities for collaboration. PMID:23363460

  3. Dark Energy Survey Year 1 results: cross-correlation redshifts - methods and systematics characterization

    NASA Astrophysics Data System (ADS)

    Gatti, M.; Vielzeuf, P.; Davis, C.; Cawthon, R.; Rau, M. M.; DeRose, J.; De Vicente, J.; Alarcon, A.; Rozo, E.; Gaztanaga, E.; Hoyle, B.; Miquel, R.; Bernstein, G. M.; Bonnett, C.; Carnero Rosell, A.; Castander, F. J.; Chang, C.; da Costa, L. N.; Gruen, D.; Gschwend, J.; Hartley, W. G.; Lin, H.; MacCrann, N.; Maia, M. A. G.; Ogando, R. L. C.; Roodman, A.; Sevilla-Noarbe, I.; Troxel, M. A.; Wechsler, R. H.; Asorey, J.; Davis, T. M.; Glazebrook, K.; Hinton, S. R.; Lewis, G.; Lidman, C.; Macaulay, E.; Möller, A.; O'Neill, C. R.; Sommer, N. E.; Uddin, S. A.; Yuan, F.; Zhang, B.; Abbott, T. M. C.; Allam, S.; Annis, J.; Bechtol, K.; Brooks, D.; Burke, D. L.; Carollo, D.; Carrasco Kind, M.; Carretero, J.; Cunha, C. E.; D'Andrea, C. B.; DePoy, D. L.; Desai, S.; Eifler, T. F.; Evrard, A. E.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gerdes, D. W.; Goldstein, D. A.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Hoormann, J. K.; Jain, B.; James, D. J.; Jarvis, M.; Jeltema, T.; Johnson, M. W. G.; Johnson, M. D.; Krause, E.; Kuehn, K.; Kuhlmann, S.; Kuropatkin, N.; Li, T. S.; Lima, M.; Marshall, J. L.; Melchior, P.; Menanteau, F.; Nichol, R. C.; Nord, B.; Plazas, A. A.; Reil, K.; Rykoff, E. S.; Sako, M.; Sanchez, E.; Scarpine, V.; Schubnell, M.; Sheldon, E.; Smith, M.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Tucker, B. E.; Tucker, D. L.; Vikram, V.; Walker, A. R.; Weller, J.; Wester, W.; Wolf, R. C.

    2018-06-01

    We use numerical simulations to characterize the performance of a clustering-based method to calibrate photometric redshift biases. In particular, we cross-correlate the weak lensing source galaxies from the Dark Energy Survey Year 1 sample with redMaGiC galaxies (luminous red galaxies with secure photometric redshifts) to estimate the redshift distribution of the former sample. The recovered redshift distributions are used to calibrate the photometric redshift bias of standard photo-z methods applied to the same source galaxy sample. We apply the method to two photo-z codes run in our simulated data: Bayesian Photometric Redshift and Directional Neighbourhood Fitting. We characterize the systematic uncertainties of our calibration procedure, and find that these systematic uncertainties dominate our error budget. The dominant systematics are due to our assumption of unevolving bias and clustering across each redshift bin, and to differences between the shapes of the redshift distributions derived by clustering versus photo-zs. The systematic uncertainty in the mean redshift bias of the source galaxy sample is Δz ≲ 0.02, though the precise value depends on the redshift bin under consideration. We discuss possible ways to mitigate the impact of our dominant systematics in future analyses.

  4. Clustering algorithm evaluation and the development of a replacement for procedure 1. [for crop inventories

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Johnson, J. K.

    1979-01-01

    An efficient procedure which clusters data using a completely unsupervised clustering algorithm and then uses labeled pixels to label the resulting clusters or perform a stratified estimate using the clusters as strata is developed. Three clustering algorithms, CLASSY, AMOEBA, and ISOCLS, are compared for efficiency. Three stratified estimation schemes and three labeling schemes are also considered and compared.
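
    The cluster-then-label scheme can be sketched with an off-the-shelf algorithm standing in for CLASSY, AMOEBA, or ISOCLS: cluster all pixels without labels, then assign each cluster the majority class among the labeled pixels that fall in it. The Python example below uses synthetic "pixels" and a toy ground truth.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(10)
    X = rng.normal(size=(5000, 4))                    # unlabeled multispectral pixels (toy)
    labeled_idx = rng.choice(len(X), size=200, replace=False)
    true_class = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy ground truth

    # Unsupervised step: cluster all pixels.
    clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)

    # Labeling step: each cluster takes the majority class of its labeled pixels.
    cluster_label = {}
    for c in np.unique(clusters):
        members = labeled_idx[clusters[labeled_idx] == c]
        cluster_label[c] = np.bincount(true_class[members]).argmax() if members.size else -1

    pred = np.array([cluster_label[c] for c in clusters])
    print("agreement with toy truth:", np.mean(pred == true_class))
    ```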

  5. Maladaptive Personality and Neuropsychological Features of Highly Relationally Aggressive Adolescent Girls

    ERIC Educational Resources Information Center

    Savage, Michael; DiBiase, Anne-Marie

    2016-01-01

    The maladaptive personality and neuropsychological features of highly relationally aggressive females were examined in a group of 30 grade 6, 7, and 8 girls and group-matched controls. Employing a multistage cluster sampling procedure, a group of highly, yet almost exclusively, relationally aggressive females were identified and matched on a…

  6. Linking Teacher Competences to Organizational Citizenship Behaviour: The Role of Empowerment

    ERIC Educational Resources Information Center

    Kasekende, Francis; Munene, John C.; Otengei, Samson Omuudu; Ntayi, Joseph Mpeera

    2016-01-01

    Purpose: The purpose of this paper is to examine relationship between teacher competences and organizational citizenship behavior (OCB) with empowerment as a mediating factor. Design/methodology/approach: The study took a cross-sectional descriptive and analytical design. Using cluster and random sampling procedures, data were obtained from 383…

  7. Eye-gaze determination of user intent at the computer interface

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldberg, J.H.; Schryver, J.C.

    1993-12-31

    Determination of user intent at the computer interface through eye-gaze monitoring can significantly aid applications for the disabled, as well as telerobotics and process control interfaces. Whereas current eye-gaze control applications are limited to object selection and x/y gazepoint tracking, a methodology was developed here to discriminate a more abstract interface operation: zooming-in or out. This methodology first collects samples of eye-gaze location, while looking at controlled stimuli, at 30 Hz, just prior to a user's decision to zoom. The sample is broken into data frames, or temporal snapshots. Within a data frame, all spatial samples are connected into a minimum spanning tree, then clustered, according to user-defined parameters. Each cluster is mapped to one in the prior data frame, and statistics are computed from each cluster. These characteristics include cluster size, position, and pupil size. A multiple discriminant analysis uses these statistics both within and between data frames to formulate optimal rules for assigning the observations into zoom-in, zoom-out, or no-zoom conditions. The statistical procedure effectively generates heuristics for future assignments, based upon these variables. Future work will enhance the accuracy and precision of the modeling technique, and will empirically test users in controlled experiments.
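
    The frame-wise clustering step (minimum spanning tree over gaze samples, long edges cut, connected components taken as clusters) can be sketched as follows in Python; the threshold and gaze coordinates are invented, and the discriminant-analysis stage is not reproduced.

    ```python
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
    from scipy.spatial.distance import pdist, squareform

    def mst_clusters(points, max_edge):
        """Cluster 2-D gaze samples by cutting MST edges longer than max_edge."""
        dist = squareform(pdist(points))                  # full pairwise distance matrix
        mst = minimum_spanning_tree(dist).toarray()
        mst[mst > max_edge] = 0.0                         # drop edges above the threshold
        n_comp, labels = connected_components(mst + mst.T, directed=False)
        return labels

    rng = np.random.default_rng(11)
    frame = np.vstack([rng.normal([100, 100], 5, (15, 2)),   # one fixation-like cluster
                       rng.normal([300, 220], 5, (15, 2))])  # another
    print(mst_clusters(frame, max_edge=40.0))
    ```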

  8. Competing risks regression for clustered data

    PubMed Central

    Zhou, Bingqing; Fine, Jason; Latouche, Aurelien; Labopin, Myriam

    2012-01-01

    A population average regression model is proposed to assess the marginal effects of covariates on the cumulative incidence function when there is dependence across individuals within a cluster in the competing risks setting. This method extends the Fine–Gray proportional hazards model for the subdistribution to situations, where individuals within a cluster may be correlated due to unobserved shared factors. Estimators of the regression parameters in the marginal model are developed under an independence working assumption where the correlation across individuals within a cluster is completely unspecified. The estimators are consistent and asymptotically normal, and variance estimation may be achieved without specifying the form of the dependence across individuals. A simulation study evidences that the inferential procedures perform well with realistic sample sizes. The practical utility of the methods is illustrated with data from the European Bone Marrow Transplant Registry. PMID:22045910

  9. A new method for mapping multidimensional data to lower dimensions

    NASA Technical Reports Server (NTRS)

    Gowda, K. C.

    1983-01-01

    A multispectral mapping method is proposed which is based on the new concept of BEND (Bidimensional Effective Normalised Difference). The method, which involves taking one sample point at a time and finding the interrelationships between its features, is found to be very economical in terms of storage and processing time. It has good dimensionality reduction and clustering properties, and is highly suitable for computer analysis of large amounts of data. The transformed values obtained by this procedure are suitable either for a planar 2-space mapping of geological sample points or for making grayscale and color images of geo-terrains. A few examples are given to demonstrate the efficacy of the proposed procedure.

  10. Quantum wavepacket ab initio molecular dynamics: an approach for computing dynamically averaged vibrational spectra including critical nuclear quantum effects.

    PubMed

    Sumner, Isaiah; Iyengar, Srinivasan S

    2007-10-18

    We have introduced a computational methodology to study vibrational spectroscopy in clusters inclusive of critical nuclear quantum effects. This approach is based on the recently developed quantum wavepacket ab initio molecular dynamics method that combines quantum wavepacket dynamics with ab initio molecular dynamics. The computational efficiency of the dynamical procedure is drastically improved (by several orders of magnitude) through the utilization of wavelet-based techniques combined with the previously introduced time-dependent deterministic sampling procedure to achieve stable, picosecond-length, quantum-classical dynamics of electrons and nuclei in clusters. The dynamical information is employed to construct a novel cumulative flux/velocity correlation function, where the wavepacket flux from the quantized particle is combined with classical nuclear velocities to obtain the vibrational density of states. The approach is demonstrated by computing the vibrational density of states of [Cl-H-Cl]-, inclusive of critical quantum nuclear effects, and our results are in good agreement with experiment. A general hierarchical procedure is also provided, based on electronic structure harmonic frequencies, classical ab initio molecular dynamics, computation of nuclear quantum-mechanical eigenstates, and employing quantum wavepacket ab initio dynamics to understand vibrational spectroscopy in hydrogen-bonded clusters that display large degrees of anharmonicity.

  11. Harmonic decomposition of magneto-optical signal from suspensions of superparamagnetic nanoparticles

    NASA Astrophysics Data System (ADS)

    Patterson, Cody; Syed, Maarij; Takemura, Yasushi

    2018-04-01

    Magnetic nanoparticles (MNPs) are widely used in biomedical applications. Characterizing dilute suspensions of superparamagnetic iron oxide nanoparticles (SPIONs) in bio-relevant media is particularly valuable for magnetic particle imaging, hyperthermia, drug delivery, etc. Here, we study dilute aqueous suspensions of single-domain magnetite nanoparticles using an AC Faraday rotation (FR) setup. The setup uses an oscillating magnetic field (800 Hz) which generates a multi-harmonic response. Each harmonic is collected and analyzed using the Fourier components of the theoretical signal determined by a Langevin-like magnetization. With this procedure, we determine the average magnetic moment per particle μ , particle number density n, and Verdet constant of the sample. The fitted values of μ and n are shown to be consistent across each harmonic. Additionally, we present the results of these parameters as n is varied. The large values of μ reveal the possibility of clustering as reported in other literature. This suggests that μ is representative of the average magnetic moment per cluster of nanoparticles. Multiple factors, including the external magnetic field, surfactant degradation, and laser absorption, can contribute to dynamic and long-term aggregation leading to FR signals that represent space- and time-averaged sample parameters. Using this powerful analysis procedure, future studies are aimed at determining the clustering mechanisms in this AC system and characterizing SPION suspensions at different frequencies and viscosities.
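
    The appearance of odd harmonics in a Langevin-like response to a sinusoidal drive can be reproduced numerically. The Python sketch below drives a Langevin magnetization at 800 Hz and reads harmonic amplitudes off an FFT; the field amplitude in units of mu*H0/(kB*T) is an arbitrary assumption.

    ```python
    import numpy as np

    def langevin(x):
        """Langevin function L(x) = coth(x) - 1/x, with a safe small-x limit."""
        out = np.empty_like(x)
        small = np.abs(x) < 1e-6
        out[small] = x[small] / 3.0
        xs = x[~small]
        out[~small] = 1.0 / np.tanh(xs) - 1.0 / xs
        return out

    f_drive = 800.0                                   # Hz, drive frequency as in the experiment
    fs = 200 * f_drive                                # sampling rate for the simulation
    t = np.arange(0, 0.1, 1.0 / fs)                   # 0.1 s = 80 full drive periods
    xi = 3.0 * np.sin(2 * np.pi * f_drive * t)        # xi = mu*H(t)/(kB*T); amplitude assumed

    m = langevin(xi)                                  # normalized Langevin-like magnetization
    spec = np.abs(np.fft.rfft(m)) / len(m)
    freqs = np.fft.rfftfreq(len(m), 1.0 / fs)

    for k in (1, 2, 3, 5):                            # even harmonics come out essentially zero
        idx = np.argmin(np.abs(freqs - k * f_drive))
        print(f"harmonic {k}: {spec[idx]:.4f}")
    ```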

  12. Stability and change in adolescent spirituality/religiosity: a person-centered approach.

    PubMed

    Good, Marie; Willoughby, Teena; Busseri, Michael A

    2011-03-01

    Although there has been a substantial increase over the past decade in studies that have examined the psychosocial correlates of spirituality/religiosity in adolescence, very little is known about spirituality/religiosity as a domain of development in its own right. To address this limitation, the authors identified configurations of multiple dimensions of spirituality/religiosity across 2 time points with an empirical classification procedure (cluster analysis) and assessed development in these configurations at the sample and individual level. Participants included 756 predominantly Canadian-born adolescents (53% female, 47% male) from southern Ontario, Canada, who completed a survey in Grade 11 (M age = 16.41 years) and Grade 12 (M age = 17.36 years). Measures included religious activity involvement, enjoyment of religious activities, the Spiritual Transcendence Index, wondering about spiritual issues, frequency of prayer, and frequency of meditation. Sample-level development (structural stability and change) was assessed by examining whether the structural configurations of the clusters were consistent over time. Individual-level development was assessed by examining intraindividual stability and change in cluster membership over time. Results revealed that a five-cluster solution was optimal at both grades. Clusters were identified as aspiritual/irreligious, disconnected wonderers, high institutional and personal, primarily personal, and meditators. With the exception of the high institutional and personal cluster, the cluster structures were stable over time. There also was significant intraindividual stability in all clusters over time; however, a significant proportion of individuals classified as high institutional and personal in Grade 11 moved into the primarily personal cluster in Grade 12. PsycINFO Database Record (c) 2011 APA, all rights reserved.

  13. Planck intermediate results. XLIII. Spectral energy distribution of dust in clusters of galaxies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adam, R.; Ade, P. A. R.; Aghanim, N.

    Although infrared (IR) overall dust emission from clusters of galaxies has been statistically detected using data from the Infrared Astronomical Satellite (IRAS), it has not been possible to sample the spectral energy distribution (SED) of this emission over its peak, and thus to break the degeneracy between dust temperature and mass. By complementing the IRAS spectral coverage with Planck satellite data from 100 to 857 GHz, we provide in this paper new constraints on the IR spectrum of thermal dust emission in clusters of galaxies. We achieve this by using a stacking approach for a sample of several hundred objects from the Planck cluster sample. This procedure averages out fluctuations from the IR sky, allowing us to reach a significant detection of the faint cluster contribution. We also use the large frequency range probed by Planck, together with component-separation techniques, to remove the contamination from both cosmic microwave background anisotropies and the thermal Sunyaev-Zeldovich effect (tSZ) signal, which dominate at ν ≤ 353 GHz. By excluding dominant spurious signals or systematic effects, averaged detections are reported at frequencies 353 GHz ≤ ν ≤ 5000 GHz. We confirm the presence of dust in clusters of galaxies at low and intermediate redshifts, yielding an SED with a shape similar to that of the Milky Way. Planck’s resolution does not allow us to investigate the detailed spatial distribution of this emission (e.g. whether it comes from intergalactic dust or simply the dust content of the cluster galaxies), but the radial distribution of the emission appears to follow that of the stacked SZ signal, and thus the extent of the clusters. Finally, the recovered SED allows us to constrain the dust mass responsible for the signal and its temperature.

  14. D2O clusters isolated in rare-gas solids: Dependence of infrared spectrum on concentration, deposition rate, heating temperature, and matrix material

    NASA Astrophysics Data System (ADS)

    Shimazaki, Yoichi; Arakawa, Ichiro; Yamakawa, Koichiro

    2018-04-01

    The infrared absorption spectra of D2O monomers and clusters isolated in rare-gas matrices were systematically reinvestigated under the control of the following factors: the D2O concentration, deposition rate, heating temperature, and rare-gas species. We clearly show that the cluster-size distribution depends not only on the D2O concentration but also on the deposition rate of a sample; as the rate increased, smaller clusters were preferentially formed. Upon heating at different temperatures, cluster-size growth was successfully observed. Since monomer diffusion was not sufficient to account for the changes in the column densities of the clusters, dimer diffusion was likely to contribute to the cluster growth. The frequencies of the bonded-OD stretches of (D2O)k with k = 2-6 were almost linearly correlated with the square root of the critical temperature of the matrix material. Additional absorption peaks of (D2O)2 and (D2O)3 in a Xe matrix were assigned to species trapped in tight accommodation sites.

  15. ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

    NASA Technical Reports Server (NTRS)

    Wharton, S. W.; Turner, B. J.

    1981-01-01

    An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.

  16. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap.

    PubMed

    Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E

    2016-06-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.
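
    A simplified sketch of the general idea, assuming only NumPy: a Bayesian-bootstrap Dirichlet draw is tilted by the case weights and used to resample the observed cases up to the population size, producing a synthetic population that could then be imputed as if it were a simple random sample. This illustrates the flavour of the approach, not the authors' exact two-stage weighted finite-population Bayesian bootstrap; the data, weights, and population size below are invented.

```python
import numpy as np

def weighted_fpbb_synthetic_population(sample, weights, pop_size, rng):
    """Simplified weighted Bayesian-bootstrap resampling: draw Dirichlet
    probabilities, tilt them by the design weights, and resample the observed
    cases with replacement up to the (approximate) population size."""
    sample = np.asarray(sample)
    w = np.asarray(weights, dtype=float)
    p = rng.dirichlet(np.ones(len(sample))) * w   # Bayesian bootstrap draw, weight-tilted
    p /= p.sum()
    idx = rng.choice(len(sample), size=pop_size, replace=True, p=p)
    return sample[idx]

rng = np.random.default_rng(1)
y = rng.normal(size=200)                    # observed outcomes (some may be missing in practice)
w = rng.uniform(1.0, 5.0, size=200)         # hypothetical case weights from a two-stage design
synthetic = weighted_fpbb_synthetic_population(y, w, pop_size=int(w.sum()), rng=rng)
# Multiple imputation would now be run on `synthetic` as if it were a simple random sample.
print(synthetic.shape, round(synthetic.mean(), 3))
```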

  18. Noise/spike detection in phonocardiogram signal as a cyclic random process with non-stationary period interval.

    PubMed

    Naseri, H; Homaeinezhad, M R; Pourkhajeh, H

    2013-09-01

    The major aim of this study is to describe a unified procedure for detecting noisy segments and spikes in transduced signals with a cyclic but non-stationary periodic nature. According to this procedure, the cycles of the signal (onset and offset locations) are detected. Then, the cycles are clustered into a finite number of groups based on appropriate geometrical- and frequency-based time series. Next, the median template of each time series of each cluster is calculated. Afterwards, a correlation-based technique is devised for making a comparison between a test cycle feature and the associated time series of each cluster. Finally, by applying a suitably chosen threshold for the calculated correlation values, a segment is prescribed to be either clean or noisy. A key merit of this research is that the procedure can provide decision support for choosing between accurate orthogonal-expansion-based filtering and removal of noisy segments. In this paper, the application of the proposed method is comprehensively described by applying it to phonocardiogram (PCG) signals for finding noisy cycles. The database consists of 126 records from several patients of a domestic research station, acquired by a 3M Littmann® 3200 electronic stethoscope with a 4 kHz sampling frequency. By implementing the noisy-segment detection algorithm with this database, a sensitivity of Se = 91.41% and a positive predictive value of PPV = 92.86% were obtained based on physicians' assessments.
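
    The template-correlation step described above can be sketched as follows, a minimal NumPy illustration under the assumption that the cycles have already been segmented, clustered, and resampled to a common length; the threshold value is arbitrary.

```python
import numpy as np

def flag_noisy_cycles(cycle_features, threshold=0.8):
    """Correlate every cycle's feature series with the cluster's median template
    and flag cycles whose correlation falls below the chosen threshold.
    `cycle_features` is an (n_cycles, n_samples) array of resampled, same-length
    feature series belonging to one cluster."""
    X = np.asarray(cycle_features, dtype=float)
    template = np.median(X, axis=0)                       # median template of the cluster
    Xc = X - X.mean(axis=1, keepdims=True)                # Pearson correlation vs. template
    tc = template - template.mean()
    corr = (Xc @ tc) / (np.linalg.norm(Xc, axis=1) * np.linalg.norm(tc) + 1e-12)
    return corr < threshold, corr                         # True = prescribed noisy

rng = np.random.default_rng(2)
cycles = np.sin(np.linspace(0.0, 2.0 * np.pi, 128)) + 0.05 * rng.normal(size=(40, 128))
cycles[5] += 2.0 * rng.normal(size=128)                   # inject one noisy/spiky cycle
noisy, corr = flag_noisy_cycles(cycles)
print(int(noisy.sum()), "cycle(s) flagged as noisy")
```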

  19. Chemometric study of Maya Blue from the voltammetry of microparticles approach.

    PubMed

    Doménech, Antonio; Doménech-Carbó, María Teresa; de Agredos Pascual, María Luisa Vazquez

    2007-04-01

    The use of the voltammetry of microparticles at paraffin-impregnated graphite electrodes allows for the characterization of different types of Maya Blue (MB) used in wall paintings from different archaeological sites of Campeche and Yucatán (Mexico). Using voltammetric signals for electron-transfer processes involving palygorskite-associated indigo and quinone functionalities generated by scratching the graphite surface, voltammograms provide information on the composition and texture of MB samples. Application of hierarchical cluster analysis and other chemometric methods allows us to characterize samples from different archaeological sites and to distinguish between samples from different chronological periods. Comparison between microscopic, spectroscopic, and electrochemical examination of genuine MB samples and synthetic specimens indicated that the preparation procedure of the pigment evolved over time via successive steps that anticipate modern synthetic procedures, namely, hybrid organic-inorganic synthesis, temperature control of chemical reactivity, and template-like synthesis.

  20. Cherry-picking functionally relevant substates from long MD trajectories using a stratified sampling approach.

    PubMed

    Chandramouli, Balasubramanian; Mancini, Giordano

    2016-01-01

    Classical Molecular Dynamics (MD) simulations can provide insights at the nanoscopic scale into protein dynamics. Currently, simulations of large proteins and complexes can be routinely carried out in the ns-μs time regime. Clustering of MD trajectories is often performed to identify selective conformations and to compare simulation and experimental data coming from different sources on closely related systems. However, clustering techniques are usually applied without a careful validation of results, and benchmark studies involving the application of different algorithms to MD data often deal with relatively small peptides instead of average-sized or large proteins; finally, clustering is often applied both as a means to analyze refined data and as a way to simplify further analysis of trajectories. Herein, we propose a strategy to classify MD data while carefully benchmarking the performance of clustering algorithms and internal validation criteria for such methods. We demonstrate the method on two showcase systems with different features, and compare the classification of trajectories in real and PCA space. We posit that the prototype procedure adopted here could be highly fruitful for clustering large trajectories of multiple systems, or those resulting from enhanced sampling techniques such as replica exchange simulations.
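
    A minimal sketch of the kind of benchmark loop described above, assuming scikit-learn and synthetic per-frame features standing in for a real trajectory: frames are projected into PCA space, clustered for several candidate cluster counts, and scored with an internal validation criterion (here the silhouette score).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy stand-in for per-frame features of an MD trajectory (e.g. backbone
# dihedrals or pairwise distances); real use would extract these from the trajectory.
rng = np.random.default_rng(3)
frames = np.vstack([rng.normal(m, 0.4, size=(300, 20)) for m in (0.0, 1.5, 3.0)])

# Classify frames in PCA space and score candidate cluster counts with an
# internal validation criterion, as a minimal benchmark loop.
pcs = PCA(n_components=3).fit_transform(frames)
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pcs)
    print(k, round(silhouette_score(pcs, labels), 3))
```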

  1. On the relative ages of galactic globular clusters. A new observable, a semi-empirical calibration and problems with the theoretical isochrones

    NASA Astrophysics Data System (ADS)

    Buonanno, R.; Corsi, C. E.; Pulone, L.; Fusi Pecci, F.; Bellazzini, M.

    1998-05-01

    A new procedure is described to derive homogeneous relative ages from the Color-Magnitude Diagrams (CMDs) of Galactic globular clusters (GGCs). It is based on the use of a new observable, Delta V(0.05), namely the difference in magnitude between an arbitrary point on the upper main sequence (V_{+0.05}, the V magnitude of the MS ridge line 0.05 mag redder than the Main Sequence (MS) Turn-Off (TO)) and the horizontal branch (HB). The observational error associated with Delta V(0.05) is substantially smaller than that of previous age indicators, keeping the property of being strictly independent of distance and reddening and of being based on theoretical luminosities rather than on still uncertain theoretical temperatures. As an additional bonus, the theoretical models show that Delta V(0.05) has a low dependence on metallicity. Moreover, the estimates of the relative age so obtained are also sufficiently invariant (to within ~ +/- 1 Gyr) with varying adopted models and transformations. Since the color difference Delta(B-V)_{TO,RGB} (VandenBerg, Bolte & Stetson 1990, VBS; Sarajedini & Demarque 1990, SD) remains the most reliable technique to estimate relative cluster ages for clusters where the horizontal part of the HB is not adequately populated, we have used the differential ages obtained via the "vertical" Delta V(0.05) parameter for a selected sample of clusters (with high-quality CMDs, well-populated HBs, trustworthy calibrations) to perform an empirical calibration of the "horizontal" observable in terms of [Fe/H] and age. A direct comparison with the corresponding calibration derived from the theoretical models reveals the existence of clear-cut discrepancies, which call into question the model scaling with metallicity in the observational planes. Starting from the global sample of considered clusters, we have thus evaluated, within a homogeneous procedure, relative ages for 33 GGCs having different metallicity, HB-morphologies, and galactocentric distances. These new estimates have also been compared with the most recent previous determinations (Chaboyer, Demarque & Sarajedini 1996; Richer et al. 1996). The distribution of the cluster ages with varying metallicity and galactocentric distance is briefly discussed: (a) there is no direct indication for any evident age-metallicity relationship; (b) there is some spread in age (still partially compatible with the errors), and the largest dispersion is found for intermediate metal-poor clusters; (c) older clusters populate both the inner and the outer regions of the Milky Way, while the younger globulars are present only in the outer regions, but the sample is far too poor to yield conclusive evidence.
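
    A numerical sketch of the "vertical" observable defined above, assuming NumPy and a hypothetical tabulated main-sequence ridge line: the turn-off is taken as the bluest ridge point, the ridge V magnitude is interpolated 0.05 mag redward of it, and the HB level is subtracted.

```python
import numpy as np

def delta_v_005(ridge_colour, ridge_v, v_hb):
    """Take the turn-off as the bluest point of the ridge line, interpolate the
    ridge V magnitude 0.05 mag redder than the turn-off colour, and difference
    it with the HB magnitude.  Inputs are hypothetical tabulated arrays."""
    ridge_colour = np.asarray(ridge_colour, dtype=float)
    ridge_v = np.asarray(ridge_v, dtype=float)
    to_index = ridge_colour.argmin()                      # turn-off = bluest ridge point
    ms = ridge_v > ridge_v[to_index]                      # main-sequence side (fainter than TO)
    order = np.argsort(ridge_colour[ms])
    v_005 = np.interp(ridge_colour[to_index] + 0.05,
                      ridge_colour[ms][order], ridge_v[ms][order])
    return v_005 - v_hb

# Hypothetical ridge line: the colour reddens on both sides of the turn-off
v = np.linspace(16.0, 22.0, 200)
colour = 0.4 + 0.02 * (v - 19.0) ** 2
print(round(delta_v_005(colour, v, v_hb=15.5), 2))
```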

  2. Numerical taxonomy and ecology of petroleum-degrading bacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Austin, B.; Calomiris, J.J.; Walker, J.D.

    1977-07-01

    A total of 99 strains of petroleum-degrading bacteria isolated from Chesapeake Bay water and sediment were identified by using numerical taxonomy procedures. The isolates, together with 33 reference cultures, were examined for 48 biochemical, cultural, morphological, and physiological characters. The data were analyzed by computer, using both the simple matching and the Jaccard coefficients. Clustering was achieved by the unweighted average linkage method. From the sorted similarity matrix and dendrogram, 14 phenetic groups, comprising 85 of the petroleum-degrading bacteria, were defined at the 80 to 85% similarity level. These groups were identified as actinomycetes (mycelial forms, four clusters), coryneforms, Enterobacteriaceae, Klebsiella aerogenes, Micrococcus spp. (two clusters), Nocardia species (two clusters), Pseudomonas spp. (two clusters), and Sphaerotilus natans. It is concluded that the degradation of petroleum is accomplished by a diverse range of bacterial taxa, some of which were isolated only at given sampling stations and, more specifically, from sediment collected at a given station.
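
    The numerical-taxonomy workflow above (binary characters, simple matching and Jaccard coefficients, unweighted average linkage) can be sketched roughly with SciPy as follows; the character matrix is a random placeholder and the 80% similarity cut is taken directly from the abstract.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(4)
# Hypothetical binary character matrix: strains x characters (True = positive test)
characters = rng.integers(0, 2, size=(30, 48)).astype(bool)

# Simple matching and Jaccard coefficients expressed as similarities
simple_matching = 1.0 - squareform(pdist(characters, metric="hamming"))
jaccard = 1.0 - squareform(pdist(characters, metric="jaccard"))
iu = np.triu_indices(len(characters), k=1)
print("mean similarities:", round(simple_matching[iu].mean(), 3), round(jaccard[iu].mean(), 3))

# Unweighted average linkage (UPGMA) on the Jaccard distances, cutting the
# dendrogram at 80% similarity (distance 0.20) to define phenetic groups
Z = linkage(pdist(characters, metric="jaccard"), method="average")
groups = fcluster(Z, t=0.20, criterion="distance")
print("group sizes:", sorted(np.bincount(groups)[1:], reverse=True))
```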

  3. The ASTRODEEP Frontier Fields catalogues. I. Multiwavelength photometry of Abell-2744 and MACS-J0416

    NASA Astrophysics Data System (ADS)

    Merlin, E.; Amorín, R.; Castellano, M.; Fontana, A.; Buitrago, F.; Dunlop, J. S.; Elbaz, D.; Boucaud, A.; Bourne, N.; Boutsia, K.; Brammer, G.; Bruce, V. A.; Capak, P.; Cappelluti, N.; Ciesla, L.; Comastri, A.; Cullen, F.; Derriere, S.; Faber, S. M.; Ferguson, H. C.; Giallongo, E.; Grazian, A.; Lotz, J.; Michałowski, M. J.; Paris, D.; Pentericci, L.; Pilo, S.; Santini, P.; Schreiber, C.; Shu, X.; Wang, T.

    2016-05-01

    Context. The Frontier Fields survey is a pioneering observational program aimed at collecting photometric data, both from space (Hubble Space Telescope and Spitzer Space Telescope) and from ground-based facilities (VLT Hawk-I), for six deep fields pointing at clusters of galaxies and six nearby deep parallel fields, in a wide range of passbands. The analysis of these data is a natural outcome of the Astrodeep project, an EU collaboration aimed at developing methods and tools for extragalactic photometry and creating valuable public photometric catalogues. Aims: We produce multiwavelength photometric catalogues (from B to 4.5 μm) for the first two of the Frontier Fields, Abell-2744 and MACS-J0416 (plus their parallel fields). Methods: To detect faint sources even in the central regions of the clusters, we develop a robust and repeatable procedure that uses the public codes Galapagos and Galfit to model and remove most of the light contribution from both the brightest cluster members and the intra-cluster light. We perform the detection on the processed HST H160 image to obtain a pure H-selected sample, which is the primary catalogue that we publish. We also add a sample of sources which are undetected in the H160 image but appear on a stacked infrared image. Photometry on the other HST bands is obtained using SExtractor, again on processed images after the procedure for foreground light removal. Photometry on the Hawk-I and IRAC bands is obtained using our PSF-matching deconfusion code t-phot. A similar procedure, but without the need for the foreground light removal, is adopted for the parallel fields. Results: The procedure of foreground light subtraction allows for the detection and the photometric measurements of ~2500 sources per field. We deliver and release complete photometric H-detected catalogues, with the addition of the complementary sample of infrared-detected sources. All objects have multiwavelength coverage including B to H HST bands, plus K-band from Hawk-I, and 3.6-4.5 μm from Spitzer. A full and detailed treatment of photometric errors is included. We perform basic sanity checks on the reliability of our results. Conclusions: The multiwavelength photometric catalogues are available publicly and are ready to be used for scientific purposes. Our procedure allows for the detection of outshone objects near the bright galaxies, which, coupled with the magnification effect of the clusters, can reveal extremely faint high-redshift sources. A full analysis of photometric redshifts is presented in Paper II. The catalogues, together with the final processed images for all HST bands (as well as some diagnostic data and images), are publicly available and can be downloaded from the Astrodeep website at http://www.astrodeep.eu/frontier-fields/ and from a dedicated CDS webpage (http://astrodeep.u-strasbg.fr/ff/index.html). The catalogues are also available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/590/A31

  4. Can cluster environment modify the dynamical evolution of spiral galaxies?

    NASA Technical Reports Server (NTRS)

    Amram, P.; Balkowski, C.; Cayatte, V.; Marcelin, M.; Sullivan, W. T., III

    1993-01-01

    Over the past decade many effects of the cluster environment on member galaxies have been established. These effects are manifest in the amount and distribution of gas in cluster spirals, the luminosity and light distributions within galaxies, and the segregation of morphological types. All these effects could indicate a specific dynamical evolution for galaxies in clusters. Nevertheless, more direct evidence, such as a different mass distribution for spiral galaxies in clusters and in the field, is not yet clearly established. Indeed, Rubin, Whitmore, and Ford (1988) and Whitmore, Forbes, and Rubin (1988) (referred to as RWF) presented evidence that inner cluster spirals have falling rotation curves, unlike those of outer cluster spirals or the great majority of field spirals. If falling rotation curves exist in the centers of clusters, as argued by RWF, it would suggest that dark matter halos were absent from cluster spirals, either because the halos had become stripped by interactions with other galaxies or with an intracluster medium, or because the halos had never formed in the first place. Even though they did not disagree with RWF, other researchers pointed out that the behaviour of the slope of the rotation curves of spiral galaxies (in Virgo) is not so clear. Amram, using a different sample of spiral galaxies in clusters, found only 10% declining rotation curves (2 declining vs. 17 flat or rising), in contrast to RWF, who found about 40% declining rotation curves in their sample (6 declining vs. 10 flat or rising). We hereafter briefly discuss the Amram data and compare them with the results of RWF. We have measured the rotation curves for a sample of 21 spiral galaxies in 5 nearby clusters. These rotation curves have been constructed from detailed two-dimensional maps of each galaxy's velocity field as traced by emission from the Hα line. This complete mapping, combined with the sensitivity of our CFHT 3.60 m + Perot-Fabry + CCD observations, allows the construction of high-quality rotation curves. Details concerning the acquisition and reduction procedures of the data are given in Amram. We present and discuss our preliminary analysis and compare it with RWF's results.

  5. Elderly patients attended in emergency health services in Brazil: a study for victims of falls and traffic accidents.

    PubMed

    de Freitas, Mariana Gonçalves; Bonolo, Palmira de Fátima; de Moraes, Edgar Nunes; Machado, Carla Jorge

    2015-03-01

    The article aims to describe the profile of elderly victims of falls and traffic accidents from the data of the Surveillance Survey of Violence and Accidents (VIVA). The VIVA Survey was conducted in the emergency health-services of the Unified Health System in the capitals of Brazil in 2011. The sample of elderly by type of accident was subjected to the two-step cluster procedure. Of the 2463 elderly persons in question, 79.8% suffered falls and 20.2% were the victims of traffic accidents. The 1812 elderly who fell were grouped together into 4 clusters: Cluster 1, in which all had disabilities; Cluster 2, all were non-white and falls took place in the home; Cluster 3, younger and active seniors; and Cluster 4, with a higher proportion of seniors 80 years old or above who were white. Among cases of traffic accidents, 446 seniors were grouped into two clusters: Cluster 1 of younger elderly, drivers or passengers; Cluster 2, with higher age seniors, mostly pedestrians. The main victims of falls were women with low schooling and unemployed; traffic accident victims were mostly younger and male. Complications were similar in victims of falls and traffic accidents. Clusters allow adoption of targeted measures of care, prevention and health promotion.

  6. Towards a comprehensive knowledge of the star cluster population in the Small Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Piatti, A. E.

    2018-07-01

    The Small Magellanic Cloud (SMC) has recently been found to harbour an increase of more than 200 per cent in its known cluster population. Here, we provide solid evidence that this unprecedented number of clusters could be greatly overestimated. On the one hand, the fully automatic procedure used to identify such an enormous cluster candidate sample did not recover ˜50 per cent, on average, of the known relatively bright clusters located in the SMC main body. On the other hand, the number of new cluster candidates per time unit as a function of time is noticeably different from the intrinsic SMC cluster frequency (CF), which should not be the case if these new detections were genuine physical systems. We found additionally that the SMC CF varies spatially, in such a way that it resembles an outside-in process coupled with the effects of a relatively recent interaction with the Large Magellanic Cloud. By assuming that clusters and field stars share the same formation history, we showed for the first time that the cluster dissolution rate also depends on position in the galaxy. The cluster dissolution becomes higher as the concentration of galaxy mass increases or if external tidal forces are present.

  7. Surface enhanced Raman spectroscopy (SERS) from a molecule adsorbed on a nanoscale silver particle cluster in a holographic plate

    NASA Astrophysics Data System (ADS)

    Jusinski, Leonard E.; Bahuguna, Ramen; Das, Amrita; Arya, Karamjeet

    2006-02-01

    Surface enhanced Raman spectroscopy has become a viable technique for the detection of single molecules. This highly sensitive technique is due to the very large (up to 14 orders of magnitude) enhancement in the Raman cross section when the molecule is adsorbed on a metal nanoparticle cluster. We report here SERS (Surface Enhanced Raman Spectroscopy) experiments performed by adsorbing analyte molecules on nanoscale silver particle clusters within the gelatin layer of commercially available holographic plates which have been developed and fixed. The Ag particles range in size from 5 to 30 nanometers (nm). Sample preparation was performed by immersing the prepared holographic plate in an analyte solution for a few minutes. We report here the production of SERS signals from Rhodamine 6G (R6G) molecules of nanomolar concentration. These measurements demonstrate a fast, low-cost, reproducible technique of producing SERS substrates in a matter of minutes compared to the conventional procedure of preparing Ag clusters from colloidal solutions. SERS-active colloidal solutions require up to a full day to prepare. In addition, the colloidal aggregates prepared are not consistent in shape, contain additional interfering chemicals, and do not generate consistent SERS enhancement. Colloidal solutions require the addition of KCl or NaCl to increase the ionic strength to allow aggregation and cluster formation. We find no need to add KCl or NaCl to create SERS-active clusters in the holographic gelatin matrix. These holographic plates, prepared using simple, conventional procedures, can be stored in an inert environment and preserve SERS activity for several weeks after preparation.

  8. Pain Behavior in Rheumatoid Arthritis Patients: Identification of Pain Behavior Subgroups

    PubMed Central

    Waters, Sandra J.; Riordan, Paul A.; Keefe, Francis J.; Lefebvre, John C.

    2008-01-01

    This study used Ward’s minimum variance hierarchical cluster analysis to identify homogeneous subgroups of rheumatoid arthritis patients suffering from chronic pain who exhibited similar pain behavior patterns during a videotaped behavior sample. Ninety-two rheumatoid arthritis patients were divided into two samples. Six motor pain behaviors were examined: guarding, bracing, active rubbing, rigidity, grimacing, and sighing. The cluster analysis procedure identified four similar subgroups in Sample 1 and Sample 2. The first subgroup exhibited low levels of all pain behaviors. The second subgroup exhibited a high level of guarding and low levels of other pain behaviors. The third subgroup exhibited high levels of guarding and rigidity and low levels of other pain behaviors. The fourth subgroup exhibited high levels of guarding and active rubbing and low levels of other pain behaviors. Sample 1 contained a fifth subgroup that exhibited a high level of active rubbing and low levels of other pain measures. The results of this study suggest that there are homogeneous subgroups within rheumatoid arthritis patient populations who differ in the motor pain behaviors they exhibit. PMID:18358682
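
    As a minimal illustration of the analysis above (not the authors' code; the per-patient behavior scores are synthetic placeholders), Ward's minimum-variance hierarchical clustering of the six motor pain behaviors and inspection of the subgroup profiles can be sketched with SciPy as follows.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

behaviors = ["guarding", "bracing", "active_rubbing", "rigidity", "grimacing", "sighing"]
rng = np.random.default_rng(5)
# Hypothetical per-patient frequencies of the six motor pain behaviors
scores = rng.gamma(shape=2.0, scale=1.0, size=(46, len(behaviors)))

# Ward's minimum-variance hierarchical clustering, cut into four subgroups
Z = linkage(scores, method="ward")
subgroup = fcluster(Z, t=4, criterion="maxclust")
for k in range(1, 5):
    profile = scores[subgroup == k].mean(axis=0).round(2)
    print(f"subgroup {k} (n={int((subgroup == k).sum())}):", dict(zip(behaviors, profile.tolist())))
```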

  9. Simultaneous contrast: evidence from licking microstructure and cross-solution comparisons.

    PubMed

    Dwyer, Dominic M; Lydall, Emma S; Hayward, Andrew J

    2011-04-01

    The microstructure of rats' licking responses was analyzed to investigate both "classic" simultaneous contrast (e.g., Flaherty & Largen, 1975) and a novel discrete-trial contrast procedure where access to an 8% test solution of sucrose was preceded by a sample of either 2%, 8%, or 32% sucrose (Experiments 1 and 2, respectively). Consumption of a given concentration of sucrose was higher when consumed alongside a low rather than high concentration comparison solution (positive contrast) and consumption of a given concentration of sucrose was lower when consumed alongside a high rather than a low concentration comparison solution (negative contrast). Furthermore, positive contrast increased the size of lick clusters while negative contrast decreased the size of lick clusters. Lick cluster size has a positive monotonic relationship with the concentration of palatable solutions and so positive and negative contrasts produced changes in lick cluster size that were analogous to raising or lowering the concentration of the test solution respectively. Experiment 3 utilized the discrete-trial procedure and compared contrast between two solutions of the same type (sucrose-sucrose or maltodextrin-maltodextrin) or contrast across solutions (sucrose-maltodextrin or maltodextrin-sucrose). Contrast effects on consumption were present, but reduced in size, in the cross-solution conditions. Moreover, lick cluster sizes were not affected at all by cross-solution contrasts as they were by same-solution contrasts. These results are consistent with the idea that simultaneous contrast effects depend, at least partially, on sensory mechanisms.

  10. Age and metallicity effects in single stellar populations: application to M 31 clusters.

    NASA Astrophysics Data System (ADS)

    de Freitas Pacheco, J. A.

    1997-03-01

    We have recently calculated (Borges et al. 1995, AJ, 110, 2408) integrated metallicity indices for single stellar populations (SSP). Effects of age, metallicity and abundances were taken into account. In particular, the explicit dependence of the indices Mg_2 and NaD respectively on the ratios [Mg/Fe] and [Na/Fe] was included in the calibration. We report in this work an application of those models to a sample of 12 globular clusters in M 31. A fitting procedure was used to obtain age, metallicity and the [Mg/Fe] ratio for each object, which best reproduce the data. The mean age of the sample is 15 +/- 2.8 Gyr and the mean [Mg/Fe] ratio is 0.35 +/- 0.10. These values and the derived metallicity spread are comparable to those found in galactic counterparts.

  11. The Mucciardi-Gose Clustering Algorithm and Its Applications in Automatic Pattern Recognition.

    DTIC Science & Technology

    A procedure known as the Mucciardi-Gose clustering algorithm, CLUSTR, for determining the geometrical or statistical relationships among groups of N... discussion of clustering algorithms is given; the particular advantages of the Mucciardi-Gose procedure are described. The mathematical basis for, and the

  12. Optimized Clustering Estimators for BAO Measurements Accounting for Significant Redshift Uncertainty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ross, Ashley J.; Banik, Nilanjan; Avila, Santiago

    2017-05-15

    We determine an optimized clustering statistic to be used for galaxy samples with significant redshift uncertainty, such as those that rely on photometric redshifts. To do so, we study the BAO information content as a function of the orientation of galaxy clustering modes with respect to their angle to the line-of-sight (LOS). The clustering along the LOS, as observed in a redshift-space with significant redshift uncertainty, has contributions from clustering modes with a range of orientations with respect to the true LOS. For redshift uncertainty σz ≥ 0.02(1+z), we find that while the BAO information is confined to transverse clustering modes in the true space, it is spread nearly evenly in the observed space. Thus, measuring clustering in terms of the projected separation (regardless of the LOS) is an efficient and nearly lossless compression of the signal for σz ≥ 0.02(1+z). For reduced redshift uncertainty, a more careful consideration is required. We then use more than 1700 realizations of galaxy simulations mimicking the Dark Energy Survey Year 1 sample to validate our analytic results and optimized analysis procedure. We find that using the correlation function binned in projected separation, we can achieve uncertainties that are within 10 per cent of those predicted by Fisher matrix forecasts. We predict that DES Y1 should achieve a 5 per cent distance measurement using our optimized methods. We expect the results presented here to be important for any future BAO measurements made using photometric redshift data.

  13. Optimized clustering estimators for BAO measurements accounting for significant redshift uncertainty

    NASA Astrophysics Data System (ADS)

    Ross, Ashley J.; Banik, Nilanjan; Avila, Santiago; Percival, Will J.; Dodelson, Scott; Garcia-Bellido, Juan; Crocce, Martin; Elvin-Poole, Jack; Giannantonio, Tommaso; Manera, Marc; Sevilla-Noarbe, Ignacio

    2017-12-01

    We determine an optimized clustering statistic to be used for galaxy samples with significant redshift uncertainty, such as those that rely on photometric redshifts. To do so, we study the baryon acoustic oscillation (BAO) information content as a function of the orientation of galaxy clustering modes with respect to their angle to the line of sight (LOS). The clustering along the LOS, as observed in a redshift-space with significant redshift uncertainty, has contributions from clustering modes with a range of orientations with respect to the true LOS. For redshift uncertainty σz ≥ 0.02(1 + z), we find that while the BAO information is confined to transverse clustering modes in the true space, it is spread nearly evenly in the observed space. Thus, measuring clustering in terms of the projected separation (regardless of the LOS) is an efficient and nearly lossless compression of the signal for σz ≥ 0.02(1 + z). For reduced redshift uncertainty, a more careful consideration is required. We then use more than 1700 realizations (combining two separate sets) of galaxy simulations mimicking the Dark Energy Survey Year 1 (DES Y1) sample to validate our analytic results and optimized analysis procedure. We find that using the correlation function binned in projected separation, we can achieve uncertainties that are within 10 per cent of those predicted by Fisher matrix forecasts. We predict that DES Y1 should achieve a 5 per cent distance measurement using our optimized methods. We expect the results presented here to be important for any future BAO measurements made using photometric redshift data.
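
    The compression discussed above, binning pair separations by their projected (transverse) component regardless of the line of sight, can be sketched with a brute-force pair count on a toy catalogue; NumPy only, and the positions, box size, and bin edges are arbitrary placeholders.

```python
import numpy as np

def projected_separation_counts(positions, bins, los_axis=2):
    """Brute-force pair counts binned by projected separation: each pair's
    separation is split into line-of-sight and transverse parts (the z axis is
    taken as the LOS here) and only the transverse part is binned."""
    pos = np.asarray(positions, dtype=float)
    diff = pos[:, None, :] - pos[None, :, :]
    perp_axes = [a for a in range(pos.shape[1]) if a != los_axis]
    s_perp = np.sqrt((diff[..., perp_axes] ** 2).sum(axis=-1))
    iu = np.triu_indices(len(pos), k=1)                   # count each unique pair once
    counts, _ = np.histogram(s_perp[iu], bins=bins)
    return counts

rng = np.random.default_rng(6)
galaxies = rng.uniform(0.0, 200.0, size=(500, 3))          # toy comoving positions
edges = np.linspace(0.0, 150.0, 16)
print(projected_separation_counts(galaxies, edges))
```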

  14. Fast clustering algorithm for large ECG data sets based on CS theory in combination with PCA and K-NN methods.

    PubMed

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2014-01-01

    Long-term recording of electrocardiogram (ECG) signals plays an important role in health care systems for the diagnosis and treatment of heart diseases. Clustering and classification of the collected data are essential for detecting concealed information in the P-QRS-T waves of long-term ECG recordings. Currently used algorithms have their share of drawbacks: 1) clustering and classification cannot be done in real time; and 2) they suffer from high energy consumption and sampling load. These drawbacks motivated us to develop a novel optimized clustering algorithm that can easily scan large ECG datasets to enable low-power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by classification of the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are then applied within the proposed algorithm. We show that our algorithm based on PCA features in combination with the K-NN classifier performs better than the other methods, outperforming existing algorithms by increasing classification accuracy by 11%. In addition, the proposed algorithm achieves classification accuracies for the K-NN and PNN classifiers and a Receiver Operating Characteristic (ROC) area of 99.98%, 99.83%, and 99.75%, respectively.
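
    A rough sketch of the pipeline described above, assuming scikit-learn and synthetic waveforms standing in for segmented ECG beats: a Gaussian random projection plays the role of the CS-style random sampling, PCA supplies the reduced features, K-means clusters them, and a K-NN classifier is scored on a held-out split. Library choices and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
# Hypothetical stand-in for segmented ECG beats: two noisy waveform classes
t = np.linspace(0.0, 1.0, 256)
beats = np.vstack([np.sin(2 * np.pi * f * t) + 0.1 * rng.normal(size=(200, t.size))
                   for f in (3, 5)])
labels = np.repeat([0, 1], 200)

compressed = GaussianRandomProjection(n_components=64, random_state=0).fit_transform(beats)
features = PCA(n_components=10).fit_transform(compressed)       # PCA features
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("K-NN accuracy on PCA features:", round(knn.score(X_te, y_te), 3))
print("cluster sizes:", np.bincount(clusters))
```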

  15. Washington photometry of 14 intermediate-age to old star clusters in the Small Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Piatti, Andrés E.; Clariá, Juan J.; Bica, Eduardo; Geisler, Doug; Ahumada, Andrea V.; Girardi, Léo

    2011-10-01

    We present CCD photometry in the Washington system C, T1 and T2 passbands down to T1 ~ 23 in the fields of L3, L28, HW 66, L100, HW 79, IC 1708, L106, L108, L109, NGC 643, L112, HW 84, HW 85 and HW 86, 14 Small Magellanic Cloud (SMC) clusters, most of them poorly studied objects. We measured T1 magnitudes and C-T1 and T1-T2 colours for a total of 213 516 stars spread throughout cluster areas of 14.7 × 14.7 arcmin^2 each. We carried out an in-depth analysis of the field star contamination of the colour-magnitude diagrams (CMDs) and statistically cleaned the cluster CMDs. Based on the best fits of isochrones computed by the Padova group to the (T1, C-T1) CMDs, as well as from the δ(T1) index and the standard giant branch procedure, we derived ages and metallicities for the cluster sample. With the exception of IC 1708, a relatively metal-poor Hyades-age cluster, the remaining 13 objects are between intermediate and old age (from 1.0 to 6.3 Gyr), their [Fe/H] values ranging from -1.4 to -0.7 dex. By combining these results with others available in the literature, we compiled a sample of 43 well-known SMC clusters older than 1 Gyr, with which we produced a revised age distribution. We found that the present clusters' age distribution reveals two primary excesses of clusters at t ~ 2 and 5 Gyr, which engraves the SMC with clear signs of enhanced formation episodes at both ages. In addition, we found that from the birth of the SMC cluster system until approximately the first 4 Gyr of its lifetime, the cluster formation resembles that of a constant formation rate scenario.

  16. WINGS: A WIde-field Nearby Galaxy-cluster Survey. II. Deep optical photometry of 77 nearby clusters

    NASA Astrophysics Data System (ADS)

    Varela, J.; D'Onofrio, M.; Marmo, C.; Fasano, G.; Bettoni, D.; Cava, A.; Couch, W. J.; Dressler, A.; Kjærgaard, P.; Moles, M.; Pignatelli, E.; Poggianti, B. M.; Valentinuzzi, T.

    2009-04-01

    Context: This is the second paper of a series devoted to the WIde Field Nearby Galaxy-cluster Survey (WINGS). WINGS is a long term project which is gathering wide-field, multi-band imaging and spectroscopy of galaxies in a complete sample of 77 X-ray selected, nearby clusters (0.04 < z < 0.07) located far from the galactic plane (|b|≥ 20°). The main goal of this project is to establish a local reference for evolutionary studies of galaxies and galaxy clusters. Aims: This paper presents the optical (B,V) photometric catalogs of the WINGS sample and describes the procedures followed to construct them. We have taken special care to correctly treat the large extended galaxies (which include the brightest cluster galaxies) and to reduce the influence of the bright halos of very bright stars. Methods: We have constructed photometric catalogs based on wide-field images in B and V bands using SExtractor. Photometry has been performed on images in which large galaxies and halos of bright stars were removed after modeling them with elliptical isophotes. Results: We publish deep optical photometric catalogs (90% complete at V ~ 21.7, which translates to ~M^*_V + 6 at the mean redshift), giving positions, geometrical parameters, and several total and aperture magnitudes for all the objects detected. For each field we have produced three catalogs containing galaxies, stars and objects of “unknown” classification (~6%). From simulations we found that the uncertainty of our photometry is quite dependent on the light profile of the objects, with stars having the most robust photometry and de Vaucouleurs profiles showing higher uncertainties and also an additional bias of ~ -0.2 mag. The star/galaxy classification of the bright objects (V < 20) was checked visually, making the fraction of misclassified objects negligible. For fainter objects, we found that simulations do not provide reliable estimates of the possible misclassification and therefore we have compared our data with that from deep counts of galaxies and star counts from models of our Galaxy. Both sets turned out to be consistent with our data within ~5% (in the ratio galaxies/total) up to V ~ 24. Finally, we remark that the application of our special procedure to remove large halos improves the photometry of the large galaxies in our sample with respect to the use of blind automatic procedures and increases (~16%) the detection rate of objects projected onto them. Based on observations taken at the Isaac Newton Telescope (2.5 m INT) sited at Roque de los Muchachos (La Palma, Spain), and the MPG/ESO-2.2 m Telescope sited at La Silla (Chile). Appendices are only available in electronic form at http://www.aanda.org. The catalog is only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/497/667

  17. Finding Groups Using Model-Based Cluster Analysis: Heterogeneous Emotional Self-Regulatory Processes and Heavy Alcohol Use Risk

    ERIC Educational Resources Information Center

    Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2008-01-01

    Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
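
    A minimal sketch of model-based clustering in this spirit, using scikit-learn's Gaussian mixtures and BIC for model comparison rather than whatever software the study itself used; the two-variable data and the candidate model grid are illustrative placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(8)
# Synthetic two-variable data with three latent subgroups
X = np.vstack([rng.multivariate_normal(m, np.diag([0.3, 0.5]), size=100)
               for m in ([0, 0], [3, 1], [1, 4])])

# Fit finite mixtures of multivariate normals and compare models with BIC
best = None
for k in range(1, 7):
    for cov in ("full", "diag"):
        gm = GaussianMixture(n_components=k, covariance_type=cov, random_state=0).fit(X)
        bic = gm.bic(X)
        if best is None or bic < best[0]:
            best = (bic, k, cov, gm)

bic, k, cov, model = best
print(f"selected {k} components with '{cov}' covariances (BIC = {bic:.1f})")
print("cluster sizes:", np.bincount(model.predict(X)))
```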

  18. Formation of vacancy clusters and cavities in He-implanted silicon studied by slow-positron annihilation spectroscopy

    NASA Astrophysics Data System (ADS)

    Brusa, Roberto S.; Karwasz, Grzegorz P.; Tiengo, Nadia; Zecca, Antonio; Corni, Federico; Tonini, Rita; Ottaviani, Gianpiero

    2000-04-01

    The depth profile of open volume defects has been measured in Si implanted with He at an energy of 20 keV, by means of a slow-positron beam and the Doppler broadening technique. The evolution of defect distributions has been studied as a function of isochronal annealing in two series of samples implanted at fluences of 5×10^15 and 2×10^16 He cm^-2. A fitting procedure has been applied to the experimental data to extract a positron parameter characterizing each open volume defect. The defects have been identified by comparing this parameter with recent theoretical calculations. In as-implanted samples the major part of the vacancies and divacancies produced by implantation is passivated by the presence of He. The mean depth of defects as seen by the positron annihilation technique is about five times less than the helium projected range. During the successive isochronal annealing the number of positron traps decreases, then increases and finally, at the highest annealing temperatures, disappears, but only in the samples implanted at the lowest fluence. A minimum of open volume defects is reached at the annealing temperature of 250 °C in both series. The increase of open volume defects at temperatures higher than 250 °C is due to the appearance of vacancy clusters of increasing size, with a mean depth distribution that moves towards the He projected range. The appearance of vacancy clusters is strictly related to the out-diffusion of He. In the samples implanted at 5×10^15 cm^-2 the vacancy clusters are mainly four-vacancy agglomerates stabilized by He-related defects. They disappear starting from an annealing temperature of 700 °C. In the samples implanted at 2×10^16 cm^-2 and annealed at 850-900 °C the vacancy clusters disappear and only a distribution of cavities centered around the He projected range remains. The role of vacancies in the formation of He clusters, which evolve into bubbles and then into cavities, is discussed.

  19. Coping profiles, perceived stress and health-related behaviors: a cluster analysis approach.

    PubMed

    Doron, Julie; Trouillet, Raphael; Maneveau, Anaïs; Ninot, Grégory; Neveu, Dorine

    2015-03-01

    Using a cluster analytic procedure, this study aimed (i) to determine whether people could be differentiated on the basis of coping profiles (or unique combinations of coping strategies); and (ii) to examine the relationships between these profiles and perceived stress and health-related behaviors. A sample of 578 French students (345 females, 233 males; M_age = 21.78, SD_age = 2.21) completed the Perceived Stress Scale-14 (Bruchon-Schweitzer, 2002), the Brief COPE (Muller and Spitz, 2003) and a series of items measuring health-related behaviors. A two-phase cluster analytic procedure (i.e. hierarchical followed by non-hierarchical k-means clustering) was employed to derive clusters of coping strategy profiles. The results yielded four distinctive coping profiles: High Copers, Adaptive Copers, Avoidant Copers and Low Copers. The results showed that clusters differed significantly in perceived stress and health-related behaviors. High Copers and Avoidant Copers displayed higher levels of perceived stress and engaged more in unhealthy behavior, compared with Adaptive Copers and Low Copers, who reported lower levels of stress and engaged more in healthy behaviors. These findings suggested that individuals' relative reliance on some strategies and de-emphasis on others may be a more advantageous way of understanding the manner in which individuals cope with stress. Therefore, a cluster analysis approach may provide an advantage over more traditional statistical techniques by identifying distinct coping profiles that might best benefit from interventions. Future research should consider coping profiles to provide a deeper understanding of the relationships between coping strategies and health outcomes and to identify risk groups.
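
    The two-phase procedure mentioned above can be sketched roughly as follows (a schematic illustration, not the authors' analysis; the coping scores are random placeholders and the cluster count is fixed at four only to mirror the abstract): Ward hierarchical clustering fixes the number of clusters and supplies starting centroids, and k-means then refines the partition.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
# Hypothetical standardized coping-strategy scores (respondents x strategies)
coping = rng.normal(size=(578, 14))

# Phase 1: Ward hierarchical clustering fixes k and provides starting centroids
k = 4
Z = linkage(coping, method="ward")
initial = fcluster(Z, t=k, criterion="maxclust")
seeds = np.vstack([coping[initial == g].mean(axis=0) for g in range(1, k + 1)])

# Phase 2: non-hierarchical k-means refinement started from the hierarchical seeds
profiles = KMeans(n_clusters=k, init=seeds, n_init=1).fit_predict(coping)
print("profile sizes:", np.bincount(profiles))
```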

  20. Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

    PubMed

    Lukashin, A V; Fuchs, R

    2001-05-01

    Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm is guaranteed to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search for the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. The employment of this statistically rigorous test has shown that our algorithm places more than 90% of genes into the correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during the yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
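
    A bare-bones sketch of simulated-annealing clustering in this spirit (not the authors' algorithm, and without their scheme for choosing the number of clusters; the expression profiles, cooling schedule, and step count are arbitrary assumptions): single-gene reassignments are accepted with the Metropolis criterion on the change in within-cluster sum of squares.

```python
import numpy as np

def anneal_clusters(profiles, n_clusters, n_steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Minimal simulated-annealing clustering: propose moving one gene to a
    random cluster and accept the move with the Metropolis criterion on the
    change in total within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    X = np.asarray(profiles, dtype=float)
    labels = rng.integers(n_clusters, size=len(X))

    def cost(lab):
        return sum(((X[lab == k] - X[lab == k].mean(axis=0)) ** 2).sum()
                   for k in range(n_clusters) if np.any(lab == k))

    current, temp = cost(labels), t0
    for _ in range(n_steps):
        i, new_k = rng.integers(len(X)), rng.integers(n_clusters)
        if new_k == labels[i]:
            continue
        trial = labels.copy()
        trial[i] = new_k
        delta = cost(trial) - current
        if delta < 0 or rng.random() < np.exp(-delta / temp):
            labels, current = trial, current + delta
        temp *= cooling
    return labels, current

rng = np.random.default_rng(10)
genes = np.vstack([rng.normal(m, 0.3, size=(60, 8)) for m in (-1.0, 0.0, 1.0)])
labels, wss = anneal_clusters(genes, n_clusters=3)
print("cluster sizes:", np.bincount(labels), "final WSS:", round(wss, 2))
```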

  1. Dark Energy Survey Year 1 Results: Cross-Correlation Redshifts - Methods and Systematics Characterization

    DOE PAGES

    Gatti, M.

    2018-02-22

    We use numerical simulations to characterize the performance of a clustering-based method to calibrate photometric redshift biases. In particular, we cross-correlate the weak lensing (WL) source galaxies from the Dark Energy Survey Year 1 (DES Y1) sample with redMaGiC galaxies (luminous red galaxies with secure photometric redshifts) to estimate the redshift distribution of the former sample. The recovered redshift distributions are used to calibrate the photometric redshift bias of standard photo-z methods applied to the same source galaxy sample. We also apply the method to three photo-z codes run in our simulated data: Bayesian Photometric Redshift (BPZ), Directional Neighborhood Fitting (DNF), and Random Forest-based photo-z (RF). We characterize the systematic uncertainties of our calibration procedure, and find that these systematic uncertainties dominate our error budget. The dominant systematics are due to our assumption of unevolving bias and clustering across each redshift bin, and to differences between the shapes of the redshift distributions derived by clustering vs photo-z's. The systematic uncertainty in the mean redshift bias of the source galaxy sample is z ≲ 0.02, though the precise value depends on the redshift bin under consideration. Here, we discuss possible ways to mitigate the impact of our dominant systematics in future analyses.

  3. Energy spectra of X-ray clusters of galaxies

    NASA Technical Reports Server (NTRS)

    Avni, Y.

    1976-01-01

    A procedure for estimating the ranges of parameters that describe the spectra of X-rays from clusters of galaxies is presented. The applicability of the method is proved by statistical simulations of cluster spectra; such a proof is necessary because of the nonlinearity of the spectral functions. Implications for the spectra of the Perseus, Coma, and Virgo clusters are discussed. The procedure can be applied in more general problems of parameter estimation.

  4. Behavior of optical properties of coagulated blood sample at 633 nm wavelength

    NASA Astrophysics Data System (ADS)

    Morales Cruzado, Beatriz; Vázquez y Montiel, Sergio; Delgado Atencio, José Alberto

    2011-03-01

    Determination of tissue optical parameters is fundamental for the application of light in either diagnostic or therapeutic procedures. However, in samples of biological tissue in vitro, the optical properties are modified by cellular death or cellular agglomeration that cannot be avoided. These phenomena change the propagation of light within the biological sample. Optical properties of human blood tissue were investigated in vitro at 633 nm using an optical setup that includes a double integrating sphere system. We measure the diffuse transmittance and diffuse reflectance of the blood sample and compare these physical properties with those obtained by Monte Carlo Multi-Layered (MCML) simulations. The extraction of the optical parameters (absorption coefficient μa, scattering coefficient μs and anisotropic factor g) from the measurements was carried out using a Genetic Algorithm, in which the search procedure is based on the evolution of a population through selection of the best individuals, evaluated by a function that compares the diffuse transmittance and diffuse reflectance of those individuals with the experimental ones. The algorithm converges rapidly to the best individual, extracting the optical parameters of the sample. We compare our results with those obtained by using other retrieval procedures. We found that the scattering coefficient and the anisotropic factor change dramatically due to the formation of clusters.
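
    A toy sketch of such a genetic-algorithm retrieval, assuming NumPy only: the forward model below is a deliberately crude analytic placeholder standing in for a Monte Carlo photon-transport simulation such as MCML, and the population size, bounds, and mutation scales are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(11)

def forward_model(params):
    """Crude analytic placeholder for a photon-transport simulation (e.g. MCML):
    maps (mu_a, mu_s, g) to a (diffuse reflectance, diffuse transmittance) pair.
    A real retrieval would run the Monte Carlo forward model here instead."""
    mu_a, mu_s, g = params
    mu_s_reduced = mu_s * (1.0 - g)
    albedo = mu_s_reduced / (mu_a + mu_s_reduced)
    return np.array([0.5 * albedo, np.exp(-(mu_a + mu_s_reduced) * 0.1)])

measured = forward_model(np.array([0.5, 60.0, 0.95]))      # pretend "experimental" data

def fitness(population):
    # Sum of squared differences between modelled and measured quantities
    return np.array([np.sum((forward_model(p) - measured) ** 2) for p in population])

bounds = np.array([[0.01, 2.0], [10.0, 200.0], [0.5, 0.99]])   # (mu_a, mu_s, g) search box
pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(60, 3))

for generation in range(80):
    f = fitness(pop)
    # Tournament selection: keep the better of two randomly drawn individuals
    a, b = rng.integers(60, size=60), rng.integers(60, size=60)
    parents = pop[np.where(f[a] < f[b], a, b)]
    # Blend crossover with a random partner, then Gaussian mutation within bounds
    partners = parents[rng.permutation(60)]
    alpha = rng.random((60, 1))
    children = alpha * parents + (1.0 - alpha) * partners
    children += rng.normal(scale=0.02 * (bounds[:, 1] - bounds[:, 0]), size=children.shape)
    children = np.clip(children, bounds[:, 0], bounds[:, 1])
    children[0] = pop[f.argmin()]                          # elitism: keep the best so far
    pop = children

best = pop[fitness(pop).argmin()]
print("best-fit (mu_a, mu_s, g):", best.round(3))
```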

  5. Markov Chain Monte Carlo Joint Analysis of Chandra X-Ray Imaging Spectroscopy and Sunyaev-Zel'dovich Effect Data

    NASA Technical Reports Server (NTRS)

    Bonamente, Massimillano; Joy, Marshall K.; Carlstrom, John E.; Reese, Erik D.; LaRoque, Samuel J.

    2004-01-01

    X-ray and Sunyaev-Zel'dovich effect data can be combined to determine the distance to galaxy clusters. High-resolution X-ray data are now available from Chandra, which provides both spatial and spectral information, and Sunyaev-Zel'dovich effect data were obtained from the BIMA and Owens Valley Radio Observatory (OVRO) arrays. We introduce a Markov Chain Monte Carlo procedure for the joint analysis of X-ray and Sunyaev-Zel'dovich effect data. The advantages of this method are the high computational efficiency and the ability to measure simultaneously the probability distribution of all parameters of interest, such as the spatial and spectral properties of the cluster gas, and also of derived quantities such as the distance to the cluster. We demonstrate this technique by applying it to the Chandra X-ray data and the OVRO radio data for the galaxy cluster A611. Comparisons with traditional likelihood ratio methods reveal the robustness of the method. This method will be used in a follow-up paper to determine the distances to a large sample of galaxy clusters.

  6. Determination of Cluster Distances from Chandra Imaging Spectroscopy and Sunyaev-Zeldovich Effect Measurements. I; Analysis Methods and Initial Results

    NASA Technical Reports Server (NTRS)

    Bonamente, Massimiliano; Joy, Marshall K.; Carlstrom, John E.; LaRoque, Samuel J.

    2004-01-01

    X-ray and Sunyaev-Zeldovich Effect data can be combined to determine the distance to galaxy clusters. High-resolution X-ray data are now available from the Chandra Observatory, which provides both spatial and spectral information, and interferometric radio measurements of the Sunyaev-Zeldovich Effect are available from the BIMA and OVRO arrays. We introduce a Monte Carlo Markov chain procedure for the joint analysis of X-ray and Sunyaev-Zeldovich Effect data. The advantages of this method are the high computational efficiency and the ability to measure the full probability distribution of all parameters of interest, such as the spatial and spectral properties of the cluster gas and the cluster distance. We apply this technique to the Chandra X-ray data and the OVRO radio data for the galaxy cluster Abell 611. Comparisons with traditional likelihood-ratio methods reveal the robustness of the method. This method will be used in a follow-up paper to determine the distance of a large sample of galaxy clusters for which high-resolution Chandra X-ray and BIMA/OVRO radio data are available.
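
    A generic Metropolis-Hastings sketch of the joint-analysis idea (NumPy only; the two scalar "measurements", the crude model scalings, the parameter names n0 and rc, and the proposal width are all invented stand-ins, not the real X-ray or SZ likelihoods): two datasets constrain the same parameters through a joint posterior, and any derived quantity inherits its full probability distribution from the same chain.

```python
import numpy as np

rng = np.random.default_rng(12)

# Two scalar "datasets" standing in for the X-ray and SZ measurements
xray_obs, xray_err = 2.0, 0.1          # e.g. an X-ray surface-brightness normalization
sz_obs, sz_err = 1.2, 0.08             # e.g. a central Compton-y value

def log_posterior(theta):
    n0, rc = theta                     # toy cluster parameters (density, core radius)
    if n0 <= 0.0 or rc <= 0.0:
        return -np.inf                 # flat positive prior
    xray_model = n0 ** 2 * rc          # crude illustrative scalings only
    sz_model = n0 * rc
    return (-0.5 * ((xray_model - xray_obs) / xray_err) ** 2
            - 0.5 * ((sz_model - sz_obs) / sz_err) ** 2)

# Random-walk Metropolis-Hastings over the joint parameter space
theta = np.array([1.0, 1.0])
logp = log_posterior(theta)
chain = []
for _ in range(20000):
    proposal = theta + rng.normal(scale=0.05, size=2)
    logp_prop = log_posterior(proposal)
    if np.log(rng.random()) < logp_prop - logp:
        theta, logp = proposal, logp_prop
    chain.append(theta)

chain = np.array(chain[5000:])                 # discard burn-in
rc_samples = chain[:, 1]                       # a derived/secondary quantity gets its posterior for free
print("posterior mean (n0, rc):", chain.mean(axis=0).round(3))
print("rc: %.3f +/- %.3f" % (rc_samples.mean(), rc_samples.std()))
```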

  7. Identifying and Assessing Interesting Subgroups in a Heterogeneous Population.

    PubMed

    Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi

    2015-01-01

    Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability--the basis of cluster generation--is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.

  8. Reweighted mass center based object-oriented sparse subspace clustering for hyperspectral images

    NASA Astrophysics Data System (ADS)

    Zhai, Han; Zhang, Hongyan; Zhang, Liangpei; Li, Pingxiang

    2016-10-01

    Considering the inevitable obstacles faced by the pixel-based clustering methods, such as salt-and-pepper noise, high computational complexity, and the lack of spatial information, a reweighted mass center based object-oriented sparse subspace clustering (RMC-OOSSC) algorithm for hyperspectral images (HSIs) is proposed. First, the mean-shift segmentation method is utilized to oversegment the HSI to obtain meaningful objects. Second, a distance reweighted mass center learning model is presented to extract the representative and discriminative features for each object. Third, assuming that all the objects are sampled from a union of subspaces, it is natural to apply the SSC algorithm to the HSI. Faced with the high correlation among the hyperspectral objects, a weighting scheme is adopted to ensure that the highly correlated objects are preferred in the procedure of sparse representation, to reduce the representation errors. Two widely used hyperspectral datasets were utilized to test the performance of the proposed RMC-OOSSC algorithm, obtaining high clustering accuracies (overall accuracy) of 71.98% and 89.57%, respectively. The experimental results show that the proposed method clearly improves the clustering performance with respect to the other state-of-the-art clustering methods, and it significantly reduces the computational time.
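
    For orientation, the basic sparse-subspace-clustering machinery that the RMC-OOSSC algorithm builds on can be sketched as follows (scikit-learn; this is plain SSC on toy data, without the object-oriented segmentation, mass-center features, or reweighting described above): each sample is sparsely represented by the others, the coefficient magnitudes define an affinity matrix, and spectral clustering yields the labels.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(13)

def subspace_samples(n, dim, ambient):
    # Points drawn from a random `dim`-dimensional subspace of R^ambient
    basis = np.linalg.qr(rng.normal(size=(ambient, dim)))[0]
    return (basis @ rng.normal(size=(dim, n))).T

X = np.vstack([subspace_samples(40, 3, 30), subspace_samples(40, 3, 30)])
X += 0.01 * rng.normal(size=X.shape)

# Sparse self-representation: express each sample with the remaining samples
n = len(X)
C = np.zeros((n, n))
for i in range(n):
    others = np.delete(np.arange(n), i)
    lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=5000).fit(X[others].T, X[i])
    C[i, others] = lasso.coef_

affinity = np.abs(C) + np.abs(C).T                        # symmetrized affinity matrix
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print("cluster sizes:", np.bincount(labels))
```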

  9. Refining a complex diagnostic construct: subtyping Dysthymia with the Shedler-Westen Assessment Procedure-II.

    PubMed

    Huprich, Steven K; Defife, Jared; Westen, Drew

    2014-01-01

    We sought to determine whether meaningful subtypes of Dysthymic patients could be identified when grouping them by similar personality profiles. A random, national sample of psychiatrists and clinical psychologists (n=1201) described a randomly selected current patient with personality pathology using the descriptors in the Shedler-Westen Assessment Procedure-II (SWAP-II), completed assessments of patients' adaptive functioning, and provided DSM-IV Axis I and II diagnoses. We applied Q-factor cluster analyses to those patients diagnosed with Dysthymic Disorder. Four clusters were identified: High Functioning, Anxious/Dysphoric, Emotionally Dysregulated, and Narcissistic. These factor scores corresponded with a priori hypotheses regarding diagnostic comorbidity and level of adaptive functioning. We compared these groups to diagnostic constructs described and empirically identified in the past literature. The results converge with past and current ideas about the ways in which chronic depression and personality are related and offer an enhanced means by which to understand a heterogeneous diagnostic category that is empirically grounded and clinically useful. © 2013 Published by Elsevier B.V.

  10. A New Variable Weighting and Selection Procedure for K-Means Cluster Analysis

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2008-01-01

    A variance-to-range ratio variable weighting procedure is proposed. We show how this weighting method is theoretically grounded in the inherent variability found in data exhibiting cluster structure. In addition, a variable selection procedure is proposed to operate in conjunction with the variable weighting technique. The performances of these…
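
    As a rough illustration of the general idea (the exact weighting rule in this record may differ), the sketch below rescales each variable by an assumed variance-to-range ratio before running k-means, so that variables carrying cluster structure receive more influence.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def weighted_kmeans(X, n_clusters, seed=0):
        """k-means on variables rescaled by a variance-to-range ratio.

        This is a simplified reading of 'variance-to-range' weighting; the exact
        weights in the cited procedure may be defined differently."""
        X = np.asarray(X, dtype=float)
        col_range = X.max(axis=0) - X.min(axis=0)
        weights = X.var(axis=0) / col_range**2           # assumed weighting rule
        Xw = X * np.sqrt(weights)                        # scale columns before clustering
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(Xw)
        return km.labels_, weights

    X = np.random.default_rng(1).normal(size=(200, 4))
    X[:100, 0] += 5                                      # cluster structure only in column 0
    labels, w = weighted_kmeans(X, n_clusters=2)
    print(w)                                             # the informative variable gets a larger weight
    ```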

  11. GC-MS analyses and chemometric processing to discriminate the local and long-distance sources of PAHs associated to atmospheric PM2.5.

    PubMed

    Masiol, Mauro; Centanni, Elena; Squizzato, Stefania; Hofer, Angelika; Pecorari, Eliana; Rampazzo, Giancarlo; Pavoni, Bruno

    2012-09-01

    This study presents a procedure to differentiate the local and remote sources of particulate-bound polycyclic aromatic hydrocarbons (PAHs). Data were collected during an extended PM(2.5) sampling campaign (2009-2010) carried out for 1 year in Venice-Mestre, Italy, at three stations with different emissive scenarios: urban, industrial, and semirural background. Diagnostic ratios and factor analysis were initially applied to point out the most probable sources. In a second step, the areal distribution of the identified sources was studied by applying discriminant analysis to the factor scores. Third, samples collected on days with similar atmospheric circulation patterns were grouped using a cluster analysis on wind data. Local contributions to PM(2.5) and PAHs were then assessed by interpreting the cluster results together with the chemical data. The results showed that significantly lower levels of PM(2.5) and PAHs were found when faster winds replaced air masses, whereas in the presence of scarce ventilation, locally emitted pollutants were trapped and concentrations increased. In this way, an estimate of the pollutant loads due to local sources can be derived from data collected on days with similar wind patterns. Long-range contributions were detected by a cluster analysis of the air mass back-trajectories. The results revealed that PM(2.5) concentrations were relatively high when air masses had passed over the Po Valley. However, external sources did not contribute significantly to the PAH load. The proposed procedure can be applied to other environments with minor modifications, and the information obtained can be useful for designing local and national air pollution control strategies.

  12. Milestoning with coarse memory

    NASA Astrophysics Data System (ADS)

    Hawk, Alexander T.

    2013-04-01

    Milestoning is a method used to calculate the kinetics of molecular processes occurring on timescales inaccessible to traditional molecular dynamics (MD) simulations. In the method, the phase space of the system is partitioned by milestones (hypersurfaces), trajectories are initialized on each milestone, and short MD simulations are performed to calculate transitions between neighboring milestones. Long trajectories of the system are then reconstructed with a semi-Markov process from the observed transition statistics. The procedure is typically justified by the assumption that trajectories lose memory between crossing successive milestones. Here we present Milestoning with Coarse Memory (MCM), a generalization of Milestoning that relaxes the memory loss assumption of conventional Milestoning. In the method, milestones are defined and sample transitions are calculated in the standard Milestoning way. Then, after it is clear where trajectories sample milestones, the milestones are broken up into distinct neighborhoods (clusters), and each sample transition is associated with two clusters: the cluster containing the coordinates the trajectory was initialized in, and the cluster (on the terminal milestone) containing the trajectory's final coordinates. Long trajectories of the system are then reconstructed with a semi-Markov process in an extended state space built from milestone and cluster indices. To test the method, we apply it to a process that is particularly ill suited for Milestoning: the dynamics of a polymer confined to a narrow cylinder. We show that Milestoning calculations of both the mean first passage time and the mean transit time of reversal—which occurs when the end-to-end vector reverses direction—are significantly improved when MCM is applied. Finally, we note that the overhead of performing MCM on top of conventional Milestoning is negligible.
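
    To make the reconstruction step concrete, the sketch below computes a mean first passage time from milestone-level statistics, assuming a row-stochastic transition matrix K and mean state lifetimes have already been estimated from the short trajectories. This is the generic Milestoning bookkeeping, not the MCM extension with its extended milestone-cluster state space.

    ```python
    import numpy as np

    def mfpt(K, lifetimes, targets):
        """Mean first passage time to a set of target states.

        K         : row-stochastic transition matrix between milestone states
        lifetimes : mean lifetime of each state before the next transition
        targets   : indices of absorbing target states (MFPT = 0 there)
        Solves tau_i = t_i + sum_j K_ij tau_j for all non-target states."""
        K = np.asarray(K, dtype=float)
        t = np.asarray(lifetimes, dtype=float)
        n = K.shape[0]
        free = np.setdiff1d(np.arange(n), targets)
        A = np.eye(free.size) - K[np.ix_(free, free)]
        tau = np.zeros(n)
        tau[free] = np.linalg.solve(A, t[free])
        return tau

    # Toy 3-state example: reach state 2 starting from state 0.
    K = np.array([[0.0, 1.0, 0.0],
                  [0.5, 0.0, 0.5],
                  [0.0, 0.0, 1.0]])
    t = np.array([1.0, 2.0, 0.0])
    print(mfpt(K, t, targets=[2])[0])   # -> 6.0 time units for this toy kernel
    ```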

  13. Cross-correlation redshift calibration without spectroscopic calibration samples in DES Science Verification Data

    NASA Astrophysics Data System (ADS)

    Davis, C.; Rozo, E.; Roodman, A.; Alarcon, A.; Cawthon, R.; Gatti, M.; Lin, H.; Miquel, R.; Rykoff, E. S.; Troxel, M. A.; Vielzeuf, P.; Abbott, T. M. C.; Abdalla, F. B.; Allam, S.; Annis, J.; Bechtol, K.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Carnero Rosell, A.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Crocce, M.; Cunha, C. E.; D'Andrea, C. B.; da Costa, L. N.; Desai, S.; Diehl, H. T.; Doel, P.; Drlica-Wagner, A.; Fausti Neto, A.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gaztanaga, E.; Gerdes, D. W.; Giannantonio, T.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; Jain, B.; James, D. J.; Jeltema, T.; Krause, E.; Kuehn, K.; Kuhlmann, S.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lima, M.; March, M.; Marshall, J. L.; Martini, P.; Melchior, P.; Ogando, R. L. C.; Plazas, A. A.; Romer, A. K.; Sanchez, E.; Scarpine, V.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, M.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Vikram, V.; Walker, A. R.; Wechsler, R. H.

    2018-06-01

    Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogues with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ˜ ±0.01. We forecast that our proposal can, in principle, control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Our results provide strong motivation to launch a programme to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.

  14. Cross-correlation redshift calibration without spectroscopic calibration samples in DES Science Verification Data

    DOE PAGES

    Davis, C.; Rozo, E.; Roodman, A.; ...

    2018-03-26

    Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogs with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ∼ ±0.01. We forecast that our proposal can in principle control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Here, our results provide strong motivation to launch a program to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.

  15. Cross-correlation redshift calibration without spectroscopic calibration samples in DES Science Verification Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Davis, C.; Rozo, E.; Roodman, A.

    Galaxy cross-correlations with high-fidelity redshift samples hold the potential to precisely calibrate systematic photometric redshift uncertainties arising from the unavailability of complete and representative training and validation samples of galaxies. However, application of this technique in the Dark Energy Survey (DES) is hampered by the relatively low number density, small area, and modest redshift overlap between photometric and spectroscopic samples. We propose instead using photometric catalogs with reliable photometric redshifts for photo-z calibration via cross-correlations. We verify the viability of our proposal using redMaPPer clusters from the Sloan Digital Sky Survey (SDSS) to successfully recover the redshift distribution of SDSS spectroscopic galaxies. We demonstrate how to combine photo-z with cross-correlation data to calibrate photometric redshift biases while marginalizing over possible clustering bias evolution in either the calibration or unknown photometric samples. We apply our method to DES Science Verification (DES SV) data in order to constrain the photometric redshift distribution of a galaxy sample selected for weak lensing studies, constraining the mean of the tomographic redshift distributions to a statistical uncertainty of Δz ∼ ±0.01. We forecast that our proposal can in principle control photometric redshift uncertainties in DES weak lensing experiments at a level near the intrinsic statistical noise of the experiment over the range of redshifts where redMaPPer clusters are available. Here, our results provide strong motivation to launch a program to fully characterize the systematic errors from bias evolution and photo-z shapes in our calibration procedure.

  16. Method for evaluating wind turbine wake effects on wind farm performance

    NASA Technical Reports Server (NTRS)

    Neustadter, H. E.; Spera, D. A.

    1985-01-01

    A method of testing the performance of a cluster of wind turbine units and a set of data analysis equations are presented, which together form a simple and direct procedure for determining the reduction in energy output caused by the wake of an upwind turbine. This method appears to solve the problems presented by data scatter and wind variability. Test data from the three-unit Mod-2 wind turbine cluster at Goldendale, Washington, are analyzed to illustrate the application of the proposed method. In this sample case the reduction in energy was found to be about 10 percent when the Mod-2 units were separated by a distance equal to seven diameters and winds were below rated speed.

  17. Source Apportionment of Atmospheric Particles by Electron Probe X-Ray Microanalysis and Receptor Models.

    NASA Astrophysics Data System (ADS)

    van Borm, Werner August

    Electron probe X-ray microanalysis (EPXMA) in combination with an automation system and an energy-dispersive X-ray detection system was used to analyse thousands of microscopic particles originating from the ambient atmosphere. The large amount of data was processed by a newly developed X-ray correction method and a number of data reduction procedures. A standardless ZAF procedure for EPXMA was developed for quick semi-quantitative analysis of particles, starting from simple corrections valid for bulk samples and modified to take into account the finite particle diameter, assuming a spherical shape. Tested on a limited database of bulk and particulate samples, the compromise between calculation speed and accuracy yielded, for elements with Z > 14, concentration accuracies better than 10%, while absolute deviations remained below 4 weight%, thus being important only for low concentrations. Next, the possibilities for the use of supervised and unsupervised multivariate particle classification were investigated for source apportionment of individual particles. In a detailed study of the unsupervised cluster analysis technique, several aspects that have a severe influence on the final cluster analysis results were considered, i.e. data acquisition, X-ray peak identification, data normalization, scaling, variable selection, similarity measure, cluster strategy, cluster significance and error propagation. A supervised, expert system-like approach was also developed, in which identification rules are built to describe the particle classes in a unique manner. Applications are presented for particles sampled (1) near a zinc smelter (Vieille-Montagne, Balen, Belgium), analyzed for heavy metals, (2) in an urban aerosol (Antwerp, Belgium), analyzed for over 20 elements, and (3) in a rural aerosol originating from a Swiss mountain area (Bern). It was thus possible to pinpoint a number of known and unknown sources and to characterize their emissions in terms of particle abundance and particle composition. Alternatively, the bulk analysis of filters (total, fine and coarse mode) using Particle Induced X-Ray Emission (PIXE) and the application of a receptor modeling approach provided complementary information on a macroscopic level. A computer program was developed incorporating an absolute factor analysis based receptor modeling procedure. Source profiles and contributions are described by elemental concentrations and an atmospheric mass balance is put forward. The latter method was applied in a two-year study of the Antwerp urban aerosol and to the Swiss aerosol, revealing a number of previously known and unknown sources. Both methods were successfully combined to increase the source resolution.

  18. Computer program documentation: ISOCLS iterative self-organizing clustering program, program C094

    NASA Technical Reports Server (NTRS)

    Minter, R. T. (Principal Investigator)

    1972-01-01

    The author has identified the following significant results. This program implements an algorithm which, ideally, sorts a given set of multivariate data points into similar groups or clusters. The program is intended for use in the evaluation of multispectral scanner data; however, the algorithm could be used for other data types as well. The user may specify a set of initial estimated cluster means to begin the procedure, or he may begin with the assumption that all the data belong to one cluster. The procedure is initialized by assigning each data point to the nearest (in absolute distance) cluster mean. If no initial cluster means were input, all of the data are assigned to cluster 1. The means and standard deviations are calculated for each cluster.
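
    A minimal sketch of the core assign-and-update pass described here, using the absolute (L1) distance mentioned in the record; the splitting and merging steps of the actual ISOCLS program are omitted, and the initialization shown is purely illustrative.

    ```python
    import numpy as np

    def isocls_step(X, means):
        """One assign/update pass: assign each point to the nearest mean
        (absolute / L1 distance, as described), then recompute the means and
        standard deviations per cluster. Splitting/merging is omitted."""
        X, means = np.asarray(X, float), np.asarray(means, float)
        d = np.abs(X[:, None, :] - means[None, :, :]).sum(axis=2)   # L1 distances
        labels = d.argmin(axis=1)
        new_means, new_stds = [], []
        for k in range(means.shape[0]):
            members = X[labels == k]
            if members.size == 0:                 # keep empty clusters unchanged
                members = means[k:k + 1]
            new_means.append(members.mean(axis=0))
            new_stds.append(members.std(axis=0))
        return labels, np.array(new_means), np.array(new_stds)

    X = np.vstack([np.random.default_rng(0).normal(0, 1, (50, 2)),
                   np.random.default_rng(1).normal(5, 1, (50, 2))])
    labels, means, stds = isocls_step(X, means=X[[0, 50]])   # one seed point from each half
    print(means)
    ```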

  19. Knowledge and attitude towards total knee arthroplasty among the public in Saudi Arabia: a nationwide population-based study.

    PubMed

    Al-Mohrej, Omar A; Alshammari, Faris O; Aljuraisi, Abdulrahman M; Bin Amer, Lujain A; Masuadi, Emad M; Al-Kenani, Nader S

    2018-04-01

    Studies on total knee arthroplasty (TKA) in Saudi Arabia are scarce, and none have reported the knowledge and attitude of the procedure in Saudi Arabia. Our study aims to measure the knowledge and attitude of TKA among the adult Saudi population. To encompass a representative sample of this cross-sectional survey, all 13 administrative areas were used as ready-made geographical clusters. For each cluster, stratified random sampling was performed to maximize participation in the study. In each area, random samples of mobile phone numbers were selected with a probability proportional to the administrative area population size. Sample size calculation was based on the assumption that 50% of the participants would have some level of knowledge, with a 2% margin of error and 95% confidence level. To reach our intended sample size of 1540, we contacted 1722 participants with a response rate of 89.4%. The expected percentage of public knowledge was 50%; however, the actual percentage revealed by this study was much lower (29.7%). A stepwise multiple logistic regression was used to assess the factors that positively affected the knowledge score regarding TKA. Age [P = 0.016 with OR of 0.47], higher income [P = 0.001 with OR of 0.52] and participants with a positive history of TKA or that have known someone who underwent the surgery [P < 0.001 with OR of 0.15] had a positive impact on the total knowledge score. There are still misconceptions among the public in Saudi Arabia concerning TKA, its indications and results. We recommend that doctors use the results of our survey to assess their conversations with their patients, and to determine whether the results of the procedure are adequately clarified.

  20. Significance tests for functional data with complex dependence structure.

    PubMed

    Staicu, Ana-Maria; Lahiri, Soumen N; Carroll, Raymond J

    2015-01-01

    We propose an L2-norm based global testing procedure for the null hypothesis that multiple group mean functions are equal, for functional data with complex dependence structure. Specifically, we consider the setting of functional data with a multilevel structure of the form groups-clusters or subjects-units, where the unit-level profiles are spatially correlated within the cluster, and the cluster-level data are independent. Orthogonal series expansions are used to approximate the group mean functions and the test statistic is estimated using the basis coefficients. The asymptotic null distribution of the test statistic is developed, under mild regularity conditions. To our knowledge this is the first work that studies hypothesis testing when data have such complex multilevel functional and spatial structure. Two small-sample alternatives, including a novel block bootstrap for functional data, are proposed, and their performance is examined in simulation studies. The paper concludes with an illustration of a motivating experiment.

  1. Automation of disbond detection in aircraft fuselage through thermal image processing

    NASA Technical Reports Server (NTRS)

    Prabhu, D. R.; Winfree, W. P.

    1992-01-01

    A procedure for interpreting thermal images obtained during the nondestructive evaluation of aircraft bonded joints is presented. The procedure operates on time-derivative thermal images and produces a disbond image with disbonds highlighted. The size of the 'black clusters' in the output disbond image is a quantitative measure of disbond size. The procedure is illustrated using simulation data as well as data obtained through experimental testing of fabricated samples and aircraft panels. Good results are obtained, and, except in pathological cases, 'false calls' in the cases studied appeared only as noise in the output disbond image, which was easily filtered out. The thermal detection technique coupled with an automated image interpretation capability will be a very fast and effective method for inspecting bonded joints in an aircraft structure.
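
    The 'black cluster' sizing step can be illustrated with generic connected-component labeling on a binary disbond image; this is a sketch of the idea, not the procedure implemented in the cited program.

    ```python
    import numpy as np
    from scipy import ndimage

    def disbond_sizes(disbond_image, min_pixels=5):
        """Label connected 'black clusters' in a binary disbond image and
        return the pixel area of each cluster, ignoring small noise blobs."""
        labels, n = ndimage.label(disbond_image)
        sizes = ndimage.sum(disbond_image, labels, index=np.arange(1, n + 1))
        return sizes[sizes >= min_pixels]          # small blobs treated as noise

    img = np.zeros((64, 64), dtype=int)
    img[10:20, 10:22] = 1                          # simulated disbond region
    img[40, 40] = 1                                # isolated noise pixel
    print(disbond_sizes(img))                      # -> [120.]
    ```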

  2. Object Tracking Using Adaptive Covariance Descriptor and Clustering-Based Model Updating for Visual Surveillance

    PubMed Central

    Qin, Lei; Snoussi, Hichem; Abdallah, Fahed

    2014-01-01

    We propose a novel approach for tracking an arbitrary object in video sequences for visual surveillance. The first contribution of this work is an automatic feature extraction method that is able to extract compact discriminative features from a feature pool before computing the region covariance descriptor. As the feature extraction method is adaptive to a specific object of interest, we refer to the region covariance descriptor computed using the extracted features as the adaptive covariance descriptor. The second contribution is to propose a weakly supervised method for updating the object appearance model during tracking. The method performs a mean-shift clustering procedure among the tracking result samples accumulated during a period of time and selects a group of reliable samples for updating the object appearance model. As such, the object appearance model is kept up-to-date and is prevented from contamination even in the case of tracking mistakes. We conducted comparative experiments on real-world video sequences, which confirmed the effectiveness of the proposed approaches. The tracking system that integrates the adaptive covariance descriptor and the clustering-based model updating method accomplished stable object tracking on challenging video sequences. PMID:24865883

  3. Design of partially supervised classifiers for multispectral image data

    NASA Technical Reports Server (NTRS)

    Jeon, Byeungwoo; Landgrebe, David

    1993-01-01

    A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.

  4. Automated modal parameter estimation using correlation analysis and bootstrap sampling

    NASA Astrophysics Data System (ADS)

    Yaghoubi, Vahid; Vakilzadeh, Majid K.; Abrahamsson, Thomas J. S.

    2018-02-01

    The estimation of modal parameters from a set of noisy measured data is a highly judgmental task, with user expertise playing a significant role in distinguishing between estimated physical and noise modes of a test-piece. Various methods have been developed to automate this procedure. The common approach is to identify models with different orders and cluster similar modes together. However, most proposed methods based on this approach suffer from high-dimensional optimization problems in either the estimation or clustering step. To overcome this problem, this study presents an algorithm for autonomous modal parameter estimation in which the only required optimization is performed in a three-dimensional space. To this end, a subspace-based identification method is employed for the estimation and a non-iterative correlation-based method is used for the clustering. This clustering is at the heart of the paper. The keys to success are correlation metrics that are able to treat the problems of spatial eigenvector aliasing and nonunique eigenvectors of coalescent modes simultaneously. The algorithm commences by the identification of an excessively high-order model from frequency response function test data. The high number of modes of this model provides bases for two subspaces: one for likely physical modes of the tested system and one for its complement dubbed the subspace of noise modes. By employing the bootstrap resampling technique, several subsets are generated from the same basic dataset and for each of them a model is identified to form a set of models. Then, by correlation analysis with the two aforementioned subspaces, highly correlated modes of these models which appear repeatedly are clustered together and the noise modes are collected in a so-called Trashbox cluster. Stray noise modes attracted to the mode clusters are trimmed away in a second step by correlation analysis. The final step of the algorithm is a fuzzy c-means clustering procedure applied to a three-dimensional feature space to assign a degree of physicalness to each cluster. The proposed algorithm is applied to two case studies: one with synthetic data and one with real test data obtained from a hammer impact test. The results indicate that the algorithm successfully clusters similar modes and gives a reasonable quantification of the extent to which each cluster is physical.
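
    A widely used correlation metric for comparing identified mode shapes is the Modal Assurance Criterion (MAC). The sketch below is a generic MAC computation, shown only as an illustration of shape correlation; the record's own metrics additionally handle spatial eigenvector aliasing and coalescent modes.

    ```python
    import numpy as np

    def mac(phi_1, phi_2):
        """Modal Assurance Criterion between two (possibly complex) mode shapes:
        1.0 means perfectly correlated shapes, 0.0 means orthogonal shapes."""
        phi_1, phi_2 = np.asarray(phi_1), np.asarray(phi_2)
        num = np.abs(np.vdot(phi_1, phi_2)) ** 2
        den = np.real(np.vdot(phi_1, phi_1)) * np.real(np.vdot(phi_2, phi_2))
        return num / den

    phi_a = np.array([1.0, 0.8, 0.3, -0.2])
    phi_b = 2.1 * phi_a                           # same shape, different scaling
    phi_c = np.array([0.1, -0.9, 0.7, 0.4])
    print(mac(phi_a, phi_b), mac(phi_a, phi_c))   # ~1.0 versus a much smaller value
    ```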

  5. I. Excluded volume effects in Ising cluster distributions and nuclear multifragmentation. II. Multiple-chance effects in alpha-particle evaporation

    NASA Astrophysics Data System (ADS)

    Breus, Dimitry Eugene

    In Part I, geometric clusters of the Ising model are studied as possible model clusters for nuclear multifragmentation. These clusters may not be considered non-interacting (an ideal gas) due to the excluded volume effect, which is predominantly an artifact of the clusters' finite size. Interaction significantly complicates the use of clusters in the analysis of thermodynamic systems. Stillinger's theory is used as a basis for the analysis, which within the RFL (Reiss, Frisch, Lebowitz) fluid-of-spheres approximation produces a prediction for cluster concentrations that is well obeyed by geometric clusters of the Ising model. If the thermodynamic condition of phase coexistence is met, these concentrations can be incorporated into a differential equation procedure of moderate complexity to elucidate the liquid-vapor phase diagram of the system with cluster interaction included. The drawback of increased complexity is outweighed by the reward of greater accuracy of the phase diagram, as demonstrated for the Ising model. A novel nuclear-cluster analysis procedure is developed by modifying Fisher's model to contain cluster interaction and employing the differential equation procedure to obtain thermodynamic variables. With this procedure applied to geometric clusters, guidelines are developed for identifying the excluded volume effect in nuclear multifragmentation. In Part II, an explanation is offered for the recently observed oscillations in the energy spectra of alpha-particles emitted from hot compound nuclei. Contrary to what was previously expected, the oscillations are assumed to be caused by the multiple-chance nature of alpha-evaporation. In a semi-empirical fashion this assumption is successfully confirmed by a technique of two-spectra decomposition, which treats experimental alpha-spectra as having contributions from at least two independent emitters. Building upon the success of the multiple-chance explanation of the oscillations, Moretto's single-chance evaporation theory is augmented to include multiple-chance emission and tested on experimental data, yielding positive results.

  6. Discrimination and chemical phylogenetic study of seven species of Dendrobium using infrared spectroscopy combined with cluster analysis

    NASA Astrophysics Data System (ADS)

    Luo, Congpei; He, Tao; Chun, Ze

    2013-04-01

    Dendrobium is a commonly used and precious herb in Traditional Chinese Medicine. The high biodiversity of Dendrobium and the therapeutic needs require tools for the correct and fast discrimination of different Dendrobium species. This study investigates Fourier transform infrared spectroscopy followed by cluster analysis for discrimination and chemical phylogenetic study of seven Dendrobium species. Despite the generally similar pattern of the IR spectra, different intensities, shapes, and peak positions were found in the IR spectra of these samples, especially in the range of 1800-800 cm-1. The second-derivative transformation and the alcoholic extraction procedure clearly enlarged the small spectral differences among these samples. The results indicated each Dendrobium species had a characteristic IR spectral profile, which could be used to discriminate them. The similarity coefficients among the samples were analyzed based on their second-derivative IR spectra and ranged from 0.7632 to 0.9700 among the seven Dendrobium species and from 0.5163 to 0.9615 among the ethanol extracts. A dendrogram was constructed based on cluster analysis of the IR spectra for studying the chemical phylogenetic relationships among the samples. The results indicated that D. denneanum and D. crepidatum could be alternative resources to substitute for D. chrysotoxum, D. officinale and D. nobile, which are officially recorded in the Chinese Pharmacopoeia. In conclusion, with the advantages of high resolution, speed and convenience, the experimental approach can successfully discriminate the seven Dendrobium species and establish their chemical phylogenetic relationships.
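
    The similarity-coefficient and dendrogram steps can be sketched with standard tools: correlation coefficients between spectra as the similarity measure and average-linkage hierarchical clustering on the corresponding distances. The spectra below are synthetic placeholders, not measured Dendrobium data.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    # Synthetic stand-ins for second-derivative IR spectra of seven samples.
    rng = np.random.default_rng(0)
    x = np.linspace(0, 20, 500)
    spectra = np.array([np.sin(x * (1 + 0.03 * i)) + 0.05 * rng.standard_normal(x.size)
                        for i in range(7)])

    corr = np.corrcoef(spectra)                     # similarity coefficients between samples
    dist = np.clip(1.0 - corr, 0.0, None)           # convert similarity to distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method='average')
    print(corr.round(3))
    print(fcluster(Z, t=3, criterion='maxclust'))   # three groups of chemically similar samples
    ```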

  7. The neural substrates of procrastination: A voxel-based morphometry study.

    PubMed

    Hu, Yue; Liu, Peiwei; Guo, Yiqun; Feng, Tingyong

    2018-03-01

    Procrastination is a pervasive phenomenon across different cultures and has serious consequences for performance, subjective well-being, and even public policy. However, little is known about the neural substrates of procrastination. In order to shed light upon this question, we investigated the neuroanatomical substrates of procrastination across two independent samples using the voxel-based morphometry (VBM) method. The whole-brain analysis showed procrastination was positively correlated with the gray matter (GM) volume of clusters in the parahippocampal gyrus (PHG) and the orbital frontal cortex (OFC), while negatively correlated with the GM volume of clusters in the inferior frontal gyrus (IFG) and the middle frontal gyrus (MFG) in sample one (151 participants). We further conducted a verification procedure on another sample (108 participants) using region-of-interest analysis to examine the reliability of these results. Results showed procrastination can be predicted by the GM volume of the OFC and the MFG. The present findings suggest that the MFG and OFC, which are key regions of self-control and emotion regulation, may play an important role in procrastination. Copyright © 2018 Elsevier Inc. All rights reserved.

  8. Using Cluster Bootstrapping to Analyze Nested Data With a Few Clusters.

    PubMed

    Huang, Francis L

    2018-04-01

    Cluster randomized trials involving participants nested within intact treatment and control groups are commonly performed in various educational, psychological, and biomedical studies. However, recruiting and retaining intact groups present various practical, financial, and logistical challenges to evaluators, and often cluster randomized trials are performed with a low number of clusters (~20 groups). Although multilevel models are often used to analyze nested data, researchers may be concerned about potentially biased results due to having only a few groups under study. Cluster bootstrapping has been suggested as an alternative procedure when analyzing clustered data, though it has seen very little use in educational and psychological studies. Using a Monte Carlo simulation that varied the number of clusters, average cluster size, and intraclass correlations, we compared standard errors using cluster bootstrapping with those derived using ordinary least squares regression and multilevel models. Results indicate that cluster bootstrapping, though more computationally demanding, can be used as an alternative procedure for the analysis of clustered data when treatment effects at the group level are of primary interest. Supplementary material showing how to perform cluster bootstrapped regressions using R is also provided.
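
    The record's supplementary material shows the procedure in R; a minimal Python sketch of the same idea follows: resample whole clusters with replacement, refit the regression on each resample, and take the spread of the coefficient estimates as the cluster-bootstrapped standard error.

    ```python
    import numpy as np

    def ols(X, y):
        return np.linalg.lstsq(X, y, rcond=None)[0]

    def cluster_bootstrap_se(X, y, cluster_ids, n_boot=2000, seed=0):
        """Bootstrap standard errors that respect clustering: whole clusters
        are resampled with replacement, then the regression is refit."""
        rng = np.random.default_rng(seed)
        clusters = np.unique(cluster_ids)
        betas = []
        for _ in range(n_boot):
            chosen = rng.choice(clusters, size=clusters.size, replace=True)
            idx = np.concatenate([np.flatnonzero(cluster_ids == c) for c in chosen])
            betas.append(ols(X[idx], y[idx]))
        return np.std(betas, axis=0)

    # Simulated nested data: 15 clusters of 20 with random cluster effects.
    rng = np.random.default_rng(1)
    g = np.repeat(np.arange(15), 20)
    x = rng.normal(size=g.size)
    y = 1.0 + 0.5 * x + rng.normal(size=15)[g] + rng.normal(size=g.size)
    X = np.column_stack([np.ones_like(x), x])
    print(ols(X, y), cluster_bootstrap_se(X, y, g))
    ```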

  9. [Space-time suicide clustering in the community of Antequera (Spain)].

    PubMed

    Pérez-Costillas, Lucía; Blasco-Fontecilla, Hilario; Benítez, Nicolás; Comino, Raquel; Antón, José Miguel; Ramos-Medina, Valentín; Lopez, Amalia; Palomo, José Luis; Madrigal, Lucía; Alcalde, Javier; Perea-Millá, Emilio; Artieda-Urrutia, Paula; de León-Martínez, Victoria; de Diego Otero, Yolanda

    2015-01-01

    Approximately 3,500 people commit suicide every year in Spain. The main aim of this study is to explore whether a spatial and temporal clustering of suicide exists in the region of Antequera (Málaga, Spain). Sample and procedure: All suicides from January 1, 2004 to December 31, 2008 were identified using data from the Forensic Pathology Department of the Institute of Legal Medicine, Málaga (Spain). Geolocalisation. Google Earth was used to calculate the coordinates for each suicide decedent's address. Statistical analysis. A spatiotemporal permutation scan statistic and Ripley's K function were used to explore spatiotemporal clustering. Pearson's chi-squared was used to determine whether there were differences between suicides inside and outside the spatiotemporal clusters. A total of 120 individuals committed suicide within the region of Antequera, of which 96 (80%) were included in our analyses. Statistically significant evidence for 7 spatiotemporal suicide clusters emerged within critical limits for the 0-2.5 km distance and for the first and second weeks (P<.05 in both cases) after suicide. Not a single subject among suicides within clusters was diagnosed with a current psychotic disorder, whereas outside the clusters 20% had this diagnosis (χ2=4.13; df=1; P<.05). There are spatiotemporal suicide clusters in the area surrounding Antequera. Patients diagnosed with current psychotic disorder are less likely to be influenced by the factors explaining suicide clustering. Copyright © 2013 SEP y SEPB. Published by Elsevier España. All rights reserved.

  10. The development and cross-validation of an MMPI typology of murderers.

    PubMed

    Holcomb, W R; Adams, N A; Ponder, H M

    1985-06-01

    A sample of 80 male offenders charged with premeditated murder were divided into five personality types using MMPI scores. A hierarchical clustering procedure was used with a subsequent internal cross-validation analysis using a second sample of 80 premeditated murderers. A discriminant analysis resulted in a 96.25% correct classification of subjects from the second sample into the five types. Clinical data from a mental status interview schedule supported the external validity of these types. There were significant differences among the five types in hallucinations, disorientation, hostility, depression, and paranoid thinking. Both similarities and differences between the present typology and prior research were discussed. Additional research questions were suggested.

  11. A Refined Methodology for Defining Plant Communities Using Postagricultural Data from the Neotropics

    PubMed Central

    Myster, Randall W.

    2012-01-01

    How best to define and quantify plant communities was investigated using long-term plot data sampled from a recovering pasture in Puerto Rico and abandoned sugarcane and banana plantations in Ecuador. Significant positive associations between pairs of old field species were first computed and then clustered together into larger and larger species groups. I found that (1) no pasture or plantation had more than 5% of the possible significant positive associations, (2) clustering metrics showed groups of species participating in similar clusters among the five pasture/plantations over a gradient of decreasing association strength, and (3) there was evidence for repeatable communities—especially after banana cultivation—suggesting that past crops not only persist after abandonment but also form significant associations with invading plants. I then showed how the clustering hierarchy could be used to decide if any two pasture/plantation plots were in the same community, that is, to define old field communities. Finally, I suggested a similar procedure could be used for any plant community where the mechanisms and tolerances of species form the “cohesion” that produces clustering, making plant communities different than random assemblages of species. PMID:22536137

  12. Poly(A)-tag deep sequencing data processing to extract poly(A) sites.

    PubMed

    Wu, Xiaohui; Ji, Guoli; Li, Qingshun Quinn

    2015-01-01

    Polyadenylation [poly(A)] is an essential posttranscriptional processing step in the maturation of eukaryotic mRNA. The advent of next-generation sequencing (NGS) technology has offered feasible means to generate large-scale data and new opportunities for intensive study of polyadenylation, particularly deep sequencing of the transcriptome targeting the junction of 3'-UTR and the poly(A) tail of the transcript. To take advantage of this unprecedented amount of data, we present an automated workflow to identify polyadenylation sites by integrating NGS data cleaning, processing, mapping, normalizing, and clustering. In this pipeline, a series of Perl scripts are seamlessly integrated to iteratively map the single- or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same genome coordinate are grouped into one cleavage site, and the internal priming artifacts removed. Then the ambiguous region is introduced to parse the genome annotation for cleavage site clustering. Finally, cleavage sites within a close range of 24 nucleotides and from different samples can be clustered into poly(A) clusters. This procedure could be used to identify thousands of reliable poly(A) clusters from millions of NGS sequences in different tissues or treatments.
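
    The final clustering step, grouping cleavage sites that lie within 24 nucleotides of one another, can be sketched as a single pass over sorted coordinates. This is one plausible reading of the rule; the pipeline's actual Perl scripts and file formats are not reproduced here.

    ```python
    def cluster_cleavage_sites(positions, max_gap=24):
        """Group sorted genomic cleavage-site coordinates into poly(A) clusters:
        a new cluster starts whenever the gap to the previous site exceeds max_gap."""
        clusters, current = [], []
        for pos in sorted(positions):
            if current and pos - current[-1] > max_gap:
                clusters.append(current)
                current = []
            current.append(pos)
        if current:
            clusters.append(current)
        return clusters

    sites = [1005, 1010, 1031, 1200, 1203, 1290]
    print(cluster_cleavage_sites(sites))
    # -> [[1005, 1010, 1031], [1200, 1203], [1290]]
    ```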

  13. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

    PubMed

    Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra

    2016-11-20

    The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
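
    As a baseline for the formulae discussed here, the sketch below inflates an individually randomised sample size by the familiar design effect 1 + (m - 1)ρ of a parallel cluster randomised trial; the full stepped wedge and cohort formulae in this record additionally involve the cluster autocorrelation and individual autocorrelation.

    ```python
    from math import ceil
    from scipy.stats import norm

    def n_individual(delta, sd, alpha=0.05, power=0.8):
        """Per-arm sample size for a two-sample comparison of means."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return ceil(2 * (z * sd / delta) ** 2)

    def n_cluster_trial(delta, sd, cluster_size, icc, alpha=0.05, power=0.8):
        """Inflate by the design effect 1 + (m - 1) * ICC (parallel CRT baseline;
        stepped wedge designs need the fuller multilevel formulae)."""
        deff = 1 + (cluster_size - 1) * icc
        return ceil(n_individual(delta, sd, alpha, power) * deff)

    print(n_individual(delta=0.5, sd=1.0))                                # -> 63 per arm
    print(n_cluster_trial(delta=0.5, sd=1.0, cluster_size=20, icc=0.05))  # -> 123 per arm
    ```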

  14. X-ray versus infrared selection of distant galaxy clusters: A case study using the XMM-LSS and SpARCS cluster samples

    NASA Astrophysics Data System (ADS)

    Willis, J. P.; Ramos-Ceja, M. E.; Muzzin, A.; Pacaud, F.; Yee, H. K. C.; Wilson, G.

    2018-04-01

    We present a comparison of two samples of z > 0.8 galaxy clusters selected using different wavelength-dependent techniques and examine the physical differences between them. We consider 18 clusters from the X-ray selected XMM-LSS distant cluster survey and 92 clusters from the optical-MIR selected SpARCS cluster survey. Both samples are selected from the same approximately 9 square degree sky area and we examine them using common XMM-Newton, Spitzer-SWIRE and CFHT Legacy Survey data. Clusters from each sample are compared employing aperture measures of X-ray and MIR emission. We divide the SpARCS distant cluster sample into three sub-samples: a) X-ray bright, b) X-ray faint, MIR bright, and c) X-ray faint, MIR faint clusters. We determine that X-ray and MIR selected clusters display very similar surface brightness distributions of galaxy MIR light. In addition, the average location and amplitude of the galaxy red sequence as measured from stacked colour histograms is very similar in the X-ray and MIR-selected samples. The sub-sample of X-ray faint, MIR bright clusters displays a distribution of BCG-barycentre position offsets which extends to higher values than all other samples. This observation indicates that such clusters may exist in a more disturbed state compared to the majority of the distant cluster population sampled by XMM-LSS and SpARCS. This conclusion is supported by stacked X-ray images for the X-ray faint, MIR bright cluster sub-sample that display weak, centrally-concentrated X-ray emission, consistent with a population of growing clusters accreting from an extended envelope of material.

  15. Sampling procedures for throughfall monitoring: A simulation study

    NASA Astrophysics Data System (ADS)

    Zimmermann, Beate; Zimmermann, Alexander; Lark, Richard Murray; Elsenbeer, Helmut

    2010-01-01

    What is the most appropriate sampling scheme to estimate event-based average throughfall? A satisfactory answer to this seemingly simple question has yet to be found, a failure which we attribute to previous efforts' dependence on empirical studies. Here we try to answer this question by simulating stochastic throughfall fields based on parameters for statistical models of large monitoring data sets. We subsequently sampled these fields with different sampling designs and variable sample supports. We evaluated the performance of a particular sampling scheme with respect to the uncertainty of possible estimated means of throughfall volumes. Even for a relative error limit of 20%, an impractically large number of small, funnel-type collectors would be required to estimate mean throughfall, particularly for small events. While stratification of the target area is not superior to simple random sampling, cluster random sampling involves the risk of being less efficient. A larger sample support, e.g., the use of trough-type collectors, considerably reduces the necessary sample sizes and eliminates the sensitivity of the mean to outliers. Since the gain in time associated with the manual handling of troughs versus funnels depends on the local precipitation regime, the employment of automatically recording clusters of long troughs emerges as the most promising sampling scheme. Even so, a relative error of less than 5% appears out of reach for throughfall under heterogeneous canopies. We therefore suspect a considerable uncertainty of input parameters for interception models derived from measured throughfall, in particular, for those requiring data of small throughfall events.
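
    A toy version of the simulation exercise: repeatedly sample a synthetic throughfall field with n collectors and record the relative error of the estimated mean. The lognormal field parameters are arbitrary placeholders, not the fitted models from the monitoring data sets.

    ```python
    import numpy as np

    def relative_error_quantile(field, n_collectors, n_sim=5000, q=0.95, seed=0):
        """Simple random sampling of a throughfall field: return the 95th
        percentile of the relative error of the sample mean."""
        rng = np.random.default_rng(seed)
        true_mean = field.mean()
        errors = np.empty(n_sim)
        for i in range(n_sim):
            sample = rng.choice(field, size=n_collectors, replace=False)
            errors[i] = abs(sample.mean() - true_mean) / true_mean
        return np.quantile(errors, q)

    # Synthetic, right-skewed throughfall field (placeholder parameters).
    field = np.random.default_rng(1).lognormal(mean=1.0, sigma=0.6, size=10000)
    for n in (10, 30, 100):
        print(n, round(relative_error_quantile(field, n), 3))   # error shrinks with sample size
    ```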

  16. Identifying and Assessing Interesting Subgroups in a Heterogeneous Population

    PubMed Central

    Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi

    2015-01-01

    Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability—the basis of cluster generation—is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided. PMID:26339613

  17. X-ray versus infrared selection of distant galaxy clusters: a case study using the XMM-LSS and SpARCS cluster samples

    NASA Astrophysics Data System (ADS)

    Willis, J. P.; Ramos-Ceja, M. E.; Muzzin, A.; Pacaud, F.; Yee, H. K. C.; Wilson, G.

    2018-07-01

    We present a comparison of two samples of z > 0.8 galaxy clusters selected using different wavelength-dependent techniques and examine the physical differences between them. We consider 18 clusters from the X-ray-selected XMM Large Scale Structure (LSS) distant cluster survey and 92 clusters from the optical-mid-infrared (MIR)-selected Spitzer Adaptation of the Red Sequence Cluster Survey (SpARCS). Both samples are selected from the same approximately 9 sq deg sky area and we examine them using common XMM-Newton, Spitzer Wide-Area Infrared Extra-galactic (SWIRE) survey, and Canada-France-Hawaii Telescope Legacy Survey data. Clusters from each sample are compared employing aperture measures of X-ray and MIR emission. We divide the SpARCS distant cluster sample into three sub-samples: (i) X-ray bright, (ii) X-ray faint, MIR bright, and (iii) X-ray faint, MIR faint clusters. We determine that X-ray- and MIR-selected clusters display very similar surface brightness distributions of galaxy MIR light. In addition, the average location and amplitude of the galaxy red sequence as measured from stacked colour histograms is very similar in the X-ray- and MIR-selected samples. The sub-sample of X-ray faint, MIR bright clusters displays a distribution of brightest cluster galaxy-barycentre position offsets which extends to higher values than all other samples. This observation indicates that such clusters may exist in a more disturbed state compared to the majority of the distant cluster population sampled by XMM-LSS and SpARCS. This conclusion is supported by stacked X-ray images for the X-ray faint, MIR bright cluster sub-sample that display weak, centrally concentrated X-ray emission, consistent with a population of growing clusters accreting from an extended envelope of material.

  18. A Granular Self-Organizing Map for Clustering and Gene Selection in Microarray Data.

    PubMed

    Ray, Shubhra Sankar; Ganivada, Avatharam; Pal, Sankar K

    2016-09-01

    A new granular self-organizing map (GSOM) is developed by integrating the concept of a fuzzy rough set with the SOM. While training the GSOM, the weights of a winning neuron and the neighborhood neurons are updated through a modified learning procedure. The neighborhood is newly defined using the fuzzy rough sets. The clusters (granules) evolved by the GSOM are presented to a decision table as its decision classes. Based on the decision table, a method of gene selection is developed. The effectiveness of the GSOM is shown in both clustering samples and developing an unsupervised fuzzy rough feature selection (UFRFS) method for gene selection in microarray data. While the superior results of the GSOM, as compared with the related clustering methods, are provided in terms of β-index, DB-index, Dunn-index, and fuzzy rough entropy, the genes selected by the UFRFS are not only better in terms of classification accuracy and a feature evaluation index, but also statistically more significant than the related unsupervised methods. The C-codes of the GSOM and UFRFS are available online at http://avatharamg.webs.com/software-code.

  19. Impact of Sampling Density on the Extent of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2014-01-01

    Identifying and monitoring HIV clusters could be useful in tracking the leading edge of HIV transmission in epidemics. Currently, greater specificity in the definition of HIV clusters is needed to reduce confusion in the interpretation of HIV clustering results. We address sampling density as one of the key aspects of HIV cluster analysis. The proportion of viral sequences in clusters was estimated at sampling densities from 1.0% to 70%. A set of 1,248 HIV-1C env gp120 V1C5 sequences from a single community in Botswana was utilized in simulation studies. Matching numbers of HIV-1C V1C5 sequences from the LANL HIV Database were used as comparators. HIV clusters were identified by phylogenetic inference under bootstrapped maximum likelihood and pairwise distance cut-offs. Sampling density below 10% was associated with stochastic HIV clustering with broad confidence intervals. HIV clustering increased linearly at sampling density >10%, and was accompanied by narrowing confidence intervals. Patterns of HIV clustering were similar at bootstrap thresholds 0.7 to 1.0, but the extent of HIV clustering decreased with higher bootstrap thresholds. The origin of sampling (local concentrated vs. scattered global) had a substantial impact on HIV clustering at sampling densities ≥10%. Pairwise distances at 10% were estimated as a threshold for cluster analysis of HIV-1 V1C5 sequences. The node bootstrap support distribution provided additional evidence for 10% sampling density as the threshold for HIV cluster analysis. The detectability of HIV clusters is substantially affected by sampling density. A minimal genotyping density of 10% and sampling density of 50–70% are suggested for HIV-1 V1C5 cluster analysis. PMID:25275430

  20. Disordered eating in a Swedish community sample of adolescent girls: subgroups, stability, and associations with body esteem, deliberate self-harm and other difficulties.

    PubMed

    Viborg, Njördur; Wångby-Lundh, Margit; Lundh, Lars-Gunnar; Wallin, Ulf; Johnsson, Per

    2018-01-01

    The developmental study of subtypes of disordered eating (DE) during adolescence may be relevant to understanding the development of eating disorders. The purpose of the present study was to identify subgroups with different profiles of DE in a community sample of adolescent girls aged 13-15 years, and to study the stability of these profiles and subgroups over a one-year interval in order to find patterns that may need to be addressed in further research and prevention. Cluster analysis according to the LICUR procedure was performed on five aspects of DE, and the structural and individual stability of these clusters was analysed. The clusters were compared with regard to BMI, body esteem, deliberate self-harm, and other kinds of psychological difficulties. The analysis revealed six clusters (Multiple eating problems including purging, Multiple eating problems without purging, Social eating problems, Weight concerns, Fear of not being able to stop eating, and No eating problems), all of which had structurally stable profiles and five of which showed stability at the individual level. The more pronounced DE clusters (Multiple eating problems including/without purging) were consistently associated with higher levels of psychological difficulties and lower levels of body esteem. Furthermore, girls who reported purging also reported engaging in self-harm to a larger extent. Subgroups of 13-15-year-old girls show stable patterns of disordered eating that are associated with higher rates of psychological impairment and lower body esteem. The subgroup of girls who engage in purging also engage in more deliberate self-harm.

  1. Observed intra-cluster correlation coefficients in a cluster survey sample of patient encounters in general practice in Australia

    PubMed Central

    Knox, Stephanie A; Chondros, Patty

    2004-01-01

    Background Cluster sample study designs are cost effective, however cluster samples violate the simple random sample assumption of independence of observations. Failure to account for the intra-cluster correlation of observations when sampling through clusters may lead to an under-powered study. Researchers therefore need estimates of intra-cluster correlation for a range of outcomes to calculate sample size. We report intra-cluster correlation coefficients observed within a large-scale cross-sectional study of general practice in Australia, where the general practitioner (GP) was the primary sampling unit and the patient encounter was the unit of inference. Methods Each year the Bettering the Evaluation and Care of Health (BEACH) study recruits a random sample of approximately 1,000 GPs across Australia. Each GP completes details of 100 consecutive patient encounters. Intra-cluster correlation coefficients were estimated for patient demographics, morbidity managed and treatments received. Intra-cluster correlation coefficients were estimated for descriptive outcomes and for associations between outcomes and predictors and were compared across two independent samples of GPs drawn three years apart. Results Between April 1999 and March 2000, a random sample of 1,047 Australian general practitioners recorded details of 104,700 patient encounters. Intra-cluster correlation coefficients for patient demographics ranged from 0.055 for patient sex to 0.451 for language spoken at home. Intra-cluster correlations for morbidity variables ranged from 0.005 for the management of eye problems to 0.059 for management of psychological problems. Intra-cluster correlation for the association between two variables was smaller than the descriptive intra-cluster correlation of each variable. When compared with the April 2002 to March 2003 sample (1,008 GPs) the estimated intra-cluster correlation coefficients were found to be consistent across samples. Conclusions The demonstrated precision and reliability of the estimated intra-cluster correlations indicate that these coefficients will be useful for calculating sample sizes in future general practice surveys that use the GP as the primary sampling unit. PMID:15613248
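
    A minimal sketch of estimating an intra-cluster correlation coefficient from one-way ANOVA variance components, with GPs as clusters and patient encounters as units; this is the textbook ANOVA estimator, which may differ in detail from the estimator used in the BEACH analysis.

    ```python
    import numpy as np

    def icc_anova(values, cluster_ids):
        """One-way ANOVA estimator of the intra-cluster correlation:
        ICC = (MSB - MSW) / (MSB + (m0 - 1) * MSW), where m0 is the
        adjusted average cluster size for unbalanced data."""
        values, cluster_ids = np.asarray(values, float), np.asarray(cluster_ids)
        clusters = np.unique(cluster_ids)
        k, n = clusters.size, values.size
        sizes = np.array([(cluster_ids == c).sum() for c in clusters])
        means = np.array([values[cluster_ids == c].mean() for c in clusters])
        grand = values.mean()
        ssb = (sizes * (means - grand) ** 2).sum()
        ssw = sum(((values[cluster_ids == c] - m) ** 2).sum() for c, m in zip(clusters, means))
        msb, msw = ssb / (k - 1), ssw / (n - k)
        m0 = (n - (sizes ** 2).sum() / n) / (k - 1)
        return (msb - msw) / (msb + (m0 - 1) * msw)

    rng = np.random.default_rng(0)
    g = np.repeat(np.arange(50), 100)                            # 50 GPs x 100 encounters each
    y = rng.normal(size=50)[g] * 0.3 + rng.normal(size=g.size)   # true ICC ~ 0.08
    print(round(icc_anova(y, g), 3))
    ```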

  2. Multi-Sample Cluster Analysis Using Akaike’s Information Criterion.

    DTIC Science & Technology

    1982-12-20

    Intervals. For more details on these test procedures refer to Gabriel [7], Krishnaiah ([10], [11]), Srivastava [16], and others. As noted in Consul...723. [4] Consul, P. C. (1969), "The Exact Distributions of Likelihood Criteria for Different Hypotheses," in P. R. Krishnaiah (Ed.), Multivariate...1178. [7] Gabriel, K. R. (1969), "A Comparison of Some Methods of Simultaneous Inference in MANOVA," in P. R. Krishnaiah (Ed.), Multivariate Analysis-II

  3. Design-based and model-based inference in surveys of freshwater mollusks

    USGS Publications Warehouse

    Dorazio, R.M.

    1999-01-01

    Well-known concepts in statistical inference and sampling theory are used to develop recommendations for planning and analyzing the results of quantitative surveys of freshwater mollusks. Two methods of inference commonly used in survey sampling (design-based and model-based) are described and illustrated using examples relevant in surveys of freshwater mollusks. The particular objectives of a survey and the type of information observed in each unit of sampling can be used to help select the sampling design and the method of inference. For example, the mean density of a sparsely distributed population of mollusks can be estimated with higher precision by using model-based inference or by using design-based inference with adaptive cluster sampling than by using design-based inference with conventional sampling. More experience with quantitative surveys of natural assemblages of freshwater mollusks is needed to determine the actual benefits of different sampling designs and inferential procedures.
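
    A minimal sketch may make the adaptive cluster sampling comparison concrete. The grid, counts, and sample sizes below are invented and the code is not the survey analysis described above; it simply applies the modified Hansen-Hurwitz estimator, in which each initially sampled unit is replaced by the mean count of the network it belongs to.

      # Self-contained sketch of adaptive cluster sampling on a gridded population
      # of mollusk counts (all numbers invented).
      from collections import deque
      import numpy as np

      rng = np.random.default_rng(1)

      side = 20
      pop = np.zeros((side, side), dtype=int)
      for _ in range(4):                                   # a few dense patches in a sparse grid
          r, c = rng.integers(0, side, size=2)
          block = pop[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
          pop[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2] = rng.poisson(8, size=block.shape)

      def network(pop, start):
          """Units connected to `start` through neighbours with non-zero counts."""
          if pop[start] == 0:
              return {start}                               # a zero unit is its own network
          seen, queue = {start}, deque([start])
          while queue:
              r, c = queue.popleft()
              for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                  nb = (r + dr, c + dc)
                  if 0 <= nb[0] < side and 0 <= nb[1] < side and nb not in seen and pop[nb] > 0:
                      seen.add(nb)
                      queue.append(nb)
          return seen

      def acs_estimate(n_initial):
          """Modified Hansen-Hurwitz estimate of the mean count per unit."""
          units = [(r, c) for r in range(side) for c in range(side)]
          initial = [units[i] for i in rng.choice(len(units), n_initial, replace=False)]
          w = []
          for u in initial:
              rows, cols = zip(*network(pop, u))
              w.append(pop[list(rows), list(cols)].mean())
          return float(np.mean(w))

      estimates = [acs_estimate(30) for _ in range(200)]
      print(f"true mean {pop.mean():.3f}  mean of ACS estimates {np.mean(estimates):.3f}")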

  4. THE PREPARATION OF CURRICULUM MATERIALS AND THE DEVELOPMENT OF TEACHERS FOR AN EXPERIMENTAL APPLICATION OF THE CLUSTER CONCEPT OF VOCATIONAL EDUCATION AT THE SECONDARY SCHOOL LEVEL. VOLUME II, INSTRUCTIONAL PLANS FOR THE CONSTRUCTION CLUSTER.

    ERIC Educational Resources Information Center

    MALEY, DONALD

    DESIGNED FOR USE WITH 11TH AND 12TH GRADE STUDENTS, THIS CURRICULUM GUIDE FOR THE OCCUPATIONAL CLUSTER IN CONSTRUCTION WAS DEVELOPED BY PARTICIPATING TEACHERS FROM RESULTS OF THE RESEARCH PROCEDURES DESCRIBED IN VOLUME I (VT 004 162). THE COURSE DESCRIPTION, NEED FOR THE COURSE, COURSE OBJECTIVES, PROCEDURE, AND INSTRUCTIONAL PLAN ARE DISCUSSED…

  5. Hundreds of new cluster candidates in the VISTA Variables in the Vía Láctea survey DR1

    NASA Astrophysics Data System (ADS)

    Barbá, R. H.; Roman-Lopes, A.; Nilo Castellón, J. L.; Firpo, V.; Minniti, D.; Lucas, P.; Emerson, J. P.; Hempel, M.; Soto, M.; Saito, R. K.

    2015-09-01

    Context. VISTA Variables in the Vía Láctea (VVV) is an ESO Public survey dedicated to scanning the bulge and an adjacent portion of the Galactic disk in the fourth quadrant using the VISTA telescope and its near-infrared camera VIRCAM. One of the leading goals of the VVV survey is to contribute to knowledge of the star cluster population of the Milky Way. Aims: To improve the census of Galactic star clusters, we performed a systematic and careful scan of the JHKs images of the Galactic plane section of the VVV survey. Methods: Our detection procedure is based on a combination of stellar density maps and visual inspection of promising features in the J-, H-, and Ks-band images. The material examined consists of VVV JHKs color-composite images corresponding to Data Release 1 of VVV. Results: We report the discovery of 493 new infrared star cluster candidates. The analysis of the spatial distribution shows that the clusters are very concentrated in the Galactic plane, presenting some local maxima around the positions of large star-forming complexes, such as G305, RCW 95, and RCW 106. The vast majority of the new star cluster candidates are quite compact and generally surrounded by bright and/or dark nebulosities. IRAS point sources are associated with 59% of the sample, while 88% are associated with MSX point sources. GLIMPSE 8 μm images of the cluster candidates show a variety of morphologies, with 292 clusters dominated by knotty sources, while 361 clusters show some kind of nebulosity in this wavelength regime. Spatial cross-correlation with young stellar objects, masers, and extended green-object catalogs suggests that a large fraction of the new cluster candidates are extremely young. In particular, 104 star clusters associated with methanol masers are excellent candidates for ongoing massive star formation. Also, there is a special set of sixteen cluster candidates that present clear signposts of star-forming activity, being simultaneously associated with dark nebulae, young stellar objects, extended green objects, and masers. Full Tables 1-3 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/581/A120

  6. Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications

    PubMed Central

    Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric

    2016-01-01

    Regression clustering is a hybrid of unsupervised and supervised statistical learning and data mining methods that is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model-selection-based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with an analysis of a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939
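
    As a rough illustration of the iterative partition-and-regression idea (not the authors' algorithm or code, which also covers robust fitting and selection of the number of clusters), the sketch below alternates between assigning points to the regression line that fits them best and refitting the lines, on synthetic data with k fixed at 2.

      # Toy regression clustering by iterative partition and least-squares regression.
      import numpy as np

      rng = np.random.default_rng(0)

      n = 200
      x = rng.uniform(0, 10, n)
      group = rng.integers(0, 2, n)
      y = np.where(group == 0, 1.0 + 2.0 * x, 8.0 - 1.5 * x) + rng.normal(0, 1.0, n)
      X = np.column_stack([np.ones(n), x])

      def regression_cluster(X, y, k, n_iter=100):
          labels = rng.integers(0, k, len(y))
          for _ in range(n_iter):
              betas = []
              for j in range(k):
                  idx = labels == j
                  if idx.sum() < X.shape[1]:               # guard against (near-)empty clusters
                      idx = rng.random(len(y)) < 1.0 / k
                  betas.append(np.linalg.lstsq(X[idx], y[idx], rcond=None)[0])
              betas = np.array(betas)
              new_labels = np.abs(y[:, None] - X @ betas.T).argmin(axis=1)  # closest hyperplane
              if np.array_equal(new_labels, labels):
                  break
              labels = new_labels
          return labels, betas

      labels, betas = regression_cluster(X, y, k=2)
      print(betas)   # two rows of (intercept, slope), close to (1, 2) and (8, -1.5)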

  7. The Next Generation Virgo Cluster Survey (NGVS). XVIII. Measurement and Calibration of Surface Brightness Fluctuation Distances for Bright Galaxies in Virgo (and Beyond)

    NASA Astrophysics Data System (ADS)

    Cantiello, Michele; Blakeslee, John P.; Ferrarese, Laura; Côté, Patrick; Roediger, Joel C.; Raimondo, Gabriella; Peng, Eric W.; Gwyn, Stephen; Durrell, Patrick R.; Cuillandre, Jean-Charles

    2018-04-01

    We describe a program to measure surface brightness fluctuation (SBF) distances to galaxies observed in the Next Generation Virgo Cluster Survey (NGVS), a photometric imaging survey covering 104 deg^2 of the Virgo cluster in the u*, g, i, and z bandpasses with the Canada–France–Hawaii Telescope. We describe the selection of the sample galaxies, the procedures for measuring the apparent i-band SBF magnitude m̄_i, and the calibration of the absolute magnitude M̄_i as a function of observed stellar population properties. The multiband NGVS data set provides multiple options for calibrating the SBF distances, and we explore various calibrations involving individual color indices as well as combinations of two different colors. Within the color range of the present sample, the two-color calibrations do not significantly improve the scatter with respect to wide-baseline, single-color calibrations involving u*. We adopt the (u* - z) calibration as a reference for the present galaxy sample, with an observed scatter of 0.11 mag. For a few cases that lack good u* photometry, we use an alternative relation based on a combination of (g-i) and (g-z) colors, with only a slightly larger observed scatter of 0.12 mag. The agreement of our measurements with the best existing distance estimates provides confidence that our measurements are accurate. We present a preliminary catalog of distances for 89 galaxies brighter than B_T ≈ 13.0 mag within the survey footprint, including members of the background M and W Clouds at roughly twice the distance of the main body of the Virgo cluster. The extension of the present work to fainter and bluer galaxies is in progress.

  8. Personalized identification of differentially expressed pathways in pediatric sepsis.

    PubMed

    Li, Binjie; Zeng, Qiyi

    2017-10-01

    Sepsis is a leading killer of children worldwide, with numerous differentially expressed genes reported to be associated with sepsis. Identifying core pathways in an individual is important for understanding septic mechanisms and for the future application of custom therapeutic decisions. Samples used in the study were from a control group (n=18) and a pediatric sepsis group (n=52). Based on Kauffman's attractor theory, differentially expressed pathways associated with pediatric sepsis were detected as attractors. When the distribution of the attractors was consistent with the distribution of the total data, as assessed using a support vector machine, the individualized pathway aberrance score (iPAS) was calculated to distinguish differences. Through attractor and Kyoto Encyclopedia of Genes and Genomes functional analysis, 277 enriched pathways were identified as attractors. There were 81 pathways with P<0.05 and 59 pathways with P<0.01. Distribution outcomes of the screened attractors were mostly consistent with the total data, as demonstrated by the six classifying parameters, which suggested the efficiency of the attractors. Cluster analysis of pediatric sepsis using the iPAS method identified seven pathway clusters and four sample clusters. Thus, in the majority of pediatric sepsis samples, core pathways can be detected as differing from the accumulated normal samples. In conclusion, a novel procedure that identifies the dysregulated attractors in individuals with pediatric sepsis was constructed. Attractors can be markers to identify pathways involved in pediatric sepsis. iPAS may provide a correlation score for each of the signaling pathways present in an individual patient. This process may improve the personalized interpretation of disease mechanisms and may be useful in the forthcoming era of personalized medicine.

  9. Effect of DNA extraction and sample preservation method on rumen bacterial population.

    PubMed

    Fliegerova, Katerina; Tapio, Ilma; Bonin, Aurelie; Mrazek, Jakub; Callegari, Maria Luisa; Bani, Paolo; Bayat, Alireza; Vilkki, Johanna; Kopečný, Jan; Shingfield, Kevin J; Boyer, Frederic; Coissac, Eric; Taberlet, Pierre; Wallace, R John

    2014-10-01

    The comparison of the bacterial profile of intracellular (iDNA) and extracellular DNA (eDNA) isolated from cow rumen content stored under different conditions was conducted. The influence of rumen fluid treatment (cheesecloth squeezed, centrifuged, filtered), storage temperature (RT, -80 °C) and cryoprotectants (PBS-glycerol, ethanol) on quality and quantity parameters of extracted DNA was evaluated by bacterial DGGE analysis, real-time PCR quantification and metabarcoding approach using high-throughput sequencing. Samples clustered according to the type of extracted DNA due to considerable differences between iDNA and eDNA bacterial profiles, while storage temperature and cryoprotectants additives had little effect on sample clustering. The numbers of Firmicutes and Bacteroidetes were lower (P < 0.01) in eDNA samples. The qPCR indicated significantly higher amount of Firmicutes in iDNA sample frozen with glycerol (P < 0.01). Deep sequencing analysis of iDNA samples revealed the prevalence of Bacteroidetes and similarity of samples frozen with and without cryoprotectants, which differed from sample stored with ethanol at room temperature. Centrifugation and consequent filtration of rumen fluid subjected to the eDNA isolation procedure considerably changed the ratio of molecular operational taxonomic units (MOTUs) of Bacteroidetes and Firmicutes. Intracellular DNA extraction using bead-beating method from cheesecloth sieved rumen content mixed with PBS-glycerol and stored at -80 °C was found as the optimal method to study ruminal bacterial profile. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. THE PREPARATION OF CURRICULUM MATERIALS AND THE DEVELOPMENT OF TEACHERS FOR AN EXPERIMENTAL APPLICATION OF THE CLUSTER CONCEPT OF VOCATIONAL EDUCATION AT THE SECONDARY SCHOOL LEVEL. VOLUME III, INSTRUCTIONAL PLANS FOR THE METAL FORMING AND FABRICATION CLUSTER.

    ERIC Educational Resources Information Center

    MALEY, DONALD

    DESIGNED FOR USE WITH 11TH AND 12TH GRADE STUDENTS, THIS CURRICULUM GUIDE FOR THE OCCUPATIONAL CLUSTER IN METAL FORMING AND FABRICATION WAS DEVELOPED BY PARTICIPATING TEACHERS FROM RESULTS OF THE RESEARCH PROCEDURES DESCRIBED IN VOLUME I (VT 004 162). THE COURSE DESCRIPTION, NEED FOR THE COURSE, COURSE OBJECTIVES, PROCEDURES AND INSTRUCTIONAL PLAN…

  11. THE PREPARATION OF CURRICULUM MATERIALS AND THE DEVELOPMENT OF TEACHERS FOR AN EXPERIMENTAL APPLICATION OF THE CLUSTER CONCEPT OF VOCATIONAL EDUCATION AT THE SECONDARY SCHOOL LEVEL. VOLUME IV, INSTRUCTIONAL PLANS FOR THE ELECTRO-MECHANICAL CLUSTER.

    ERIC Educational Resources Information Center

    MALEY, DONALD

    DESIGNED FOR USE WITH 11TH AND 12TH GRADE STUDENTS, THIS CURRICULUM GUIDE FOR THE OCCUPATIONAL CLUSTER IN ELECTRO-MECHANICAL INSTALLATION AND REPAIR WAS DEVELOPED BY PARTICIPATING TEACHERS FROM RESULTS OF THE RESEARCH PROCEDURES DESCRIBED IN VOLUME I (VT 004 162). THE COURSE DESCRIPTIONS, NEED FOR THE COURSE, COURSE OBJECTIVES, PROCEDURES, AND…

  12. Confidence intervals for a difference between lognormal means in cluster randomization trials.

    PubMed

    Poirier, Julia; Zou, G Y; Koval, John

    2017-04-01

    Cluster randomization trials, in which intact social units are randomized to different interventions, have become popular in the last 25 years. Outcomes from these trials in many cases are positively skewed, following approximately lognormal distributions. When inference is focused on the difference between treatment arm arithmetic means, existing confidence interval procedures either make restrictive assumptions or are complex to implement. We approach this problem by assuming log-transformed outcomes from each treatment arm follow a one-way random effects model. The treatment arm means are functions of multiple parameters for which separate confidence intervals are readily available, suggesting that the method of variance estimates recovery may be applied to obtain closed-form confidence intervals. A simulation study showed that this simple approach performs well at small sample sizes in terms of empirical coverage, relatively balanced tail errors, and interval widths as compared to existing methods. The methods are illustrated using data arising from a cluster randomization trial investigating a critical pathway for the treatment of community-acquired pneumonia.
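
    The following sketch illustrates the method of variance estimates recovery (MOVER) in a deliberately simplified setting: each arm is treated as an iid lognormal sample, ignoring the within-cluster correlation that the paper's one-way random effects model accounts for. The data are simulated and the helper names are our own.

      # Simplified MOVER illustration: CI for the difference of two lognormal means.
      import numpy as np
      from scipy.stats import t, chi2

      def lognormal_mean_ci(x, alpha=0.05):
          """Point estimate and CI for the arithmetic mean of a lognormal sample."""
          y = np.log(x)
          n, mu, s2 = len(y), y.mean(), y.var(ddof=1)
          # separate CIs for mu and sigma^2/2
          half = t.ppf(1 - alpha / 2, n - 1) * np.sqrt(s2 / n)
          l_mu, u_mu = mu - half, mu + half
          l_v = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, n - 1) / 2
          u_v = (n - 1) * s2 / chi2.ppf(alpha / 2, n - 1) / 2
          # MOVER combination for eta = mu + sigma^2/2, then back-transform
          eta = mu + s2 / 2
          l_eta = eta - np.sqrt((mu - l_mu) ** 2 + (s2 / 2 - l_v) ** 2)
          u_eta = eta + np.sqrt((u_mu - mu) ** 2 + (u_v - s2 / 2) ** 2)
          return np.exp(eta), np.exp(l_eta), np.exp(u_eta)

      def mover_difference(x1, x2, alpha=0.05):
          m1, l1, u1 = lognormal_mean_ci(x1, alpha)
          m2, l2, u2 = lognormal_mean_ci(x2, alpha)
          L = m1 - m2 - np.sqrt((m1 - l1) ** 2 + (u2 - m2) ** 2)
          U = m1 - m2 + np.sqrt((u1 - m1) ** 2 + (m2 - l2) ** 2)
          return m1 - m2, L, U

      rng = np.random.default_rng(5)
      arm1 = rng.lognormal(mean=1.2, sigma=0.8, size=60)   # e.g. skewed outcome, arm 1
      arm2 = rng.lognormal(mean=1.0, sigma=0.8, size=60)   # arm 2
      print(mover_difference(arm1, arm2))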

  13. Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.

    PubMed

    You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary

    2011-02-01

    The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the number of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using the t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that the relative efficiency we define is greater than the relative efficiency in the literature under some conditions; the measure in the literature may therefore be smaller than ours under those conditions, underestimating the relative efficiency. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
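
    One concrete way to compute such a relative efficiency is sketched below under stated assumptions rather than taken from the paper: under a one-way random effects model a cluster of size m contributes m / (1 + (m - 1) * rho) effective observations, the squared noncentrality parameter of the two-sample t-test is proportional to the sum of these contributions, and the relative efficiency can be taken as the ratio of that sum for the observed sizes to the sum for equal sizes with the same number of clusters and mean size. The cluster sizes and ICC values are illustrative.

      # Illustrative relative-efficiency calculation for variable cluster sizes.
      import numpy as np

      def effective_n(sizes, rho):
          sizes = np.asarray(sizes, dtype=float)
          return np.sum(sizes / (1.0 + (sizes - 1.0) * rho))

      def relative_efficiency(sizes, rho):
          """Unequal vs equal cluster sizes, same number of clusters and mean size."""
          equal = np.full(len(sizes), np.mean(sizes))
          return effective_n(sizes, rho) / effective_n(equal, rho)

      def required_mean_size(sizes, rho):
          """Mean size needed (same clusters, same size pattern up to scale) to
          recover the effective n of the planned equal-size design."""
          target = effective_n(np.full(len(sizes), np.mean(sizes)), rho)
          scale = 1.0
          while effective_n(np.asarray(sizes) * scale, rho) < target:
              scale *= 1.001                               # crude search; fine for a sketch
          return np.mean(sizes) * scale

      sizes = [5, 5, 10, 20, 40, 40]                       # highly variable cluster sizes
      for rho in (0.01, 0.05, 0.20):
          print(rho, round(relative_efficiency(sizes, rho), 3),
                round(required_mean_size(sizes, rho), 1))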

  14. A comparison of regional flood frequency analysis approaches in a simulation framework

    NASA Astrophysics Data System (ADS)

    Ganora, D.; Laio, F.

    2016-07-01

    Regional frequency analysis (RFA) is a well-established methodology to provide an estimate of the flood frequency curve at ungauged (or scarcely gauged) sites. Different RFA approaches exist, depending on the way the information is transferred to the site of interest, but it is not clear in the literature if a specific method systematically outperforms the others. The aim of this study is to provide a framework for carrying out this intercomparison by building up a virtual environment based on synthetically generated data. The considered regional approaches include: (i) a unique regional curve for the whole region; (ii) a multiple-region model where homogeneous subregions are determined through cluster analysis; (iii) a Region-of-Influence model which defines a homogeneous subregion for each site; (iv) a spatially smooth estimation procedure where the parameters of the regional model vary continuously in space. Virtual environments are generated considering different patterns of heterogeneity, including step changes and smooth variations. If the region is heterogeneous, with the parent distribution changing continuously within the region, the spatially smooth regional approach outperforms the others, with overall errors 10-50% lower than the other methods. In the case of a step change, the spatially smooth and clustering procedures perform similarly if the heterogeneity is moderate, while clustering procedures work better when the step change is severe. To extend our findings, an extensive sensitivity analysis has been performed to investigate the effect of sample length, number of virtual stations, return period of the predicted quantile, variability of the scale parameter of the parent distribution, number of predictor variables, and different parent distributions. Overall, the spatially smooth approach appears to be the most robust approach, as its performance is more stable across different patterns of heterogeneity, especially when short records are considered.

  15. Adaptive clustering procedure for continuous gravitational wave searches

    NASA Astrophysics Data System (ADS)

    Singh, Avneet; Papa, Maria Alessandra; Eggenstein, Heinz-Bernd; Walsh, Sinéad

    2017-10-01

    In hierarchical searches for continuous gravitational waves, clustering of candidates is an important post-processing step because it reduces the number of noise candidates that are followed up at successive stages [J. Aasi et al., Phys. Rev. D 88, 102002 (2013), 10.1103/PhysRevD.88.102002; B. Behnke, M. A. Papa, and R. Prix, Phys. Rev. D 91, 064007 (2015), 10.1103/PhysRevD.91.064007; M. A. Papa et al., Phys. Rev. D 94, 122006 (2016), 10.1103/PhysRevD.94.122006]. Previous clustering procedures bundled together nearby candidates, ascribing them to the same root cause (be it a signal or a disturbance), based on a predefined cluster volume. In this paper, we present a procedure that adapts the cluster volume to the data itself and checks for consistency of such volume with what is expected from a signal. This significantly improves the noise rejection capabilities at fixed detection threshold and, at fixed computing resources for the follow-up stages, results in an overall more sensitive search. This new procedure was employed in the first Einstein@Home search on data from the first science run of the advanced LIGO detectors (O1) [LIGO Scientific Collaboration and Virgo Collaboration, arXiv:1707.02669 [Phys. Rev. D (to be published)]].

  16. Data Driven Performance Evaluation of Wireless Sensor Networks

    PubMed Central

    Frery, Alejandro C.; Ramos, Heitor S.; Alencar-Neto, José; Nakamura, Eduardo; Loureiro, Antonio A. F.

    2010-01-01

    Wireless Sensor Networks are presented as devices for signal sampling and reconstruction. Within this framework, the qualitative and quantitative influence of (i) signal granularity, (ii) spatial distribution of sensors, (iii) sensors clustering, and (iv) signal reconstruction procedure are assessed. This is done by defining an error metric and performing a Monte Carlo experiment. It is shown that all these factors have significant impact on the quality of the reconstructed signal. The extent of such impact is quantitatively assessed. PMID:22294920

  17. ICAP: An Interactive Cluster Analysis Procedure for analyzing remotely sensed data. [to classify the radiance data to produce a thematic map

    NASA Technical Reports Server (NTRS)

    Wharton, S. W.

    1980-01-01

    An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. The algorithm interfaces the rapid numerical processing capacity of a computer with the human ability to integrate qualitative information. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who evaluates the clusters and may elect to modify the cluster structure. Clusters can be deleted or lumped pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The ICAP was implemented in APL (A Programming Language), an interactive computer language. The flexibility of the algorithm was evaluated using data from different LANDSAT scenes to simulate two situations: one in which the analyst is assumed to have no prior knowledge about the data and wishes to have the clusters formed more or less automatically; and the other in which the analyst is assumed to have some knowledge about the data structure and wishes to use that information to closely supervise the clustering process. For comparison, an existing clustering method was also applied to the two data sets.

  18. Tests for informative cluster size using a novel balanced bootstrap scheme.

    PubMed

    Nevalainen, Jaakko; Oja, Hannu; Datta, Somnath

    2017-07-20

    Clustered data are often encountered in biomedical studies, and to date, a number of approaches have been proposed to analyze such data. However, the phenomenon of informative cluster size (ICS) is a challenging problem, and its presence has an impact on the choice of a correct analysis methodology. For example, Dutta and Datta (2015, Biometrics) presented a number of marginal distributions that could be tested. Depending on the nature and degree of informativeness of the cluster size, these marginal distributions may differ, as do the choices of the appropriate test. In particular, they applied their new test to a periodontal data set where the plausibility of the informativeness was mentioned, but no formal test for the same was conducted. We propose bootstrap tests for testing the presence of ICS. A balanced bootstrap method is developed to successfully estimate the null distribution by merging the re-sampled observations with closely matching counterparts. Relying on the assumption of exchangeability within clusters, the proposed procedure performs well in simulations even with a small number of clusters, at different distributions and against different alternative hypotheses, thus making it an omnibus test. We also explain how to extend the ICS test to a regression setting and thereby enhance its practical utility. The methodologies are illustrated using the periodontal data set mentioned earlier. Copyright © 2017 John Wiley & Sons, Ltd.

  19. Patterning ecological risk of pesticide contamination at the river basin scale.

    PubMed

    Faggiano, Leslie; de Zwart, Dick; García-Berthou, Emili; Lek, Sovan; Gevrey, Muriel

    2010-05-01

    Ecological risk assessment was conducted to determine the risk posed by pesticide mixtures to the Adour-Garonne river basin (south-western France). The objectives of this study were to assess the general state of this basin with regard to pesticide contamination using a risk assessment procedure and to detect patterns in toxic mixture assemblages through a self-organizing map (SOM) methodology in order to identify the locations at risk. Exposure assessment, risk assessment with species sensitivity distributions, and mixture toxicity rules were used to compute six relative risk predictors for different toxic modes of action: the multi-substance potentially affected fraction of species depending on the toxic mode of action of the compounds found in the mixture (msPAF CA(TMoA) values). Those predictors, computed for the 131 sampling sites assessed in this study, were then patterned through the SOM learning process. Four clusters of sampling sites exhibiting similar toxic assemblages were identified. In the first cluster, which comprised 83% of the sampling sites, the risk caused by pesticide mixtures toward aquatic species was weak (mean msPAF value for those sites < 0.0036%), while in another cluster the risk was significant (mean msPAF < 1.09%). GIS mapping allowed an interesting spatial pattern in the distribution of sampling sites for each cluster to be highlighted, with a significant and highly localized risk in the French department called "Lot-et-Garonne". The combined use of the SOM methodology, mixture toxicity modelling and a clear geo-referenced representation of results not only revealed the general state of the Adour-Garonne basin with regard to contamination by pesticides but also made it possible to analyze the spatial pattern of toxic mixture assemblages in order to prioritize the locations at risk and to detect the group of compounds causing the greatest risk at the basin scale. Copyright 2010 Elsevier B.V. All rights reserved.

  20. Profiling of new psychoactive substances (NPS) by using stable isotope ratio mass spectrometry (IRMS): study on the synthetic cannabinoid 5F-PB-22.

    PubMed

    Münster-Müller, S; Scheid, N; Holdermann, T; Schneiders, S; Pütz, M

    2018-05-21

    In this paper, results of a pilot study on the profiling of the synthetic cannabinoid receptor agonist 5F-PB-22 (5F-QUPIC, pentylfluoro-1H-indole-3-carboxylic acid-8-quinolinyl ester) via isotope ratio mass spectrometry are presented. The study focuses on δ13C, δ15N and δ2H isotope ratios, which are determined using an elemental analyser (EA) and a high-temperature elemental analyser (TC/EA) coupled to an isotope ratio mass spectrometer (IRMS). By means of a sample of pure 5F-PB-22 material, it is shown that the extraction of 5F-PB-22 from herbal material, a rapid clean-up procedure, or preparative column chromatography had no influence on the isotope ratios. Furthermore, 5F-PB-22 was extracted from fourteen different herbal blend samples ("Spice products" from police seizures) and analysed via IRMS, yielding three clusters, distinguishable through their isotopic composition, containing seven, five and two samples, respectively. It is assumed that the herbal blends in each cluster have been manufactured from individual batches of 5F-PB-22. This article is protected by copyright. All rights reserved.

  1. Microbiota in Exhaled Breath Condensate and the Lung.

    PubMed

    Glendinning, Laura; Wright, Steven; Tennant, Peter; Gill, Andrew C; Collie, David; McLachlan, Gerry

    2017-06-15

    The lung microbiota is commonly sampled using relatively invasive bronchoscopic procedures. Exhaled breath condensate (EBC) collection potentially offers a less invasive alternative for lung microbiota sampling. We compared lung microbiota samples retrieved by protected specimen brushings (PSB) and exhaled breath condensate collection. We also sought to assess whether aerosolized antibiotic treatment would influence the lung microbiota and whether this change could be detected in EBC. EBC was collected from 6 conscious sheep and then from the same anesthetized sheep during mechanical ventilation. Following the latter EBC collection, PSB samples were collected from separate sites within each sheep lung. On the subsequent day, each sheep was then treated with nebulized colistimethate sodium. Two days after nebulization, EBC and PSB samples were again collected. Bacterial DNA was quantified using 16S rRNA gene quantitative PCR. The V2-V3 region of the 16S rRNA gene was amplified by PCR and sequenced using Illumina MiSeq. Quality control and operational taxonomic unit (OTU) clustering were performed with mothur. The EBC samples contained significantly less bacterial DNA than the PSB samples. The EBC samples from anesthetized animals clustered separately by their bacterial community compositions in comparison to the PSB samples, and 37 bacterial OTUs were identified as differentially abundant between the two sample types. Despite only low concentrations of colistin being detected in bronchoalveolar lavage fluid, PSB samples were found to differ by their bacterial compositions before and after colistimethate sodium treatment. Our findings indicate that microbiota in EBC samples and PSB samples are not equivalent. IMPORTANCE Sampling of the lung microbiota usually necessitates performing bronchoscopic procedures that involve a hospital visit for human participants and the use of trained staff. The inconvenience and perceived discomfort of participating in this kind of research may deter healthy volunteers and may not be a safe option for patients with advanced lung disease. This study set out to evaluate a less invasive method for collecting lung microbiota samples by comparing samples taken via protected specimen brushings (PSB) to those taken via exhaled breath condensate (EBC) collection. We found that there was less bacterial DNA in EBC samples compared with that in PSB samples and that there were differences between the bacterial communities in the two sample types. We conclude that while EBC and PSB samples do not produce equivalent microbiota samples, the study of the EBC microbiota may still be of interest. Copyright © 2017 Glendinning et al.

  2. Microstructure-based modelling of arbitrary deformation histories of filler-reinforced elastomers

    NASA Astrophysics Data System (ADS)

    Lorenz, H.; Klüppel, M.

    2012-11-01

    A physically motivated theory of rubber reinforcement based on filler cluster mechanics is presented, considering the mechanical behaviour of quasi-statically loaded elastomeric materials subjected to arbitrary deformation histories. This represents an extension of a previously introduced model describing filler-induced stress softening and hysteresis of highly strained elastomers. These effects are attributed to the hydrodynamic reinforcement of rubber elasticity due to strain amplification by stiff filler clusters and to the cyclic breakdown and re-aggregation (healing) of softer, already damaged filler clusters. The theory is first developed for the special case of outer stress-strain cycles with successively increasing maximum strain. In this simpler case, all soft clusters are broken at the turning points of the cycle and the mechanical energy stored in the strained clusters is completely dissipated, i.e. only irreversible stress contributions result. Nevertheless, the description of outer cycles already involves all material parameters of the theory, and hence they can be used for a fitting procedure. In the general case of an arbitrary deformation history, the cluster mechanics of the material is complicated by the fact that not all soft clusters are broken at the turning points of a cycle. For that reason, additional reversible stress contributions accounting for the relaxation of clusters upon retraction have to be taken into account in the description of inner cycles. A special recursive algorithm is developed constituting a framework for the mechanical response of encapsulated inner cycles. Simulation and measurement are found to be in fair agreement for carbon black (CB)- and silica-filled SBR/BR and EPDM samples, loaded in compression and tension along various deformation histories.

  3. Occurrence of Radio Minihalos in a Mass-Limited Sample of Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Giacintucci, Simona; Markevitch, Maxim; Cassano, Rossella; Venturi, Tiziana; Clarke, Tracy E.; Brunetti, Gianfranco

    2017-01-01

    We investigate the occurrence of radio minihalos (diffuse radio sources of unknown origin observed in the cores of some galaxy clusters) in a statistical sample of 58 clusters drawn from the Planck Sunyaev-Zeldovich cluster catalog using a mass cut (M_500 > 6 × 10^14 solar masses). We supplement our statistical sample with a similarly sized nonstatistical sample mostly consisting of clusters in the ACCEPT X-ray catalog with suitable X-ray and radio data, which includes lower-mass clusters. Where necessary (for nine clusters), we reanalyzed the Very Large Array archival radio data to determine whether a minihalo is present. Our total sample includes all 28 currently known and recently discovered radio minihalos, including six candidates. We classify clusters as cool-core or non-cool-core according to the value of the specific entropy floor in the cluster center, rederived or newly derived from the Chandra X-ray density and temperature profiles where necessary (for 27 clusters). Contrary to the common wisdom that minihalos are rare, we find that almost all cool cores, at least 12 out of 15 (80%), in our complete sample of massive clusters exhibit minihalos. The supplementary sample shows that the occurrence of minihalos may be lower in lower-mass cool-core clusters. No minihalos are found in non-cool cores or "warm cores." These findings will help test theories of the origin of minihalos and provide information on the physical processes and energetics of the cluster cores.

  4. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design

    PubMed Central

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R.

    2018-01-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228–256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody–antigen complexes, using two design strategies—optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody–antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters. PMID:29702641

  5. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design.

    PubMed

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Weitzner, Brian D; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R; Dunbrack, Roland L

    2018-04-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228-256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody-antigen complexes, using two design strategies-optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody-antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters.
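
    Because the design risk ratio and antigen risk ratio described above are simple frequency ratios, a tiny numerical illustration may help; the functions and cluster labels below are hypothetical and are not part of RosettaAntibodyDesign.

      # Illustrative calculation of the two metrics; labels and counts are made up.
      def frequency(items, native):
          items = list(items)
          return items.count(native) / len(items)

      def design_risk_ratio(recovered, sampled, native):
          """Recovery frequency of the native feature divided by its sampling frequency."""
          return frequency(recovered, native) / frequency(sampled, native)

      def antigen_risk_ratio(with_antigen, without_antigen, native):
          """Native-feature frequency with the antigen present divided by without it."""
          return frequency(with_antigen, native) / frequency(without_antigen, native)

      # Hypothetical CDR cluster labels across sampled states and final designs:
      sampled   = ["L1-11-1"] * 10 + ["L1-11-2"] * 30 + ["L1-12-1"] * 60
      recovered = ["L1-11-1"] * 35 + ["L1-11-2"] * 25 + ["L1-12-1"] * 40
      print(design_risk_ratio(recovered, sampled, "L1-11-1"))      # 3.5: native enriched

      with_ag    = ["L1-11-1"] * 50 + ["L1-12-1"] * 50
      without_ag = ["L1-11-1"] * 30 + ["L1-12-1"] * 70
      print(antigen_risk_ratio(with_ag, without_ag, "L1-11-1"))    # ~1.7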

  6. Sampling designs for HIV molecular epidemiology with application to Honduras.

    PubMed

    Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I

    2005-11-01

    Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.
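
    A minimal sketch of such a two-stage stratified cluster sampling scheme is given below; the strata, cluster frames, and sample sizes are invented for illustration and do not reflect the Honduras study design.

      # Toy stratified cluster sampling: random clusters within strata, then random
      # individuals within the selected clusters. All frames here are synthetic.
      import random

      random.seed(42)

      # stratum -> {cluster name: enumerated individuals}
      frame = {
          "north": {f"site_N{i}": [f"N{i}_{j}" for j in range(random.randint(20, 80))] for i in range(12)},
          "south": {f"site_S{i}": [f"S{i}_{j}" for j in range(random.randint(20, 80))] for i in range(9)},
      }

      clusters_per_stratum = 4
      individuals_per_cluster = 10

      sample = []
      for stratum, clusters in frame.items():
          chosen = random.sample(list(clusters), k=clusters_per_stratum)
          for c in chosen:
              k = min(individuals_per_cluster, len(clusters[c]))
              sample.extend((stratum, c, person) for person in random.sample(clusters[c], k=k))

      print(len(sample), sample[:3])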

  7. The ROSAT Brightest Cluster Sample - I. The compilation of the sample and the cluster log N-log S distribution

    NASA Astrophysics Data System (ADS)

    Ebeling, H.; Edge, A. C.; Bohringer, H.; Allen, S. W.; Crawford, C. S.; Fabian, A. C.; Voges, W.; Huchra, J. P.

    1998-12-01

    We present a 90 per cent flux-complete sample of the 201 X-ray-brightest clusters of galaxies in the northern hemisphere (δ ≥ 0°), at high Galactic latitudes (|b| ≥ 20°), with measured redshifts z ≤ 0.3 and fluxes higher than 4.4 × 10^-12 erg cm^-2 s^-1 in the 0.1-2.4 keV band. The sample, called the ROSAT Brightest Cluster Sample (BCS), is selected from ROSAT All-Sky Survey data and is the largest X-ray-selected cluster sample compiled to date. In addition to Abell clusters, which form the bulk of the sample, the BCS also contains the X-ray-brightest Zwicky clusters and other clusters selected from their X-ray properties alone. Effort has been made to ensure the highest possible completeness of the sample and the smallest possible contamination by non-cluster X-ray sources. X-ray fluxes are computed using an algorithm tailored for the detection and characterization of X-ray emission from galaxy clusters. These fluxes are accurate to better than 15 per cent (mean 1σ error). We find the cumulative logN-logS distribution of clusters to follow a power law κ S^α with α = -1.31 (+0.06, -0.03; errors are the 10th and 90th percentiles) down to fluxes of 2 × 10^-12 erg cm^-2 s^-1, i.e. considerably below the BCS flux limit. Although our best-fitting slope disagrees formally with the canonical value of -1.5 for a Euclidean distribution, the BCS logN-logS distribution is consistent with a non-evolving cluster population if cosmological effects are taken into account. Our sample will allow us to examine large-scale structure in the northern hemisphere, determine the spatial cluster-cluster correlation function, investigate correlations between the X-ray and optical properties of the clusters, establish the X-ray luminosity function for galaxy clusters, and discuss the implications of the results for cluster evolution.
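
    The slope of a cumulative logN-logS relation of this kind is often estimated by maximum likelihood rather than by fitting binned counts. The sketch below is generic (not the BCS pipeline): it draws fluxes above a limit from a power law and recovers the slope with the standard estimator a_hat = n / sum(ln(S_i / S_min)); the flux limit and slope are taken from the abstract only as illustrative inputs.

      # Generic power-law slope recovery for a cumulative distribution N(>S) ~ S**(-a).
      import numpy as np

      rng = np.random.default_rng(2)

      a_true, s_min, n = 1.31, 4.4e-12, 201                # illustrative values
      u = rng.random(n)
      flux = s_min * u ** (-1.0 / a_true)                  # inverse-CDF sampling of a Pareto

      a_hat = n / np.sum(np.log(flux / s_min))             # maximum-likelihood slope
      a_err = a_hat / np.sqrt(n)                           # approximate 1-sigma uncertainty
      print(f"recovered slope {a_hat:.2f} +/- {a_err:.2f}")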

  8. The cosmological analysis of X-ray cluster surveys. III. 4D X-ray observable diagrams

    NASA Astrophysics Data System (ADS)

    Pierre, M.; Valotti, A.; Faccioli, L.; Clerc, N.; Gastaud, R.; Koulouridis, E.; Pacaud, F.

    2017-11-01

    Context. Despite compelling theoretical arguments, the use of clusters as cosmological probes is, in practice, frequently questioned because of the many uncertainties surrounding cluster-mass estimates. Aims: Our aim is to develop a fully self-consistent cosmological approach to X-ray cluster surveys, exclusively based on observable quantities rather than masses. This procedure is justified by the possibility of directly deriving the cluster properties via ab initio modelling, either analytically or by using hydrodynamical simulations. In this third paper, we evaluate the method on cluster toy-catalogues. Methods: We model the population of detected clusters in the count-rate - hardness-ratio - angular size - redshift space and compare the corresponding four-dimensional diagram with theoretical predictions. The best cosmology+physics parameter configuration is determined using a simple minimisation procedure; errors on the parameters are estimated by averaging the results from ten independent survey realisations. The method allows a simultaneous fit of the cosmological parameters, of the cluster evolutionary physics, and of the selection effects. Results: When using information from the X-ray survey alone plus redshifts, this approach is shown to be as accurate as the modelling of the mass function for the cosmological parameters and to perform better for the cluster physics, for a similar level of assumptions on the scaling relations. It enables the identification of degenerate combinations of parameter values. Conclusions: Given the considerably shorter computing times involved in running the minimisation procedure in the observed parameter space, this method appears to clearly outperform traditional mass-based approaches when X-ray survey data alone are available.

  9. Model for spectral and chromatographic data

    DOEpatents

    Jarman, Kristin [Richland, WA]; Willse, Alan [Richland, WA]; Wahl, Karen [Richland, WA]; Wahl, Jon [Richland, WA]

    2002-11-26

    A method and apparatus using a spectral analysis technique are disclosed. In one form of the invention, probabilities are selected to characterize the presence (and in another form, also a quantification of a characteristic) of peaks in an indexed data set for samples that match a reference species, and other probabilities are selected for samples that do not match the reference species. An indexed data set is acquired for a sample, and a determination is made according to techniques exemplified herein as to whether the sample matches or does not match the reference species. When quantification of peak characteristics is undertaken, the model is appropriately expanded, and the analysis accounts for the characteristic model and data. Further techniques are provided to apply the methods and apparatuses to process control, cluster analysis, hypothesis testing, analysis of variance, and other procedures involving multiple comparisons of indexed data.

  10. The structure of deposited metal clusters generated by laser evaporation

    NASA Astrophysics Data System (ADS)

    Faust, P.; Brandstättner, M.; Ding, A.

    1991-09-01

    Metal clusters have been produced using a laser evaporation source. A Nd-YAG laser beam focused onto a solid silver rod was used to evaporate the material, which was then cooled to form clusters with the help of a pulsed high-pressure He beam. TOF mass spectra of these clusters reveal a strong occurrence of small and medium-sized clusters (n < 100). Clusters were also deposited onto grid-supported thin carbon films, which were investigated by transmission electron microscopy. Very high resolution pictures of these grids were used to analyze the size distribution and the structure of the deposited clusters. The diffraction pattern caused by the crystalline structure of the clusters reveals 3- and 5-fold symmetries as well as fcc bulk structure. This can be explained in terms of icosahedron- and cuboctahedron-type clusters deposited on the surface of the carbon layer. There is strong evidence that part of these cluster geometries had already been formed before the deposition process. The non-linear dependence of the cluster size and the cluster density on the generating conditions is discussed. The samples were observed by HREM in the stable DEEKO 100 microscope of the Fritz-Haber-Institut, operating at 100 kV with a spherical aberration of c_S = 0.5 mm. The quality of the pictures was improved by using the conditions of minimum phase contrast hollow-cone illumination. This procedure led to a minimum of phase-contrast artefacts. Among the well-crystallized particles were a large number with five- and three-fold symmetries, icosahedra and cuboctahedra respectively. The largest clusters with five- and three-fold symmetries have been found with diameters of 7 nm; the smallest particles displaying the same undistorted symmetries were about 2 nm. Even smaller ones with strong distortions could be observed, although their classification is difficult. The quality of the images was improved by applying Fourier filtering techniques.

  11. Challenges in performance of food safety management systems: a case of fish processing companies in Tanzania.

    PubMed

    Kussaga, Jamal B; Luning, Pieternel A; Tiisekwa, Bendantunguka P M; Jacxsens, Liesbeth

    2014-04-01

    This study provides insight into food safety (FS) performance in light of the current performance of core FS management system (FSMS) activities and the context riskiness of these systems, in order to identify opportunities for improvement of the FSMS. An FSMS diagnostic instrument was applied to assess the performance levels of FSMS activities regarding context riskiness and FS performance in 14 fish processing companies in Tanzania. Two clusters (clusters I and II) with average FSMS (level 2) operating under a moderate-risk context (score 2) were identified. Overall, cluster I had better (score 3) FS performance than cluster II (score 2 to 3). However, a majority of the fish companies need further improvement of their FSMS and reduction of context riskiness to assure good FS performance. The FSMS activity levels could be improved through hygienic design of equipment and facilities, strict raw material control, proper follow-up of critical control point analysis, developing specific sanitation procedures and company-specific sampling designs and measuring plans, independent validation of preventive measures, and establishing comprehensive documentation and record-keeping systems. The risk level of the context could be reduced through automation of production processes (such as filleting, packaging, and sanitation) to restrict people's interference, recruitment of permanent high-skilled technological staff, and setting requirements for customers on product use (storage and distribution conditions). However, such intervention measures for improvement could be taken in phases, starting with less expensive ones (such as sanitation procedures) that can be implemented in the short term and moving to more expensive interventions (setting up assurance activities) to be adopted in the long term. These measures are essential for fish processing companies to move toward FSMS that are more effective.

  12. Hierarchical modeling of cluster size in wildlife surveys

    USGS Publications Warehouse

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
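
    The size-biased detection mechanism described above is easy to demonstrate by simulation; the sketch below is a generic illustration (not the paper's hierarchical model or the mallard data), with an arbitrary cluster size distribution and detection curve.

      # Simulate cluster-size-dependent detection and compare mean cluster sizes.
      import numpy as np

      rng = np.random.default_rng(7)

      n_clusters = 5000
      size = 1 + rng.poisson(2.0, n_clusters)              # true cluster sizes
      p_detect = 1 - np.exp(-0.4 * size)                   # detectability grows with size
      detected = rng.random(n_clusters) < p_detect

      print("population mean cluster size :", size.mean().round(3))
      print("sample mean cluster size     :", size[detected].mean().round(3))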

  13. A Hierarchical Bayesian Procedure for Two-Mode Cluster Analysis

    ERIC Educational Resources Information Center

    DeSarbo, Wayne S.; Fong, Duncan K. H.; Liechty, John; Saxton, M. Kim

    2004-01-01

    This manuscript introduces a new Bayesian finite mixture methodology for the joint clustering of row and column stimuli/objects associated with two-mode asymmetric proximity, dominance, or profile data. That is, common clusters are derived which partition both the row and column stimuli/objects simultaneously into the same derived set of clusters.…

  14. A cluster expansion model for predicting activation barrier of atomic processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rehman, Tafizur; Jaipal, M.; Chatterjee, Abhijit, E-mail: achatter@iitk.ac.in

    2013-06-15

    We introduce a procedure based on cluster expansion models for predicting the activation barrier of atomic processes encountered while studying the dynamics of a material system using the kinetic Monte Carlo (KMC) method. Starting with an interatomic potential description, a mathematical derivation is presented to show that the local environment dependence of the activation barrier can be captured using cluster interaction models. Next, we develop a systematic procedure for training the cluster interaction model on-the-fly, which involves: (i) obtaining activation barriers for a handful of local environments using nudged elastic band (NEB) calculations, (ii) identifying the local environment by analyzing the NEB results, and (iii) estimating the cluster interaction model parameters from the activation barrier data. Once a cluster expansion model has been trained, it is used to predict activation barriers without requiring any additional NEB calculations. Numerical studies are performed to validate the cluster expansion model by studying hop processes in Ag/Ag(100). We show that the use of the cluster expansion model with KMC enables efficient generation of an accurate process rate catalog.
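
    The core idea, fitting cluster-interaction coefficients to a handful of NEB barriers and then predicting barriers for new environments, can be sketched as a simple linear regression; the code below uses synthetic "NEB" data and a single-site (pair-free) expansion purely for illustration, and is not the authors' implementation.

      # Fit a toy cluster expansion for activation barriers and predict a new barrier.
      import numpy as np

      rng = np.random.default_rng(3)

      n_env, n_sites = 40, 6                               # sampled environments, neighbour sites
      occ = rng.integers(0, 2, size=(n_env, n_sites))      # occupation of neighbour sites
      true_coeff = np.array([0.05, -0.02, 0.08, 0.01, -0.04, 0.03])
      barrier = 0.45 + occ @ true_coeff + rng.normal(0, 0.005, n_env)   # "NEB" barriers in eV

      # Least-squares estimate of the cluster-interaction coefficients.
      A = np.column_stack([np.ones(n_env), occ])
      coeff, *_ = np.linalg.lstsq(A, barrier, rcond=None)

      # Predict the barrier of a new local environment without another NEB run.
      new_env = np.array([1, 0, 1, 1, 0, 0])
      print("predicted barrier (eV):", float(coeff[0] + new_env @ coeff[1:]))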

  15. Characterization of edible seaweed harvested on the Galician coast (northwestern Spain) using pattern recognition techniques and major and trace element data.

    PubMed

    Romarís-Hortas, Vanessa; García-Sartal, Cristina; Barciela-Alonso, María Carmen; Moreda-Piñeiro, Antonio; Bermejo-Barrera, Pilar

    2010-02-10

    Major and trace elements in North Atlantic seaweed originating from Galicia (northwestern Spain) were determined by using inductively coupled plasma-optical emission spectrometry (ICP-OES) (Ba, Ca, Cu, K, Mg, Mn, Na, Sr, and Zn), inductively coupled plasma-mass spectrometry (ICP-MS) (Br and I) and hydride generation-atomic fluorescence spectrometry (HG-AFS) (As). Pattern recognition techniques were then used to classify the edible seaweed according to their type (red, brown, and green seaweed) and also their variety (Wakame, Fucus, Sea Spaghetti, Kombu, Dulse, Nori, and Sea Lettuce). Principal component analysis (PCA) and cluster analysis (CA) were used as exploratory techniques, and linear discriminant analysis (LDA) and soft independent modeling of class analogy (SIMCA) were used as classification procedures. In total, 12 elements were determined in a range of 35 edible seaweed samples (20 brown seaweed, 10 red seaweed, 4 green seaweed, and 1 canned seaweed). Natural groupings of the samples (brown, red, and green types) were observed using PCA and CA (squared Euclidean distance between objects and Ward's method as the clustering procedure). The application of LDA gave correct assignment percentages of 100% for brown, red, and green types at a significance level of 5%. However, a satisfactory classification (recognition and prediction) using SIMCA was obtained only for red seaweed (100% of cases correctly classified), whereas percentages of 89 and 80% were obtained for brown seaweed for recognition (training set) and prediction (testing set), respectively.
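
    A short sketch of the pattern recognition workflow described above (standardize, explore with PCA, classify with LDA under cross-validation) is given below; the element concentrations and class labels are random placeholders rather than the seaweed data, and SIMCA is omitted.

      # Generic PCA exploration and LDA classification on placeholder element data.
      import numpy as np
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.decomposition import PCA
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      elements = ["Al", "As", "Ba", "Br", "Ca", "Cu", "I", "K", "Mg", "Mn", "Na", "Sr"]
      X = rng.lognormal(mean=2.0, sigma=0.8, size=(35, len(elements)))   # placeholder data
      y = np.array(["brown"] * 20 + ["red"] * 10 + ["green"] * 5)        # placeholder labels

      pca = make_pipeline(StandardScaler(), PCA(n_components=3)).fit(X)
      print("variance explained:", pca.named_steps["pca"].explained_variance_ratio_.round(2))

      lda = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
      print("CV accuracy:", cross_val_score(lda, X, y, cv=3).mean().round(2))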

  16. The Chandra Strong Lens Sample: Revealing Baryonic Physics In Strong Lensing Selected Clusters

    NASA Astrophysics Data System (ADS)

    Bayliss, Matthew

    2017-08-01

    We propose for Chandra imaging of the hot intra-cluster gas in a unique new sample of 29 galaxy clusters selected purely on their strong gravitational lensing signatures. This will be the first program targeting a purely strong lensing selected cluster sample, enabling new comparisons between the ICM properties and scaling relations of strong lensing and mass/ICM selected cluster samples. Chandra imaging, combined with high precision strong lens models, ensures powerful constraints on the distribution and state of matter in the cluster cores. This represents a novel angle from which we can address the role played by baryonic physics -- the infamous "gastrophysics" -- in shaping the cores of massive clusters, and opens up an exciting new galaxy cluster discovery space with Chandra.

  17. The Chandra Strong Lens Sample: Revealing Baryonic Physics In Strong Lensing Selected Clusters

    NASA Astrophysics Data System (ADS)

    Bayliss, Matthew

    2017-09-01

    We propose for Chandra imaging of the hot intra-cluster gas in a unique new sample of 29 galaxy clusters selected purely on their strong gravitational lensing signatures. This will be the first program targeting a purely strong lensing selected cluster sample, enabling new comparisons between the ICM properties and scaling relations of strong lensing and mass/ICM selected cluster samples. Chandra imaging, combined with high precision strong lens models, ensures powerful constraints on the distribution and state of matter in the cluster cores. This represents a novel angle from which we can address the role played by baryonic physics -- the infamous ``gastrophysics''-- in shaping the cores of massive clusters, and opens up an exciting new galaxy cluster discovery space with Chandra.

  18. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class support vector machine (SVM) classification based on a binary tree is presented to address the low learning efficiency of SVMs when processing large-scale multi-class samples. A bottom-up method is used to build the binary tree hierarchy, and a sub-classifier at each node learns from the samples associated with that node. During learning, several class clusters are generated by a first clustering of the training samples. Central points are extracted directly from clusters that contain only one class of samples. For clusters containing two classes, the numbers of clusters for the positive and negative samples are set according to their degree of mixture, a secondary clustering is performed, and central points are then extracted from the resulting sub-clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by integrating the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, maintains high classification accuracy while greatly reducing the number of training samples and effectively improving learning efficiency.
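
    A minimal sketch of the central sample-reduction idea follows: replace each class's training points by cluster centres and fit an SVM on the reduced set. The synthetic data, the per-class cluster counts and the kernel choice are hypothetical illustrations, not the authors' binary-tree procedure in full.

      # Minimal sketch: per-class k-means reduction followed by SVM training.
      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.cluster import KMeans
      from sklearn.svm import SVC

      X, y = make_classification(n_samples=5000, n_features=10, n_informative=6,
                                 n_classes=4, n_clusters_per_class=2, random_state=0)

      centres, labels = [], []
      for cls in np.unique(y):
          Xc = X[y == cls]
          k = max(1, Xc.shape[0] // 100)          # crude choice of cluster count
          km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
          centres.append(km.cluster_centers_)
          labels.append(np.full(k, cls))

      X_red = np.vstack(centres)
      y_red = np.concatenate(labels)

      clf = SVC(kernel="rbf", gamma="scale").fit(X_red, y_red)
      print("reduced training set:", X_red.shape[0], "points (from", X.shape[0], ")")
      print("accuracy on the full data:", clf.score(X, y))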

  19. Simplified multi-element analysis of ground and instant coffees by ICP-OES and FAAS.

    PubMed

    Szymczycha-Madeja, Anna; Welna, Maja; Pohl, Pawel

    2015-01-01

    A simplified alternative to the wet digestion sample preparation procedure for roasted ground and instant coffees has been developed and validated for the determination of different elements by inductively coupled plasma optical emission spectrometry (ICP-OES) (Al, Ba, Cd, Co, Cr, Cu, Mn, Ni, Pb, Sr, Zn) and flame atomic absorption spectrometry (FAAS) (Ca, Fe, K, Mg, Na). The proposed procedure, i.e. the ultrasound-assisted solubilisation in aqua regia, is quite fast and simple, requires minimal use of reagents, and demonstrated good analytical performance, i.e. accuracy from -4.7% to 1.9%, precision within 0.5-8.6% and recovery in the range 93.5-103%. Detection limits of elements were from 0.086 ng ml(-1) (Sr) to 40 ng ml(-1) (Fe). A preliminary classification of 18 samples of ground and instant coffees was successfully made based on concentrations of selected elements and using principal component analysis and hierarchical cluster analysis.
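
    The hierarchical clustering step mentioned above could be sketched as follows on an autoscaled sample-by-element matrix. The data matrix, the number of groups and the Ward linkage choice are assumptions for illustration only.

      # Minimal sketch: hierarchical (Ward) clustering of coffee samples on
      # autoscaled element concentrations.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(1)
      X = rng.lognormal(mean=1.0, sigma=0.6, size=(18, 16))  # 18 coffees x 16 elements, hypothetical
      X_std = StandardScaler().fit_transform(X)

      Z = linkage(X_std, method="ward")                 # agglomerative clustering
      groups = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into 2 groups
      print("cluster membership:", groups)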

  20. Facing the problem of "false positives": re-assessment and improvement of a multiplex RT-PCR procedure for the diagnosis of A. flavus mycotoxin producers.

    PubMed

    Degola, F; Berni, E; Spotti, E; Ferrero, I; Restivo, F M

    2009-02-28

    The aim of our research project was to consolidate a multiplex RT-PCR protocol to detect aflatoxigenic strains of Aspergillus flavus. Several independent A. flavus strains were isolated from corn and flour samples from the North of Italy and from three European countries. The aflatoxin producing/non-producing phenotype was assessed by qualitative and quantitative assays at day five of growth under aflatoxin-inducing conditions. Expression of 16 genes belonging to the aflatoxin cluster was assayed by multiplex or monomeric RT-PCR. There was a good correlation between gene expression and aflatoxin production. Strains that apparently transcribed all the relevant genes but did not release aflatoxin into the medium ("false positives") were re-assessed for mycotoxin production after extended growth under inducing conditions. All the "false positive" strains were in fact positive when aflatoxin determination was performed after 10 days of growth. These strains should then be re-classified as "slow aflatoxin accumulators". To optimise the diagnostic procedure, a quintuplex RT-PCR procedure was designed consisting of a primer set directed against four informative aflatoxin cluster genes and the beta-tubulin gene as an internal amplification control. In conclusion, we have provided evidence for the robustness and reliability of our RT-PCR protocol in discriminating mycotoxin-producing from non-producing strains of A. flavus, and the molecular procedure we devised is a promising tool with which to screen and control the endemic population of A. flavus colonising different areas of the world.

  1. Evaluation of primary immunization coverage of infants under universal immunization programme in an urban area of bangalore city using cluster sampling and lot quality assurance sampling techniques.

    PubMed

    K, Punith; K, Lalitha; G, Suman; Bs, Pradeep; Kumar K, Jayanth

    2008-07-01

    Is the LQAS technique better than the cluster sampling technique in terms of resources to evaluate the immunization coverage in an urban area? To assess and compare lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Population-based cross-sectional study. Areas under Mathikere Urban Health Center. Children aged 12 months to 23 months. 220 in cluster sampling, 76 in lot quality assurance sampling. Percentages and Proportions, Chi square Test. (1) Using cluster sampling, the percentages of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, they were 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by the cluster sampling technique were not statistically different from the coverage values obtained by the lot quality assurance sampling technique. Considering the time and resources required, it was found that lot quality assurance sampling is a better technique for evaluating primary immunization coverage in an urban area.

  2. An unsupervised classification approach for analysis of Landsat data to monitor land reclamation in Belmont county, Ohio

    NASA Technical Reports Server (NTRS)

    Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.

    1981-01-01

    Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.

  3. Cluster analysis of commercial samples of Bauhinia spp. using HPLC-UV/PDA and MCR-ALS/PCA without peak alignment procedure.

    PubMed

    Ardila, Jorge Armando; Funari, Cristiano Soleo; Andrade, André Marques; Cavalheiro, Alberto José; Carneiro, Renato Lajarim

    2015-01-01

    Bauhinia forficata Link. is recognised by the Brazilian Health Ministry as a treatment for hypoglycemia and diabetes. Analytical methods are useful to assess the plant identity due to the similarities found in plants from Bauhinia spp. HPLC-UV/PDA in combination with chemometric tools is a widely used and suitable alternative for authentication of plant material; however, the shifts in retention times for similar compounds in different samples are a problem. To perform comparisons between the authentic medicinal plant (Bauhinia forficata Link.) and samples commercially available in drugstores claiming to be "Bauhinia spp. to treat diabetes" and to evaluate the performance of multivariate curve resolution - alternating least squares (MCR-ALS) associated with principal component analysis (PCA) when compared to pure PCA. HPLC-UV/PDA data obtained from extracts of leaves were evaluated employing a combination of MCR-ALS and PCA, which allowed the use of the full chromatographic and spectrometric information without the need for peak alignment procedures. The use of MCR-ALS/PCA showed better results than the conventional PCA using only one wavelength. Only two of nine commercial samples presented characteristics similar to the authentic Bauhinia forficata spp., considering the full HPLC-UV/PDA data. The combination of MCR-ALS and PCA is very useful when applied to a group of samples where a general alignment procedure could not be applied due to the different chromatographic profiles. This work also demonstrates the need for stricter control from the health authorities regarding herbal products available on the market. Copyright © 2015 John Wiley & Sons, Ltd.

  4. Differences in Coping Styles among Persons with Spinal Cord Injury: A Cluster-Analytic Approach.

    ERIC Educational Resources Information Center

    Frank, Robert G.; And Others

    1987-01-01

    Identified and validated two subgroups in group of 53 persons with spinal cord injury by applying cluster-analytic procedures to subjects' self-reported coping and health locus of control belief scores. Cluster 1 coped less effectively and tended to be psychologically distressed; Cluster 2 subjects emphasized internal health attributions and…

  5. Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion

    NASA Astrophysics Data System (ADS)

    Scott, Charles J. C.; Thom, Alex J. W.

    2017-09-01

    We consider the sampling of the coupled cluster expansion within stochastic coupled cluster theory. Observing the limitations of previous approaches due to the inherently non-linear behavior of a coupled cluster wavefunction representation, we propose new approaches based on an intuitive, well-defined condition for sampling weights and on sampling the expansion in cluster operators of different excitation levels. We term these modifications even and truncated selections, respectively. Utilising both approaches demonstrates dramatically improved calculation stability as well as reduced computational and memory costs. These modifications are particularly effective at higher truncation levels owing to the large number of terms within the cluster expansion that can be neglected, as demonstrated by the reduction of the number of terms to be sampled when truncating at triple excitations by 77% and hextuple excitations by 98%.

  6. Occurrence of Radio Minihalos in a Mass-limited Sample of Galaxy Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Giacintucci, Simona; Clarke, Tracy E.; Markevitch, Maxim

    2017-06-01

    We investigate the occurrence of radio minihalos—diffuse radio sources of unknown origin observed in the cores of some galaxy clusters—in a statistical sample of 58 clusters drawn from the Planck Sunyaev–Zel’dovich cluster catalog using a mass cut (M_500 > 6 × 10^14 M_⊙). We supplement our statistical sample with a similarly sized nonstatistical sample mostly consisting of clusters in the ACCEPT X-ray catalog with suitable X-ray and radio data, which includes lower-mass clusters. Where necessary (for nine clusters), we reanalyzed the Very Large Array archival radio data to determine whether a minihalo is present. Our total sample includes all 28 currently known and recently discovered radio minihalos, including six candidates. We classify clusters as cool-core or non-cool-core according to the value of the specific entropy floor in the cluster center, rederived or newly derived from the Chandra X-ray density and temperature profiles where necessary (for 27 clusters). Contrary to the common wisdom that minihalos are rare, we find that almost all cool cores—at least 12 out of 15 (80%)—in our complete sample of massive clusters exhibit minihalos. The supplementary sample shows that the occurrence of minihalos may be lower in lower-mass cool-core clusters. No minihalos are found in non-cool cores or “warm cores.” These findings will help test theories of the origin of minihalos and provide information on the physical processes and energetics of the cluster cores.

  7. Evaluation of Primary Immunization Coverage of Infants Under Universal Immunization Programme in an Urban Area of Bangalore City Using Cluster Sampling and Lot Quality Assurance Sampling Techniques

    PubMed Central

    K, Punith; K, Lalitha; G, Suman; BS, Pradeep; Kumar K, Jayanth

    2008-01-01

    Research Question: Is LQAS technique better than cluster sampling technique in terms of resources to evaluate the immunization coverage in an urban area? Objective: To assess and compare the lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Study Design: Population-based cross-sectional study. Study Setting: Areas under Mathikere Urban Health Center. Study Subjects: Children aged 12 months to 23 months. Sample Size: 220 in cluster sampling, 76 in lot quality assurance sampling. Statistical Analysis: Percentages and Proportions, Chi square Test. Results: (1) Using cluster sampling, the percentage of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, it was 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by cluster sampling technique were not statistically different from the coverage value as obtained by lot quality assurance sampling techniques. Considering the time and resources required, it was found that lot quality assurance sampling is a better technique in evaluating the primary immunization coverage in urban area. PMID:19876474
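
    The chi-square comparison reported above could be reproduced along the lines of the sketch below. The counts are reconstructed approximately from the reported percentages and sample sizes and should be treated as illustrative only.

      # Minimal sketch: compare immunization-status distributions from the two
      # sampling techniques with a chi-square test on a 2 x 3 contingency table.
      import numpy as np
      from scipy.stats import chi2_contingency

      # complete, partial, unimmunized (approximate counts)
      cluster_counts = np.array([185, 31, 4])   # 220 children, cluster sampling
      lqas_counts = np.array([70, 5, 1])        # 76 children, LQAS

      chi2, p, dof, _ = chi2_contingency(np.vstack([cluster_counts, lqas_counts]))
      print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.3f}")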

  8. A priori evaluation of two-stage cluster sampling for accuracy assessment of large-area land-cover maps

    USGS Publications Warehouse

    Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.

    2004-01-01

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priori evaluation was used as a decision-making tool when implementing the NLCD design.
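
    One common way to compare sampling options a priori is through the standard cluster-sampling design effect, deff = 1 + (m - 1) * rho, with m the cluster size and rho the within-cluster correlation. The sketch below is a generic illustration of that trade-off; the candidate designs and the rho value are hypothetical, not the NLCD protocol itself.

      # Minimal sketch: design effect and effective sample size for cluster options.
      def design_effect(m, rho):
          return 1.0 + (m - 1.0) * rho

      def effective_sample_size(n_clusters, m, rho):
          return n_clusters * m / design_effect(m, rho)

      rho = 0.3  # assumed spatial correlation of classification error
      for n_clusters, m in [(500, 1), (100, 5), (50, 10), (25, 20)]:
          print(f"{n_clusters:>4} clusters of size {m:>2}: "
                f"deff = {design_effect(m, rho):.2f}, "
                f"effective n = {effective_sample_size(n_clusters, m, rho):.0f}")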

  9. Efficient evaluation of sampling quality of molecular dynamics simulations by clustering of dihedral torsion angles and Sammon mapping.

    PubMed

    Frickenhaus, Stephan; Kannan, Srinivasaraghavan; Zacharias, Martin

    2009-02-01

    A direct conformational clustering and mapping approach for peptide conformations based on backbone dihedral angles has been developed and applied to compare conformational sampling of Met-enkephalin using two molecular dynamics (MD) methods. Efficient clustering in dihedrals has been achieved by evaluating all combinations resulting from independent clustering of each dihedral angle distribution, thus resolving all conformational substates. In contrast, Cartesian clustering was unable to accurately distinguish between all substates. Projection of clusters on dihedral principal component (PCA) subspaces did not result in efficient separation of highly populated clusters. However, representation in a nonlinear metric by Sammon mapping was able to separate well the 48 highest populated clusters in just two dimensions. In addition, this approach also allowed us to visualize the transition frequencies between clusters efficiently. Significantly higher transition frequencies between more distinct conformational substates were found for a recently developed biasing-potential replica exchange MD simulation method, allowing faster sampling of possible substates compared to conventional MD simulations. Although the number of theoretically possible clusters grows exponentially with peptide length, in practice the number of clusters is only limited by the sampling size (typically much smaller), and therefore the method is well suited also for large systems. The approach could be useful to rapidly and accurately evaluate conformational sampling during MD simulations, to compare different sampling strategies and eventually to detect kinetic bottlenecks in folding pathways.
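
    The core combinatorial idea (cluster each dihedral independently, then label a conformation by the tuple of per-dihedral cluster indices) could be sketched as below. The trajectory array, the circular encoding via sin/cos and the number of wells per dihedral are assumptions for illustration, not the authors' exact algorithm.

      # Minimal sketch: per-dihedral clustering and joint conformational substates.
      import numpy as np
      from sklearn.cluster import KMeans
      from collections import Counter

      rng = np.random.default_rng(2)
      n_frames, n_dihedrals = 2000, 8
      angles = rng.uniform(-np.pi, np.pi, size=(n_frames, n_dihedrals))  # radians

      per_dihedral_labels = np.empty_like(angles, dtype=int)
      for j in range(n_dihedrals):
          circ = np.column_stack([np.sin(angles[:, j]), np.cos(angles[:, j])])
          per_dihedral_labels[:, j] = KMeans(n_clusters=3, n_init=10,
                                             random_state=0).fit_predict(circ)

      # Joint substates = distinct combinations of per-dihedral cluster labels.
      substates = Counter(map(tuple, per_dihedral_labels))
      print("number of populated substates:", len(substates))
      print("most populated:", substates.most_common(3))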

  10. In situ diagnostics of the crystal-growth process through neutron imaging: application to scintillators

    DOE PAGES

    Tremsin, Anton S.; Makowska, Małgorzata G.; Perrodin, Didier; ...

    2016-04-12

    Neutrons are known to be unique probes in situations where other types of radiation fail to penetrate samples and their surrounding structures. In this paper it is demonstrated how thermal and cold neutron radiography can provide time-resolved imaging of materials while they are being processed (e.g.while growing single crystals). The processing equipment, in this case furnaces, and the scintillator materials are opaque to conventional X-ray interrogation techniques. The distribution of the europium activator within a BaBrCl:Eu scintillator (0.1 and 0.5% nominal doping concentrations per mole) is studiedin situduring the melting and solidification processes with a temporal resolution of 5–7 s.more » The strong tendency of the Eu dopant to segregate during the solidification process is observed in repeated cycles, with Eu forming clusters on multiple length scales (only for clusters larger than ~50 µm, as limited by the resolution of the present experiments). It is also demonstrated that the dopant concentration can be quantified even for very low concentration levels (~0.1%) in 10 mm thick samples. The interface between the solid and liquid phases can also be imaged, provided there is a sufficient change in concentration of one of the elements with a sufficient neutron attenuation cross section. Tomographic imaging of the BaBrCl:0.1%Eu sample reveals a strong correlation between crystal fractures and Eu-deficient clusters. The results of these experiments demonstrate the unique capabilities of neutron imaging forin situdiagnostics and the optimization of crystal-growth procedures.« less

  11. A Novel Artificial Bee Colony Based Clustering Algorithm for Categorical Data

    PubMed Central

    2015-01-01

    Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data. PMID:25993469

  12. A novel artificial bee colony based clustering algorithm for categorical data.

    PubMed

    Ji, Jinchao; Pang, Wei; Zheng, Yanlin; Wang, Zhe; Ma, Zhiqiang

    2015-01-01

    Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data.
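
    The "one-step k-modes procedure" referred to above can be illustrated roughly as follows: assign each record to the nearest mode under simple-matching distance, then recompute the per-attribute modes. The toy data and k are hypothetical, and the artificial-bee-colony search around this step is not shown.

      # Minimal sketch: one k-modes iteration for categorical data.
      import numpy as np

      def matching_distance(a, b):
          return np.sum(a != b, axis=-1)

      def one_step_kmodes(X, modes):
          # Assignment step: nearest mode under simple-matching distance.
          d = np.array([matching_distance(X, m) for m in modes])  # (k, n)
          labels = d.argmin(axis=0)
          # Update step: per-attribute mode of each cluster.
          new_modes = []
          for k in range(len(modes)):
              members = X[labels == k]
              if len(members) == 0:
                  new_modes.append(modes[k])
                  continue
              mode = [max(set(col), key=list(col).count) for col in members.T]
              new_modes.append(np.array(mode))
          return labels, np.array(new_modes)

      X = np.array([["a", "x", "p"], ["a", "x", "q"], ["b", "y", "q"],
                    ["b", "y", "p"], ["a", "y", "q"], ["b", "x", "p"]])
      modes = X[[0, 2]]                         # k = 2, seeded from the data
      labels, modes = one_step_kmodes(X, modes)
      print("labels:", labels)
      print("modes:\n", modes)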

  13. A strategy for analysis of (molecular) equilibrium simulations: Configuration space density estimation, clustering, and visualization

    NASA Astrophysics Data System (ADS)

    Hamprecht, Fred A.; Peter, Christine; Daura, Xavier; Thiel, Walter; van Gunsteren, Wilfred F.

    2001-02-01

    We propose an approach for summarizing the output of long simulations of complex systems, affording a rapid overview and interpretation. First, multidimensional scaling techniques are used in conjunction with dimension reduction methods to obtain a low-dimensional representation of the configuration space explored by the system. A nonparametric estimate of the density of states in this subspace is then obtained using kernel methods. The free energy surface is calculated from that density, and the configurations produced in the simulation are then clustered according to the topography of that surface, such that all configurations belonging to one local free energy minimum form one class. This topographical cluster analysis is performed using basin spanning trees which we introduce as subgraphs of Delaunay triangulations. Free energy surfaces obtained in dimensions lower than four can be visualized directly using iso-contours and -surfaces. Basin spanning trees also afford a glimpse of higher-dimensional topographies. The procedure is illustrated using molecular dynamics simulations on the reversible folding of peptide analogues. Finally, we emphasize the intimate relation of density estimation techniques to modern enhanced sampling algorithms.
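
    The density-to-free-energy step could be sketched as below: a kernel density estimate on a 2-D projection of configuration space, converted to a free energy surface via F = -kT ln(rho). The projected coordinates are synthetic placeholders and the basin-spanning-tree clustering is not shown.

      # Minimal sketch: kernel density estimate and free energy surface on a 2-D projection.
      import numpy as np
      from scipy.stats import gaussian_kde

      rng = np.random.default_rng(3)
      # Hypothetical 2-D projection of simulation frames (e.g. first two MDS/PCA axes)
      proj = np.concatenate([rng.normal(0, 0.4, size=(1500, 2)),
                             rng.normal(2, 0.3, size=(500, 2))])

      kde = gaussian_kde(proj.T)
      xx, yy = np.meshgrid(np.linspace(-1.5, 3, 80), np.linspace(-1.5, 3, 80))
      rho = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)

      kT = 2.494   # kJ/mol at roughly 300 K
      F = -kT * np.log(rho + 1e-12)
      F -= F.min()                      # free energy relative to the global minimum
      print("free energy grid:", F.shape, "range 0 to %.1f kJ/mol" % F.max())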

  14. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

    PubMed Central

    Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

    2016-01-01

    Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known examples are the case-control design for a binary response, the case-cohort design for failure time data and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for ODS with multivariate outcomes remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, with all the underlying distributions of covariates modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and derive its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children from the Collaborative Perinatal Study. PMID:27966260

  15. Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    PubMed Central

    Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg

    2007-01-01

    Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types. One from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis. Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
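
    The weighted mixed-type dissimilarity described above (squared Euclidean distance for the numeric domains plus simple matching for the categorical histopathology domain) could look roughly like the sketch below. The domain weights, field names and toy records are hypothetical, not the modk-prototypes implementation.

      # Minimal sketch: weighted dissimilarity between a sample and a cluster prototype.
      import numpy as np

      def mixed_dissimilarity(sample, prototype, w_expr, w_chem, w_histo):
          d_expr = np.sum((sample["expr"] - prototype["expr"]) ** 2)    # microarray
          d_chem = np.sum((sample["chem"] - prototype["chem"]) ** 2)    # clinical chemistry
          d_histo = np.sum(sample["histo"] != prototype["histo"])       # simple matching
          return w_expr * d_expr + w_chem * d_chem + w_histo * d_histo

      sample = {"expr": np.array([1.2, -0.3, 0.8]),
                "chem": np.array([0.5, 1.1]),
                "histo": np.array(["necrosis", "moderate"])}
      prototype = {"expr": np.array([1.0, 0.0, 1.0]),
                   "chem": np.array([0.4, 0.9]),
                   "histo": np.array(["necrosis", "none"])}

      print(mixed_dissimilarity(sample, prototype, w_expr=0.5, w_chem=0.3, w_histo=0.2))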

  16. Choosing a Cluster Sampling Design for Lot Quality Assurance Sampling Surveys

    PubMed Central

    Hund, Lauren; Bedrick, Edward J.; Pagano, Marcello

    2015-01-01

    Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis. PMID:26125967

  17. Choosing a Cluster Sampling Design for Lot Quality Assurance Sampling Surveys.

    PubMed

    Hund, Lauren; Bedrick, Edward J; Pagano, Marcello

    2015-01-01

    Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis.

  18. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.
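
    A simulation in the spirit of the study above could be sketched as follows, using a beta-binomial style model to induce within-cluster correlation in the GAM outcome. The decision threshold, the ICC value and the classification rule are hypothetical choices for illustration, not the study's validated designs.

      # Minimal sketch: simulate classification rates of a 67x3 cluster-LQAS design.
      import numpy as np

      rng = np.random.default_rng(4)

      def simulate_design(p, n_clusters, per_cluster, icc, threshold, n_sims=5000):
          # Cluster-level prevalences drawn from a beta with mean p and ICC = icc.
          a = p * (1 - icc) / icc
          b = (1 - p) * (1 - icc) / icc
          cluster_p = rng.beta(a, b, size=(n_sims, n_clusters))
          cases = rng.binomial(per_cluster, cluster_p).sum(axis=1)
          return np.mean(cases > threshold)       # proportion classified "high GAM"

      # Hypothetical rule: classify "high" if more than 15 of the 201 children have GAM.
      for true_p in (0.05, 0.10, 0.15):
          rate = simulate_design(true_p, n_clusters=67, per_cluster=3,
                                 icc=0.05, threshold=15)
          print(f"true prevalence {true_p:.0%}: classified high in {rate:.1%} of runs")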

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.

    Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groups were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method. Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.
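
    A log-linear regression of the kind described above could be sketched as follows, assuming log-normal PSD. The synthetic dose data and the fitted coefficients are placeholders, not the study's published regression formulas.

      # Minimal sketch: regression of log(PSD) on log(RAK) and log(KAP).
      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(5)
      n = 200
      rak = rng.lognormal(mean=0.0, sigma=0.7, size=n)          # Gy, hypothetical
      kap = rak * rng.lognormal(mean=3.0, sigma=0.3, size=n)    # Gy*cm^2, hypothetical
      psd = 1.3 * rak ** 0.9 * kap ** 0.1 * rng.lognormal(0, 0.2, size=n)

      X = np.column_stack([np.log(rak), np.log(kap)])
      model = LinearRegression().fit(X, np.log(psd))

      print("log-space coefficients:", model.coef_, "intercept:", model.intercept_)
      pred = np.exp(model.predict([[np.log(2.0), np.log(40.0)]]))[0]
      print("predicted PSD for RAK = 2 Gy, KAP = 40 Gy*cm^2:", pred, "Gy")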

  20. Understanding the cluster randomised crossover design: a graphical illustration of the components of variation and a sample size tutorial.

    PubMed

    Arnup, Sarah J; McKenzie, Joanne E; Hemming, Karla; Pilcher, David; Forbes, Andrew B

    2017-08-15

    In a cluster randomised crossover (CRXO) design, a sequence of interventions is assigned to a group, or 'cluster' of individuals. Each cluster receives each intervention in a separate period of time, forming 'cluster-periods'. Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design. Formulae are available for the two-period, two-intervention, cross-sectional CRXO design; however, implementation of these formulae is known to be suboptimal. The aims of this tutorial are to illustrate the intuition behind the design and to provide guidance on performing sample size calculations. Graphical illustrations are used to describe the effect of the cluster randomisation and crossover aspects of the design on the correlation between individual responses in a CRXO trial. Sample size calculations for binary and continuous outcomes are illustrated using parameters estimated from the Australia and New Zealand Intensive Care Society - Adult Patient Database (ANZICS-APD) for patient mortality and length(s) of stay (LOS). The similarity between individual responses in a CRXO trial can be understood in terms of three components of variation: variation in cluster mean response; variation in the cluster-period mean response; and variation between individual responses within a cluster-period; or equivalently in terms of the correlation between individual responses in the same cluster-period (within-cluster within-period correlation, WPC), and between individual responses in the same cluster, but in different periods (within-cluster between-period correlation, BPC). The BPC lies between zero and the WPC. When the WPC and BPC are equal, the precision gained by the crossover aspect of the CRXO design equals the precision lost by cluster randomisation. When the BPC is zero, there is no advantage in a CRXO over a parallel-group cluster randomised trial. Sample size calculations illustrate that small changes in the specification of the WPC or BPC can increase the required number of clusters. By illustrating how the parameters required for sample size calculations arise from the CRXO design and by providing guidance on both how to choose values for the parameters and perform the sample size calculations, the implementation of the sample size formulae for CRXO trials may improve.
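
    A rough sample size sketch for a continuous outcome follows, assuming the two-period cross-sectional CRXO design effect takes the form 1 + (m - 1)*WPC - m*BPC applied to the individually randomised parallel-group sample size; that form, the bookkeeping of clusters, and the effect size and correlation values below are assumptions for illustration rather than the tutorial's worked examples.

      # Minimal sketch: clusters needed for a 2-period, 2-intervention cross-sectional CRXO.
      import math
      from scipy.stats import norm

      def crxo_total_clusters(delta, sd, m, wpc, bpc, alpha=0.05, power=0.8):
          """Total clusters (each cluster receives both interventions across the two periods)."""
          z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
          n_per_arm_individual = 2 * (z * sd / delta) ** 2   # individually randomised parallel
          deff = 1 + (m - 1) * wpc - m * bpc                 # assumed CRXO design effect
          n_per_arm = n_per_arm_individual * deff
          return math.ceil(n_per_arm / m)                    # m subjects per cluster-period

      print(crxo_total_clusters(delta=0.5, sd=2.0, m=50, wpc=0.05, bpc=0.02))
      # Sensitivity: small changes in BPC can change the required number of clusters.
      for bpc in (0.01, 0.02, 0.03, 0.04):
          print(bpc, crxo_total_clusters(0.5, 2.0, 50, wpc=0.05, bpc=bpc))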

  1. 75 FR 44937 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-30

    ... is a block cluster, which consists of one or more contiguous census blocks. The P sample is a sample of housing units and persons obtained independently from the census for a sample of block clusters. The E sample is a sample of census housing units and enumerations in the same block of clusters as the...

  2. Implementation of novel statistical procedures and other advanced approaches to improve analysis of CASA data.

    PubMed

    Ramón, M; Martínez-Pastor, F

    2018-04-23

    Computer-aided sperm analysis (CASA) produces a wealth of data that is frequently ignored. The use of multiparametric statistical methods can help explore these datasets, unveiling the subpopulation structure of sperm samples. In this review we analyse the significance of the internal heterogeneity of sperm samples and its relevance. We also provide a brief description of the statistical tools used for extracting sperm subpopulations from the datasets, namely unsupervised clustering (with non-hierarchical, hierarchical and two-step methods) and the most advanced supervised methods, based on machine learning. The former method has allowed exploration of subpopulation patterns in many species, whereas the latter offers further possibilities, especially for functional studies and the practical use of subpopulation analysis. We also consider novel approaches, such as the use of geometric morphometrics or imaging flow cytometry. Finally, although the data provided by CASA systems provide valuable information on sperm samples when clustering analyses are applied, there are several caveats. Protocols for capturing and analysing motility or morphometry should be standardised and adapted to each experiment, and the algorithms should be open in order to allow comparison of results between laboratories. Moreover, we must be aware of new technology that could change the paradigm for studying sperm motility and morphology.

  3. A DNA fingerprinting procedure for ultra high-throughput genetic analysis of insects.

    PubMed

    Schlipalius, D I; Waldron, J; Carroll, B J; Collins, P J; Ebert, P R

    2001-12-01

    Existing procedures for the generation of polymorphic DNA markers are not optimal for insect studies in which the organisms are often tiny and background molecular information is often non-existent. We have used a new high throughput DNA marker generation protocol called randomly amplified DNA fingerprints (RAF) to analyse the genetic variability in three separate strains of the stored grain pest, Rhyzopertha dominica. This protocol is quick, robust and reliable even though it requires minimal sample preparation, minute amounts of DNA and no prior molecular analysis of the organism. Arbitrarily selected oligonucleotide primers routinely produced approximately 50 scoreable polymorphic DNA markers, between individuals of three independent field isolates of R. dominica. Multivariate cluster analysis using forty-nine arbitrarily selected polymorphisms generated from a single primer reliably separated individuals into three clades corresponding to their geographical origin. The resulting clades were quite distinct, with an average genetic difference of 37.5 +/- 6.0% between clades and of 21.0 +/- 7.1% between individuals within clades. As a prelude to future gene mapping efforts, we have also assessed the performance of RAF under conditions commonly used in gene mapping. In this analysis, fingerprints from pooled DNA samples accurately and reproducibly reflected RAF profiles obtained from individual DNA samples that had been combined to create the bulked samples.

  4. An efficient matrix-matrix multiplication based antisymmetric tensor contraction engine for general order coupled cluster.

    PubMed

    Hanrath, Michael; Engels-Putzka, Anna

    2010-08-14

    In this paper, we present an efficient implementation of general tensor contractions, which is part of a new coupled-cluster program. The tensor contractions, used to evaluate the residuals in each coupled-cluster iteration, are particularly important for the performance of the program. We developed a generic procedure, which carries out contractions of two tensors irrespective of their explicit structure. It can handle coupled-cluster-type expressions of arbitrary excitation level. To make the contraction efficient without losing flexibility, we use a three-step procedure. First, the data contained in the tensors are rearranged into matrices, then a matrix-matrix multiplication is performed, and finally the result is backtransformed to a tensor. The current implementation is significantly more efficient than previous ones capable of treating arbitrarily high excitations.
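
    The three-step contraction (matricize, multiply, back-transform) can be illustrated generically with the sketch below. The tensor names, shapes and contracted indices are hypothetical and unrelated to the program's actual data layout; the result is checked against numpy's tensordot only as a sanity test.

      # Minimal sketch: tensor contraction via matricization and one matmul.
      import numpy as np

      def contract(A, B, axes_A, axes_B):
          """Contract tensor A with tensor B over the given axes via matricization."""
          free_A = [ax for ax in range(A.ndim) if ax not in axes_A]
          free_B = [ax for ax in range(B.ndim) if ax not in axes_B]
          # Step 1: rearrange the data into matrices (free indices x contracted indices).
          A_mat = A.transpose(free_A + list(axes_A)).reshape(
              int(np.prod([A.shape[i] for i in free_A])), -1)
          B_mat = B.transpose(list(axes_B) + free_B).reshape(
              -1, int(np.prod([B.shape[i] for i in free_B])))
          # Step 2: a single matrix-matrix multiplication performs the whole contraction.
          C_mat = A_mat @ B_mat
          # Step 3: back-transform the result to a tensor over the free indices.
          return C_mat.reshape([A.shape[i] for i in free_A] +
                               [B.shape[i] for i in free_B])

      rng = np.random.default_rng(6)
      T2 = rng.normal(size=(8, 8, 4, 4))   # e.g. a doubles-amplitude-like tensor (hypothetical)
      V = rng.normal(size=(4, 4, 4, 4))    # e.g. a two-electron-integral-like tensor (hypothetical)
      R = contract(T2, V, axes_A=(2, 3), axes_B=(0, 1))
      print(R.shape)                       # (8, 8, 4, 4)
      print(np.allclose(R, np.tensordot(T2, V, axes=([2, 3], [0, 1]))))  # sanity check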

  5. Competency Based Teacher Education Component. Curriculum Methods and Materials, Elementary Mathematics and Social Studies.

    ERIC Educational Resources Information Center

    Woodworth, William D.

    Four mathematical/social studies module clusters are presented in an effort to develop proficiency in instruction and in inductive and deductive teaching procedures. Modules within the first cluster concern systems of numeration, set operations, numbers, measurement, geometry, mathematics, and reasoning. The second mathematical cluster presents…

  6. [Procedure of seed quality testing and seed grading standard of Prunus humilis].

    PubMed

    Wen, Hao; Ren, Guang-Xi; Gao, Ya; Luo, Jun; Liu, Chun-Sheng; Li, Wei-Dong

    2014-11-01

    So far there exist no corresponding quality test procedures or grading standards for the seed of Prunus humilis, which is one of the important sources of semen pruni. We therefore set up test procedures adapted to the characteristics of P. humilis seed through studies of sampling, seed purity, thousand-grain weight, seed moisture, seed viability and germination percentage. Fifty seed specimens of P. humilis were tested, and the related data were analyzed by cluster analysis. Through this research, the seed quality test procedure was developed and the seed quality grading standard was formulated. The seed quality of each grade should meet the following requirements: for first grade seeds, germination percentage ≥ 68%, thousand-grain weight 383 g, purity ≥ 93%, seed moisture ≤ 5%; for second grade seeds, germination percentage ≥ 26%, thousand-grain weight ≥ 266 g, purity ≥ 73%, seed moisture ≤ 9%; for third grade seeds, germination percentage ≥ 10%, purity ≥ 50%, thousand-grain weight ≥ 08 g, seed moisture ≤ 13%.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carmichael, Joshua Daniel; Carr, Christina; Pettit, Erin C.

    We apply a fully autonomous icequake detection methodology to a single day of high-sample rate (200 Hz) seismic network data recorded from the terminus of Taylor Glacier, ANT that temporally coincided with a brine release episode near Blood Falls (May 13, 2014). We demonstrate a statistically validated procedure to assemble waveforms triggered by icequakes into populations of clusters linked by intra-event waveform similarity. Our processing methodology implements a noise-adaptive power detector coupled with a complete-linkage clustering algorithm and noise-adaptive correlation detector. This detector-chain reveals a population of 20 multiplet sequences that includes ~150 icequakes and produces zero false alarms on the concurrent, diurnally variable noise. Our results are very promising for identifying changes in background seismicity associated with the presence or absence of brine release episodes. We thereby suggest that our methodology could be applied to longer time periods to establish a brine-release monitoring program for Blood Falls that is based on icequake detections.

  8. Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review

    PubMed Central

    Morris, Tom; Gray, Laura

    2017-01-01

    Objectives To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Setting Any, not limited to healthcare settings. Participants Any taking part in an SW-CRT published up to March 2016. Primary and secondary outcome measures The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Results Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22–0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of that which had been assumed. Conclusions Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed. PMID:29146637

  9. Analyzing simulation-based PRA data through traditional and topological clustering: A BWR station blackout case study

    DOE PAGES

    Maljovec, D.; Liu, S.; Wang, B.; ...

    2015-07-14

    Here, dynamic probabilistic risk assessment (DPRA) methodologies couple system simulator codes (e.g., RELAP and MELCOR) with simulation controller codes (e.g., RAVEN and ADAPT). Whereas system simulator codes model system dynamics deterministically, simulation controller codes introduce both deterministic (e.g., system control logic and operating procedures) and stochastic (e.g., component failures and parameter uncertainties) elements into the simulation. Typically, a DPRA is performed by sampling values of a set of parameters and simulating the system behavior for that specific set of parameter values. For complex systems, a major challenge in using DPRA methodologies is to analyze the large number of scenarios generated,more » where clustering techniques are typically employed to better organize and interpret the data. In this paper, we focus on the analysis of two nuclear simulation datasets that are part of the risk-informed safety margin characterization (RISMC) boiling water reactor (BWR) station blackout (SBO) case study. We provide the domain experts a software tool that encodes traditional and topological clustering techniques within an interactive analysis and visualization environment, for understanding the structures of such high-dimensional nuclear simulation datasets. We demonstrate through our case study that both types of clustering techniques complement each other for enhanced structural understanding of the data.« less

  10. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments

    PubMed Central

    2013-01-01

    Background Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. Results To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. Conclusions We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs. PMID:24160725

  11. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments.

    PubMed

    Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello

    2013-10-26

    Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.
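
    The beta-binomial idea can be sketched generically as below: each cluster's count follows a beta-binomial distribution whose overdispersion encodes the intracluster correlation, and the distribution of the total determines the probability of exceeding a decision threshold. The lot sizes, ICC and decision rule are hypothetical, not the paper's Rwanda design.

      # Minimal sketch: acceptance probability of a cluster-LQAS rule under a beta-binomial model.
      import numpy as np
      from scipy.stats import betabinom

      def prob_accept(p, n_clusters, per_cluster, icc, d):
          """P(total successes >= d) when each cluster is beta-binomial with mean p and ICC icc."""
          a = p * (1 - icc) / icc
          b = (1 - p) * (1 - icc) / icc
          pmf = np.array([betabinom.pmf(k, per_cluster, a, b)
                          for k in range(per_cluster + 1)])
          total = np.array([1.0])
          for _ in range(n_clusters):          # convolve the per-cluster distributions
              total = np.convolve(total, pmf)
          return total[d:].sum()

      # Hypothetical rule: "accept" data quality if at least 25 of 30 records are correct.
      for p in (0.70, 0.80, 0.90, 0.95):
          print(p, round(prob_accept(p, n_clusters=6, per_cluster=5, icc=0.1, d=25), 3))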

  12. Dimensional assessment of personality pathology in patients with eating disorders.

    PubMed

    Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L

    1999-02-22

    This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.

  13. Identification of subsurface microorganisms at Yucca Mountain; Third quarterly report, January 1, 1994--March 31, 1994

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stetzenbach, L.D.

    1994-05-01

    Bacteria isolated from ground water samples taken from 31 springs during 1993 were collected and processed according to procedures described in earlier reports. These procedures required aseptic collection of surface water samples in sterile screw-capped containers, transportation to the HRC microbiology laboratory, and culture by spread plating onto R2A medium. The isolates were further processed for identification using a gas chromatographic analysis of fatty acid methyl esters (FAME) extracted from cell membranes. This work generated a presumptive identification of 113 bacterial species distributed among 45 genera using a database obtained from Microbial ID, Inc., Newark, Delaware (MIDI). A preliminary examination of the FAME data was accomplished using cluster analysis and principal component analysis software obtained from MIDI. Typically, bacterial strains that cluster at less than 10 Euclidian distance units have fatty acid patterns consistent among members of the same species. Thus an organism obtained from one source can be recognized if it is isolated again from the same or any other source. This makes it possible to track the distribution of organisms and monitor environmental conditions or fluid transport mechanisms. Microorganisms are seldom found as monocultures in natural environments. They are more likely to be closely associated with other genera with complementary metabolic requirements. An understanding of the indigenous microorganism population is useful in understanding subtle changes in the environment. However, classification of environmental organisms using traditional methods is not ideal because differentiation of species with small variations or genera with very similar taxonomic characteristics is beyond the capabilities of traditional microbiological methods.

  14. A comparative study of mid-infrared diffuse reflection (DR) and attenuated total reflection (ATR) spectroscopy for the detection of fungal infection on RWA2-corn.

    PubMed

    Kos, Gregor; Krska, Rudolf; Lohninger, Hans; Griffiths, Peter R

    2004-01-01

    An investigation into the rapid detection of mycotoxin-producing fungi on corn by two mid-infrared spectroscopic techniques was undertaken. Corn samples from a single genotype (RWA2), comprising blanks and samples contaminated with Fusarium graminearum, were ground, sieved and, after appropriate sample preparation, subjected to mid-infrared spectroscopy using two different accessories (diffuse reflection and attenuated total reflection). The measured spectra were evaluated with principal component analysis (PCA) and the blank and contaminated samples were classified by cluster analysis. Reference data for fungal metabolites were obtained with conventional methods. After extraction and clean-up, each sample was analyzed for the toxin deoxynivalenol (DON) by gas chromatography with electron capture detection (GC-ECD) and ergosterol (a parameter for the total fungal biomass) by high-performance liquid chromatography with diode array detection (HPLC-DAD). The concentration ranges for contaminated samples were 880-3600 microg/kg for ergosterol and 300-2600 microg/kg for DON. Classification efficiency was 100% for ATR spectra. DR spectra did not show as obvious a clustering of contaminated and blank samples. Results and trends were also observed in single spectra plots. Quantification using a PLS1 regression algorithm showed good correlation with DON reference data, but a rather high standard error of prediction (SEP) of 600 microg/kg (DR) and 490 microg/kg (ATR) for ergosterol. A comparison of the measurement procedures showed advantages for the ATR technique, mainly owing to its ease of use, the easier interpretation of its results, and its better classification and quantification performance.
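
    As an illustration of the kind of PCA-plus-cluster-analysis workflow described above, the following Python sketch reduces a spectral matrix to a few principal components and clusters the scores into two groups; the placeholder data, the number of components and the use of k-means are assumptions, not the authors' exact chemometric pipeline.

        # Minimal sketch: PCA scores of preprocessed spectra, then a two-group clustering.
        # `spectra` stands in for an (n_samples x n_wavenumbers) array; `labels` marks
        # blank (0) vs. contaminated (1) reference samples.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        spectra = rng.normal(size=(40, 900))          # placeholder data for illustration
        labels = np.repeat([0, 1], 20)

        scores = PCA(n_components=5).fit_transform(spectra)
        clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)

        # Classification efficiency: agreement between cluster assignment and reference
        # labels, allowing for the arbitrary ordering of the two cluster labels.
        efficiency = max(np.mean(clusters == labels), np.mean(clusters != labels))
        print(f"classification efficiency: {efficiency:.1%}")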

  15. Nursing advocacy in procedural pain care.

    PubMed

    Vaartio, Heli; Leino-Kilpi, Helena; Suominen, Tarja; Puukka, Pauli

    2009-05-01

    In nursing, the concept of advocacy is often understood in terms of reactive or proactive action aimed at protecting patients' legal or moral rights. However, advocacy activities have not often been researched in the context of everyday clinical nursing practice, at least from patients' point of view. This study investigated the implementation of nursing advocacy in the context of procedural pain care from the perspectives of both patients and nurses. The cross-sectional study was conducted on a cluster sample of surgical otolaryngology patients (n = 405) and nurses (n = 118) from 12 hospital units in Finland. The data were obtained using an instrument specially designed for this purpose, and analysed statistically by descriptive and non-parametric methods. According to the results, patients and nurses have slightly different views about which dimensions of advocacy are implemented in procedural pain care. It seems that advocacy acts are chosen and implemented rather haphazardly, depending partly on how active patients are in expressing their wishes and interests and partly on nurses' empowerment.

  16. Toothbrushing procedure in schoolchildren with no previous formal instruction: variables associated to dental biofilm removal.

    PubMed

    Rossi, Glenda N; Sorazabal, Ana L; Salgado, Pablo A; Squassi, Aldo F; Klemonskis, Graciela L

    2016-04-01

    The aim of this study was to establish the association between features regarding brushing procedure performed by schoolchildren without previous formal training and the effectiveness of biofilm removal. Out of a population of 8900 6- and 7-year-old schoolchildren in Buenos Aires City, 600 children were selected from schools located in homogeneous risk areas. Informed consent was requested from parents or guardians and formal assent was obtained from children themselves. The final sample consisted of 316 subjects. The following tooth brushing variables were analyzed: toothbrush-gripping, orientation of active part of bristles with respect to the tooth, type of movement applied, brushing both jaws together or separately, inclusion of all 6 sextants, and duration of brushing. The level of dental biofilm after brushing was determined by O'Leary's index, acceptable cut-off point = 20%. Four calibrated dentists performed observations and clinical examinations. Frequency distribution, central tendency and dispersion measures were calculated. Cluster analyses were performed; proportions of variables for each cluster were compared with Bonferroni's correction and OR was obtained. The most frequent categories were: palm gripping (71.51%); perpendicular orientation (85.8%); horizontal movement (95.6%); separate addressing of jaws (68%) and inclusion of all 6 sextants (50.6%). Mean duration of brushing was 48.78 ± 27.36 seconds. 42.7% of the children achieved an acceptable biofilm level. The cluster with the highest proportion of subjects with acceptable post-brushing biofilm levels (p<0.05) differed significantly from the rest for the variable "inclusion of all 6 sextants in brushing procedure". OR was 2.538 (CI 95% 1.603 - 4.017). Inclusion of all six sextants could be a determinant variable for the removal of biofilm by brushing in schoolchildren, and should be systematized as a component in oral hygiene education. Sociedad Argentina de Investigación Odontológica.

  17. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme, to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can dramatically impact the classification error that is associated with LQAS analysis. PMID:20011037
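
    The classification step in such a clustered LQAS design can be sketched with a short simulation; the beta-binomial model for within-cluster correlation, the decision rule and the prevalences below are illustrative assumptions, not the parameters used in the study.

        # Simulate a 67x3 cluster design and estimate how often it classifies an area
        # as "high prevalence" under a given true GAM prevalence.
        import numpy as np

        def simulate_lqas(n_clusters=67, m=3, prevalence=0.10, rho=0.10,
                          decision_rule=25, n_sims=5000, seed=1):
            rng = np.random.default_rng(seed)
            # Beta-binomial clusters: the Beta parameters are chosen so that the
            # intracluster correlation equals rho.
            a = prevalence * (1 - rho) / rho
            b = (1 - prevalence) * (1 - rho) / rho
            p_cluster = rng.beta(a, b, size=(n_sims, n_clusters))
            cases = rng.binomial(m, p_cluster).sum(axis=1)
            # Classify "high prevalence" when the total case count exceeds the rule.
            return np.mean(cases > decision_rule)

        print("P(classified high | p = 0.10):", simulate_lqas(prevalence=0.10))
        print("P(classified high | p = 0.15):", simulate_lqas(prevalence=0.15))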

  18. An Exemplar-Based Multi-View Domain Generalization Framework for Visual Recognition.

    PubMed

    Niu, Li; Li, Wen; Xu, Dong; Cai, Jianfei

    2018-02-01

    In this paper, we propose a new exemplar-based multi-view domain generalization (EMVDG) framework for visual recognition by learning robust classifiers that are able to generalize well to an arbitrary target domain based on the training samples with multiple types of features (i.e., multi-view features). In this framework, we aim to address two issues simultaneously. First, the distribution of training samples (i.e., the source domain) is often considerably different from that of testing samples (i.e., the target domain), so the performance of the classifiers learnt on the source domain may drop significantly on the target domain. Moreover, the testing data are often unseen during the training procedure. Second, when the training data are associated with multi-view features, the recognition performance can be further improved by exploiting the relation among multiple types of features. To address the first issue, considering that it has been shown that fusing multiple SVM classifiers can enhance the domain generalization ability, we build our EMVDG framework upon exemplar SVMs (ESVMs), in which a set of ESVM classifiers are learnt with each one trained based on one positive training sample and all the negative training samples. When the source domain contains multiple latent domains, the learnt ESVM classifiers are expected to be grouped into multiple clusters. To address the second issue, we propose two approaches under the EMVDG framework based on the consensus principle and the complementary principle, respectively. Specifically, we propose an EMVDG_CO method by adding a co-regularizer to enforce the cluster structures of ESVM classifiers on different views to be consistent based on the consensus principle. Inspired by multiple kernel learning, we also propose another EMVDG_MK method by fusing the ESVM classifiers from different views based on the complementary principle. In addition, we further extend our EMVDG framework to an exemplar-based multi-view domain adaptation (EMVDA) framework when the unlabeled target domain data are available during the training procedure. The effectiveness of our EMVDG and EMVDA frameworks for visual recognition is clearly demonstrated by comprehensive experiments on three benchmark data sets.
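
    A minimal sketch of the exemplar-SVM building block the framework rests on: one linear SVM per positive training sample (that sample versus all negatives), followed by clustering of the resulting weight vectors; the synthetic features, regularization constant and cluster count are assumptions.

        # One exemplar SVM per positive sample, then group the classifiers; the groups
        # are expected to track latent domains, as described above.
        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        positives = rng.normal(1.0, 1.0, size=(30, 16))
        negatives = rng.normal(-1.0, 1.0, size=(200, 16))

        weights = []
        for x in positives:
            X = np.vstack([x[None, :], negatives])
            y = np.r_[1, np.zeros(len(negatives))]
            clf = LinearSVC(C=1.0).fit(X, y)          # one exemplar classifier
            weights.append(clf.coef_.ravel())

        groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(np.array(weights))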

  19. Spatial scan statistics for detection of multiple clusters with arbitrary shapes.

    PubMed

    Lin, Pei-Sheng; Kung, Yi-Hung; Clayton, Murray

    2016-12-01

    In applying scan statistics for public health research, it would be valuable to develop a detection method for multiple clusters that accommodates spatial correlation and covariate effects in an integrated model. In this article, we connect the concepts of the likelihood ratio (LR) scan statistic and the quasi-likelihood (QL) scan statistic to provide a series of detection procedures sufficiently flexible to apply to clusters of arbitrary shape. First, we use an independent scan model for detection of clusters and then a variogram tool to examine the existence of spatial correlation and regional variation based on residuals of the independent scan model. When the estimate of regional variation is significantly different from zero, a mixed QL estimating equation is developed to estimate coefficients of geographic clusters and covariates. We use the Benjamini-Hochberg procedure (1995) to find a threshold for p-values to address the multiple testing problem. A quasi-deviance criterion is used to regroup the estimated clusters to find geographic clusters with arbitrary shapes. We conduct simulations to compare the performance of the proposed method with other scan statistics. For illustration, the method is applied to enterovirus data from Taiwan. © 2016, The International Biometric Society.
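
    The Benjamini-Hochberg step mentioned above is compact enough to write out; the sketch below applies the standard procedure to a vector of illustrative cluster p-values at an assumed false discovery rate q.

        import numpy as np

        def benjamini_hochberg(pvals, q=0.05):
            """Return a boolean mask of rejected hypotheses at FDR level q."""
            pvals = np.asarray(pvals)
            m = len(pvals)
            order = np.argsort(pvals)
            below = pvals[order] <= q * np.arange(1, m + 1) / m
            k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
            rejected = np.zeros(m, dtype=bool)
            rejected[order[:k]] = True                # reject the k smallest p-values
            return rejected

        print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20, 0.74]))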

  20. A class of spherical, truncated, anisotropic models for application to globular clusters

    NASA Astrophysics Data System (ADS)

    de Vita, Ruggero; Bertin, Giuseppe; Zocchi, Alice

    2016-05-01

    Recently, a class of non-truncated, radially anisotropic models (the so-called f(ν)-models), originally constructed in the context of violent relaxation and modelling of elliptical galaxies, has been found to possess interesting qualities in relation to observed and simulated globular clusters. In view of new applications to globular clusters, we improve this class of models along two directions. To make them more suitable for the description of small stellar systems hosted by galaxies, we introduce a "tidal" truncation by means of a procedure that guarantees full continuity of the distribution function. The new fT(ν)-models are shown to provide a better fit to the observed photometric and spectroscopic profiles for a sample of 13 globular clusters studied earlier by means of non-truncated models; interestingly, the best-fit models also perform better with respect to the radial-orbit instability. Then, we design a flexible but simple two-component family of truncated models to study the separate issues of mass segregation and multiple populations. We do not aim at a fully realistic description of globular clusters to compete with the description currently obtained by means of dedicated simulations. The goal here is to try to identify the simplest models, that is, those with the smallest number of free parameters that still have the capacity to provide a reasonable description of clusters that are evidently beyond the reach of one-component models. With this tool, we aim at identifying the key factors that characterize mass segregation or the presence of multiple populations. To reduce the relevant parameter space, we formulate a few physical arguments based on recent observations and simulations. A first application to two well-studied globular clusters is briefly described and discussed.

  1. Navigating complex sample analysis using national survey data.

    PubMed

    Saylor, Jennifer; Friedmann, Erika; Lee, Hyeon Joo

    2012-01-01

    The National Center for Health Statistics conducts the National Health and Nutrition Examination Survey and other national surveys with probability-based complex sample designs. Goals of national surveys are to provide valid data for the population of the United States. Analyses of data from population surveys present unique challenges in the research process but are valuable avenues to study the health of the United States population. The aim of this study was to demonstrate the importance of using complex data analysis techniques for data obtained with a complex multistage sampling design and to provide an example of analysis using the SPSS Complex Samples procedure. Challenges and solutions specific to secondary data analysis of national databases are illustrated using the National Health and Nutrition Examination Survey as the exemplar. Oversampling of small or sensitive groups provides necessary estimates of variability within small groups. Use of weights without complex samples accurately estimates population means and frequencies from the sample after accounting for over- or undersampling of specific groups. Weighting alone leads to inappropriate population estimates of variability, because variances are computed as if the measures were from the entire population rather than from a sample. The SPSS Complex Samples procedure allows inclusion of all sampling design elements: stratification, clusters, and weights. Use of national data sets allows use of extensive, expensive, and well-documented survey data for exploratory questions but limits analysis to those variables included in the data set. The large sample permits examination of multiple predictors and interactive relationships. Merging data files, availability of data in several waves of surveys, and complex sampling are techniques used to provide a representative sample but present unique challenges. Sophisticated data analysis techniques optimize the use of these data.
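
    The point about weights versus full design information can be illustrated with a small NumPy/pandas sketch that computes a weighted mean together with a design-based standard error using strata and PSUs (the usual with-replacement approximation); the function and the NHANES-style column names in the usage comment are illustrative, not the SPSS Complex Samples implementation.

        import numpy as np
        import pandas as pd

        def weighted_mean_with_design_se(df, y, w, stratum, psu):
            """Weighted mean and Taylor-linearized SE; assumes >= 2 PSUs per stratum."""
            wmean = np.average(df[y], weights=df[w])
            # Linearized scores for the ratio estimator of the mean.
            df = df.assign(_z=df[w] * (df[y] - wmean) / df[w].sum())
            var = 0.0
            for _, s in df.groupby(stratum):
                totals = s.groupby(psu)["_z"].sum()    # one total per PSU
                n_psu = len(totals)
                var += n_psu / (n_psu - 1) * ((totals - totals.mean()) ** 2).sum()
            return wmean, np.sqrt(var)

        # e.g. weighted_mean_with_design_se(nhanes, "BMXBMI", "WTMEC2YR", "SDMVSTRA", "SDMVPSU")
        # (illustrative NHANES-style column names)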

  2. Cluster decomposition of full configuration interaction wave functions: A tool for chemical interpretation of systems with strong correlation

    NASA Astrophysics Data System (ADS)

    Lehtola, Susi; Tubman, Norm M.; Whaley, K. Birgitta; Head-Gordon, Martin

    2017-10-01

    Approximate full configuration interaction (FCI) calculations have recently become tractable for systems of unforeseen size, thanks to stochastic and adaptive approximations to the exponentially scaling FCI problem. The result of an FCI calculation is a weighted set of electronic configurations, which can also be expressed in terms of excitations from a reference configuration. The excitation amplitudes contain information on the complexity of the electronic wave function, but this information is contaminated by contributions from disconnected excitations, i.e., those excitations that are just products of independent lower-level excitations. The unwanted contributions can be removed via a cluster decomposition procedure, making it possible to examine the importance of connected excitations in complicated multireference molecules which are outside the reach of conventional algorithms. We present an implementation of the cluster decomposition analysis and apply it to both true FCI wave functions, as well as wave functions generated from the adaptive sampling CI algorithm. The cluster decomposition is useful for interpreting calculations in chemical studies, as a diagnostic for the convergence of various excitation manifolds, as well as a guidepost for polynomially scaling electronic structure models. Applications are presented for (i) the double dissociation of water, (ii) the carbon dimer, (iii) the π space of polyacenes, and (iv) the chromium dimer. While the cluster amplitudes exhibit rapid decay with an increasing rank for the first three systems, even connected octuple excitations still appear important in Cr2, suggesting that spin-restricted single-reference coupled-cluster approaches may not be tractable for some problems in transition metal chemistry.
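
    Schematically, assuming intermediate normalization and suppressing orbital indices, the cluster decomposition referred to above relates the CI excitation operators C_n to connected cluster amplitudes T_n as

        C_1 = T_1, \qquad C_2 = T_2 + \tfrac{1}{2}T_1^2, \qquad C_3 = T_3 + T_1 T_2 + \tfrac{1}{6}T_1^3,

    so the connected amplitudes follow recursively, e.g. T_2 = C_2 - \tfrac{1}{2}C_1^2; subtracting the product terms is what removes the disconnected contributions discussed above.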

  3. Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nurgaliev, D.; McDonald, M.; Benson, B. A.

    We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters at 0.35 < z < 0.9 selected in the X-ray with the ROSAT PSPC 400 deg^2 survey, and a sample of 90 clusters at 0.25 < z < 1.2 selected via the Sunyaev–Zel’dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry (A_phot). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of z ~ 0.3 to z ~ 1, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.

  4. Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters

    DOE PAGES

    Nurgaliev, D.; McDonald, M.; Benson, B. A.; ...

    2017-05-16

    We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters at 0.35 < z < 0.9 selected in the X-ray with the ROSAT PSPC 400 deg^2 survey, and a sample of 90 clusters at 0.25 < z < 1.2 selected via the Sunyaev–Zel’dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry (A_phot). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of z ~ 0.3 to z ~ 1, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.

  5. Re-estimating sample size in cluster randomised trials with active recruitment within clusters.

    PubMed

    van Schie, S; Moerbeek, M

    2014-08-30

    Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster level and individual level variance should be known before the study starts, but this is often not the case. We suggest using an internal pilot study design to address this problem of unknown variances. A pilot can be useful to re-estimate the variances and re-calculate the sample size during the trial. Using simulated data, it is shown that an initially low or high power can be adjusted using an internal pilot with the type I error rate remaining within an acceptable range. The intracluster correlation coefficient can be re-estimated with more precision, which has a positive effect on the sample size. We conclude that an internal pilot study design may be used if active recruitment is feasible within a limited number of clusters. Copyright © 2014 John Wiley & Sons, Ltd.
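
    The re-estimation idea can be made concrete with a small helper that recomputes the number of clusters per arm once the intracluster correlation has been updated from the internal pilot; the effect size, cluster size, power and the normal-approximation formula below are illustrative assumptions, not the authors' simulation setup.

        # Clusters per arm for a two-arm comparison of means, using the design effect
        # 1 + (m - 1) * ICC; re-running with the pilot-based ICC updates the sample size.
        from math import ceil
        from scipy import stats

        def clusters_needed(delta, sigma, m, icc, alpha=0.05, power=0.80):
            z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)
            deff = 1 + (m - 1) * icc
            n_per_arm = 2 * (z * sigma / delta) ** 2 * deff    # individuals per arm
            return ceil(n_per_arm / m)                         # clusters per arm

        print(clusters_needed(delta=0.5, sigma=1.0, m=20, icc=0.05))   # planning ICC
        print(clusters_needed(delta=0.5, sigma=1.0, m=20, icc=0.10))   # pilot-updated ICC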

  6. An optical catalog of galaxy clusters obtained from an adaptive matched filter finder applied to SDSS DR9 data

    NASA Astrophysics Data System (ADS)

    Banerjee, P.; Szabo, T.; Pierpaoli, E.; Franco, G.; Ortiz, M.; Oramas, A.; Tornello, B.

    2018-01-01

    We present a new galaxy cluster catalog constructed from the Sloan Digital Sky Survey Data Release 9 (SDSS DR9) using an Adaptive Matched Filter (AMF) technique. Our catalog has 46,479 galaxy clusters with richness Λ200 > 20 in the redshift range 0.045 ≤ z < 0.641 in ∼11,500 deg^2 of the sky. Angular position, richness, core and virial radii and redshift estimates for these clusters, as well as their error analysis, are provided as part of this catalog. In addition to the main version of the catalog, we also provide an extended version with a lower richness cut, containing 79,368 clusters. This version, in addition to the clusters in the main catalog, also contains those clusters (with richness 10 < Λ200 < 20) which have a one-to-one match in the DR8 catalog developed by Wen et al. (WHL). We obtain probabilities for cluster membership for each galaxy and implement several procedures for the identification and removal of false cluster detections. We cross-correlate the main AMF DR9 catalog with a number of cluster catalogs in different wavebands (Optical, X-ray). We compare our catalog with other SDSS-based ones such as the redMaPPer (26,350 clusters) and the Wen et al. (WHL) (132,684 clusters) in the same area of the sky and in the overlapping redshift range. We match 97% of the richest Abell clusters (Richness group 3), the same as WHL, while redMaPPer matches ∼ 90% of these clusters. Considering AMF DR9 richness bins, redMaPPer does not have one-to-one matches for 70% of our lowest richness clusters (20 < Λ200 < 40), while WHL matches 54% of these missed clusters (not present in redMaPPer). redMaPPer consistently does not possess one-to-one matches for ∼ 20% of AMF DR9 clusters with Λ200 > 40, while WHL matches ≥ 70% of these missed clusters on average. For comparisons with X-ray clusters, we match the AMF catalog with BAX, MCXC and a combined catalog from NORAS and REFLEX. We consistently obtain a greater number of one-to-one matches for X-ray clusters across higher luminosity bins (Lx > 6 × 10^44 ergs/sec) than redMaPPer while WHL matches the most clusters overall. For the most luminous clusters (Lx > 8), our catalog performs equivalently to WHL. This new catalog provides a wider sample than redMaPPer while retaining many fewer objects than WHL.

  7. The X-CLASS-redMaPPer galaxy cluster comparison. I. Identification procedures

    NASA Astrophysics Data System (ADS)

    Sadibekova, T.; Pierre, M.; Clerc, N.; Faccioli, L.; Gastaud, R.; Le Fevre, J.-P.; Rozo, E.; Rykoff, E.

    2014-11-01

    Context. This paper is the first in a series undertaking a comprehensive correlation analysis between optically selected and X-ray-selected cluster catalogues. The rationale of the project is to develop a holistic picture of galaxy clusters utilising optical and X-ray-cluster-selected catalogues with well-understood selection functions. Aims: Unlike most of the X-ray/optical cluster correlations to date, the present paper focuses on the non-matching objects in either waveband. We investigate how the differences observed between the optical and X-ray catalogues may stem from (1) a shortcoming of the detection algorithms; (2) dispersion in the X-ray/optical scaling relations; or (3) substantial intrinsic differences between the cluster populations probed in the X-ray and optical bands. The aim is to inventory and elucidate these effects in order to account for selection biases in the further determination of X-ray/optical cluster scaling relations. Methods: We correlated the X-CLASS serendipitous cluster catalogue extracted from the XMM archive with the redMaPPer optical cluster catalogue derived from the Sloan Digital Sky Survey (DR8). We performed a detailed and, in large part, interactive analysis of the matching output from the correlation. The overlap between the two catalogues has been accurately determined and possible cluster positional errors were manually recovered. The final samples comprise 270 and 355 redMaPPer and X-CLASS clusters, respectively. X-ray cluster matching rates were analysed as a function of optical richness. In the second step, the redMaPPer clusters were correlated with the entire X-ray catalogue, containing point and uncharacterised sources (down to a few 10^-15 erg s^-1 cm^-2 in the [0.5-2] keV band). A stacking analysis was performed for the remaining undetected optical clusters. Results: We find that all rich (λ ≥ 80) clusters are detected in X-rays out to z = 0.6. Below this redshift, the richness threshold for X-ray detection steadily decreases with redshift. Likewise, all X-ray bright clusters are detected by redMaPPer. After correcting for obvious pipeline shortcomings (about 10% of the cases both in optical and X-ray), ~50% of the redMaPPer (down to a richness of 20) are found to coincide with an X-CLASS cluster; when considering X-ray sources of any type, this fraction increases to ~80%; for the remaining objects, the stacking analysis finds a weak signal within 0.5 Mpc around the cluster optical centres. The fraction of clusters totally dominated by AGN-type emission appears to be a few percent. Conversely, ~40% of the X-CLASS clusters are identified with a redMaPPer (down to a richness of 20) - part of the non-matches being due to the X-CLASS sample extending further out than redMaPPer (z < 1.5 vs. z < 0.6), but extending the correlation down to a richness of 5 raises the matching rate to ~65%. Conclusions: This state-of-the-art study involving two well-validated cluster catalogues has shown itself to be complex, and it points to a number of issues inherent to blind cross-matching, owing both to pipeline shortcomings and to peculiar cluster properties. These can only be accounted for after a manual check. The combined X-ray and optical scaling relations will be presented in a subsequent article.

  8. A PRIOR EVALUATION OF TWO-STAGE CLUSTER SAMPLING FOR ACCURACY ASSESSMENT OF LARGE-AREA LAND-COVER MAPS

    EPA Science Inventory

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...

  9. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining employed today in science, economy and technology, clustering serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges in the clusters' minimal spanning trees that connect points from different samples; in effect, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis of well-mingled samples within the clusters leads to an asymptotically normal distribution of this statistic. Resting upon this fact, the standard score of this edge count is computed, and the partition quality is represented by the worst cluster, i.e., the one with the minimal standard score. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments presented in the paper demonstrate the ability of the approach to detect the true number of clusters.
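
    A sketch of the Friedman-Rafsky statistic used here: build the minimal spanning tree over the pooled samples and count the edges that join points from different samples (well-mingled samples yield many such cross-edges); the synthetic data and Euclidean distances are assumptions.

        import numpy as np
        from scipy.spatial.distance import pdist, squareform
        from scipy.sparse.csgraph import minimum_spanning_tree

        def cross_sample_edges(x, y):
            """Number of MST edges connecting points from different samples."""
            pooled = np.vstack([x, y])
            labels = np.r_[np.zeros(len(x)), np.ones(len(y))]
            mst = minimum_spanning_tree(squareform(pdist(pooled))).tocoo()
            return int(np.sum(labels[mst.row] != labels[mst.col]))

        rng = np.random.default_rng(0)
        a = rng.normal(size=(50, 2))
        b = rng.normal(size=(50, 2))      # same distribution: expect many cross-edges
        print(cross_sample_edges(a, b))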

  10. 2-Way k-Means as a Model for Microbiome Samples.

    PubMed

    Jackson, Weston J; Agarwal, Ipsita; Pe'er, Itsik

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project.
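
    A toy variant of generalized k-means in the spirit of the 2-way assignment described above: each sample may be assigned either to a single centroid or to the midpoint of a pair of centroids, whichever fits it better. This is an illustrative sketch, not the authors' published algorithm.

        import numpy as np

        def two_way_kmeans(X, k, n_iter=20, seed=0):
            rng = np.random.default_rng(seed)
            centers = X[rng.choice(len(X), k, replace=False)]
            for _ in range(n_iter):
                # Candidate prototypes: the k centroids plus all pairwise midpoints.
                pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
                protos = np.vstack([centers] + [(centers[i] + centers[j]) / 2 for i, j in pairs])
                assign = np.argmin(((X[:, None, :] - protos[None]) ** 2).sum(-1), axis=1)
                # Update each centroid from the samples that use it, alone or in a pair.
                for c in range(k):
                    uses = [c] + [k + p for p, (i, j) in enumerate(pairs) if c in (i, j)]
                    members = X[np.isin(assign, uses)]
                    if len(members):
                        centers[c] = members.mean(axis=0)
            return centers, assign

        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(m, 0.3, size=(40, 2)) for m in (0, 2, 4)])
        centers, assign = two_way_kmeans(X, k=3)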

  11. 2-Way k-Means as a Model for Microbiome Samples

    PubMed Central

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project. PMID:29177026

  12. The Study on Mental Health at Work: Design and sampling.

    PubMed

    Rose, Uwe; Schiel, Stefan; Schröder, Helmut; Kleudgen, Martin; Tophoven, Silke; Rauch, Angela; Freude, Gabriele; Müller, Grit

    2017-08-01

    The Study on Mental Health at Work (S-MGA) generates the first nationwide representative survey enabling the exploration of the relationship between working conditions, mental health and functioning. This paper describes the study design, sampling procedures and data collection, and presents a summary of the sample characteristics. S-MGA is a representative study of German employees aged 31-60 years subject to social security contributions. The sample was drawn from the employment register based on a two-stage cluster sampling procedure. Firstly, 206 municipalities were randomly selected from a pool of 12,227 municipalities in Germany. Secondly, 13,590 addresses were drawn from the selected municipalities for the purpose of conducting 4500 face-to-face interviews. The questionnaire covers psychosocial working and employment conditions, measures of mental health, work ability and functioning. Data from personal interviews were combined with employment histories from register data. Descriptive statistics of socio-demographic characteristics and logistic regression analyses were used for comparing the population, the gross sample and the respondents. In total, 4511 face-to-face interviews were conducted. A test for sampling bias revealed that individuals in older cohorts participated more often, while individuals with an unknown educational level, residing in major cities or with a non-German ethnic background were slightly underrepresented. There is no indication of major deviations in characteristics between the basic population and the sample of respondents. Hence, S-MGA provides representative data for research on work and health, designed as a cohort study with plans to rerun the survey 5 years after the first assessment.
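
    The two-stage cluster sampling scheme can be sketched as follows; the data frame, the column names and the use of probability-proportional-to-size selection at the first stage are assumptions made for illustration.

        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(42)

        # Stage 1: sample 206 municipalities (primary sampling units), here with
        # probability proportional to a size measure.
        municipalities = pd.DataFrame({
            "id": np.arange(12227),
            "n_employees": rng.integers(200, 50000, size=12227),
        })
        p = municipalities["n_employees"] / municipalities["n_employees"].sum()
        stage1 = rng.choice(municipalities["id"].to_numpy(), size=206, replace=False, p=p.to_numpy())

        # Stage 2: draw a fixed number of addresses within each selected municipality
        # (random indices stand in for register addresses).
        addresses_per_psu = 13590 // 206
        sample = {psu: rng.integers(0, municipalities.loc[psu, "n_employees"], size=addresses_per_psu)
                  for psu in stage1}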

  13. X-Ray Morphological Analysis of the Planck ESZ Clusters

    NASA Astrophysics Data System (ADS)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Ettori, Stefano; Andrade-Santos, Felipe; Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W.; Randall, Scott; Kraft, Ralph

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are well suited to studying how clusters form and grow and to testing physical models, may complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev-Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev-Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best at distinguishing between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found no dependence of the cluster dynamical state on mass. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  14. X-Ray Morphological Analysis of the Planck ESZ Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are well suited to studying how clusters form and grow and to testing physical models, may complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best at distinguishing between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found no dependence of the cluster dynamical state on mass. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  15. Feasibility of a novel one-stop ISET device to capture CTCs and its clinical application

    PubMed Central

    Zheng, Liang; Zhi, Xuan; Cheng, Boran; Chen, Yuanyuan; Zhang, Chunxiao; Shi, Dongdong; Song, Haibin; Cai, Congli; Zhou, Pengfei; Xiong, Bin

    2017-01-01

    Introduction Circulating tumor cells (CTCs) play a crucial role in cancer metastasis. In this study, we introduced a novel isolation by size of epithelial tumor cells (ISET) device with an automatic isolation and staining procedure, named one-stop ISET (osISET), and validated its feasibility for capturing CTCs from cancer patients. Moreover, we aimed to investigate the correlation between clinicopathologic features and CTCs in colorectal cancer (CRC) in order to explore its clinical application. Results The capture efficiency ranged from 80.3% to 88% with tumor cells spiked into medium and from 67% to 78.3% with tumor cells spiked into healthy donors’ blood. In blood samples of 72 CRC patients, CTCs and clusters of circulating tumor cells (CTC-clusters) were detected with positive rates of 52.8% (38/72) and 18.1% (13/72), respectively. Moreover, the CTC positive rate was associated with lymphatic or venous invasion, tumor depth, lymph node metastasis and TNM stage in CRC patients (p < 0.01). Lymphocyte count and neutrophil to lymphocyte ratio (NLR) were significantly different between CTC positive and negative groups (p < 0.01). Materials and Methods The capture efficiency of the device was tested by spiking cancer cells (MCF-7, A549, SW480, Hela) into medium or blood samples of healthy donors. Blood samples of 72 CRC patients were examined with the osISET device. The clinicopathologic characteristics of the 72 CRC patients were collected and their association with CTC positive rate or CTC count was analyzed. Conclusions The osISET device was feasible for capturing and identifying CTCs and CTC-clusters from cancer patients. In addition, the device holds potential for application in cancer management. PMID:27935872

  16. Constructing simple yet accurate potentials for describing the solvation of HCl/water clusters in bulk helium and nanodroplets.

    PubMed

    Boese, A Daniel; Forbert, Harald; Masia, Marco; Tekin, Adem; Marx, Dominik; Jansen, Georg

    2011-08-28

    The infrared spectroscopy of molecules, complexes, and molecular aggregates dissolved in superfluid helium clusters, commonly called HElium NanoDroplet Isolation (HENDI) spectroscopy, is an established, powerful experimental technique for extracting high resolution ro-vibrational spectra at ultra-low temperatures. Realistic quantum simulations of such systems, in particular in cases where the solute is undergoing a chemical reaction, require accurate solute-helium potentials which are also simple enough to be efficiently evaluated over the vast number of steps required in typical Monte Carlo or molecular dynamics sampling. This precludes using global potential energy surfaces as often parameterized for small complexes in the realm of high-resolution spectroscopic investigations that, in view of the computational effort imposed, are focused on the intermolecular interaction of rigid molecules with helium. Simple Lennard-Jones-like pair potentials, on the other hand, fall short in providing the required flexibility and accuracy in order to account for chemical reactions of the solute molecule. Here, a general scheme of constructing sufficiently accurate site-site potentials for use in typical quantum simulations is presented. This scheme employs atom-based grids, accounts for local and global minima, and is applied to the special case of a HCl(H(2)O)(4) cluster solvated by helium. As a first step, accurate interaction energies of a helium atom with a set of representative configurations sampled from a trajectory following the dissociation of the HCl(H(2)O)(4) cluster were computed using an efficient combination of density functional theory and symmetry-adapted perturbation theory, i.e. the DFT-SAPT approach. For each of the sampled cluster configurations, a helium atom was placed at several hundred positions distributed in space, leading to an overall number of about 400,000 such quantum chemical calculations. The resulting total interaction energies, decomposed into several energetic contributions, served to fit a site-site potential, where the sites are located at the atomic positions and, additionally, pseudo-sites are distributed along the lines joining pairs of atom sites within the molecular cluster. This approach ensures that this solute-helium potential is able to describe both undissociated molecular and dissociated (zwitter-) ionic configurations, as well as the interconnecting reaction pathway without re-adjusting partial charges or other parameters depending on the particular configuration. Test calculations of the larger HCl(H(2)O)(5) cluster interacting with helium demonstrate the transferability of the derived site-site potential. This specific potential can be readily used in quantum simulations of such HCl/water clusters in bulk helium or helium nanodroplets, whereas the underlying construction procedure can be generalized to other molecular solutes in other atomic solvents such as those encountered in rare gas matrix isolation spectroscopy.

  17. Adaptive Cluster Sampling for Forest Inventories

    Treesearch

    Francis A. Roesch

    1993-01-01

    Adaptive cluster sampling is shown to be a viable alternative for sampling forests when there are rare characteristics of the forest trees which are of interest and occur on clustered trees. The ideas of recent work in Thompson (1990) have been extended to the case in which the initial sample is selected with unequal probabilities. An example is given in which the...

  18. Sample size adjustments for varying cluster sizes in cluster randomized trials with binary outcomes analyzed with second-order PQL mixed logistic regression.

    PubMed

    Candel, Math J J M; Van Breukelen, Gerard J P

    2010-06-30

    Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
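
    For orientation, the efficiency loss from varying cluster sizes is often gauged with the approximation RE ≈ 1 - CV^2·λ(1-λ), where λ = nρ/(nρ + 1 - ρ); the correction factors derived in the paper for second-order PQL differ in detail, and the mean cluster size, ICC and CV used below are assumptions.

        def relative_efficiency(n_mean, rho, cv):
            """Approximate efficiency of unequal vs. equal cluster sizes."""
            lam = n_mean * rho / (n_mean * rho + 1 - rho)
            return 1 - cv ** 2 * lam * (1 - lam)

        re = relative_efficiency(n_mean=30, rho=0.05, cv=0.6)
        extra = 1 / re - 1                    # proportion of extra clusters to recruit
        print(f"RE = {re:.3f}, recruit {extra:.1%} more clusters")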

  19. Searching for the 3.5 keV Line in the Stacked Suzaku Observations of Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Bulbul, Esra; Markevitch, Maxim; Foster, Adam; Miller, Eric; Bautz, Mark; Lowenstein, Mike; Randall, Scott W.; Smith, Randall K.

    2016-01-01

    We perform a detailed study of the stacked Suzaku observations of 47 galaxy clusters, spanning a redshift range of 0.01-0.45, to search for the unidentified 3.5 keV line. This sample provides an independent test for the previously detected line. We detect a 2sigma-significant spectral feature at 3.5 keV in the spectrum of the full sample. When the sample is divided into two subsamples (cool-core and non-cool core clusters), the cool-core subsample shows no statistically significant positive residuals at the line energy. A very weak (approx. 2sigma confidence) spectral feature at 3.5 keV is permitted by the data from the non-cool-core clusters sample. The upper limit on a neutrino decay mixing angle of sin^2(2theta) = 6.1 x 10^-11 from the full Suzaku sample is consistent with the previous detections in the stacked XMM-Newton sample of galaxy clusters (which had a higher statistical sensitivity to faint lines), M31, and Galactic center, at a 90% confidence level. However, the constraint from the present sample, which does not include the Perseus cluster, is in tension with previously reported line flux observed in the core of the Perseus cluster with XMM-Newton and Suzaku.

  20. Objective sampling design in a highly heterogeneous landscape - characterizing environmental determinants of malaria vector distribution in French Guiana, in the Amazonian region.

    PubMed

    Roux, Emmanuel; Gaborit, Pascal; Romaña, Christine A; Girod, Romain; Dessay, Nadine; Dusfour, Isabelle

    2013-12-01

    Sampling design is a key issue when establishing species inventories and characterizing habitats within highly heterogeneous landscapes. Sampling efforts in such environments may be constrained and many field studies only rely on subjective and/or qualitative approaches to design collection strategy. The region of Cacao, in French Guiana, provides an excellent study site to understand the presence and abundance of Anopheles mosquitoes, their species dynamics and the transmission risk of malaria across various environments. We propose an objective methodology to define a stratified sampling design. Following thorough environmental characterization, a factorial analysis of mixed groups allows the data to be reduced and non-collinear principal components to be identified while balancing the influences of the different environmental factors. Such components defined new variables which could then be used in a robust k-means clustering procedure. Then, we identified five clusters that corresponded to our sampling strata and selected sampling sites in each stratum. We validated our method by comparing the species overlap of entomological collections from selected sites and the environmental similarities of the same sites. The Morisita index was significantly correlated (Pearson linear correlation) with environmental similarity based on i) the balanced environmental variable groups considered jointly (p = 0.001) and ii) land cover/use (p-value < 0.001). The Jaccard index was significantly correlated with land cover/use-based environmental similarity (p-value = 0.001). The results validate our sampling approach. Land cover/use maps (based on high spatial resolution satellite images) were shown to be particularly useful when studying the presence, density and diversity of Anopheles mosquitoes at local scales and in very heterogeneous landscapes.
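
    A rough sketch of the stratification workflow: reduce the mixed environmental variables to a few components, cluster the sites into five strata with k-means, and draw candidate sites per stratum; using one-hot encoding plus PCA as a stand-in for the factorial analysis of mixed groups, and the column names below, are assumptions.

        import numpy as np
        import pandas as pd
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(3)
        sites = pd.DataFrame({
            "elevation": rng.normal(60, 15, 200),
            "dist_to_river": rng.exponential(300, 200),
            "land_cover": rng.choice(["forest", "crop", "urban"], 200),
        })

        # One-hot + PCA as a rough stand-in for the factorial analysis of mixed groups.
        X = StandardScaler().fit_transform(pd.get_dummies(sites, columns=["land_cover"]))
        components = PCA(n_components=3).fit_transform(X)
        sites["stratum"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(components)

        # One candidate sampling site per stratum (simple random choice for illustration).
        chosen = sites.groupby("stratum").sample(n=1, random_state=0)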

  1. Objective sampling design in a highly heterogeneous landscape - characterizing environmental determinants of malaria vector distribution in French Guiana, in the Amazonian region

    PubMed Central

    2013-01-01

    Background Sampling design is a key issue when establishing species inventories and characterizing habitats within highly heterogeneous landscapes. Sampling efforts in such environments may be constrained and many field studies only rely on subjective and/or qualitative approaches to design collection strategy. The region of Cacao, in French Guiana, provides an excellent study site to understand the presence and abundance of Anopheles mosquitoes, their species dynamics and the transmission risk of malaria across various environments. We propose an objective methodology to define a stratified sampling design. Following thorough environmental characterization, a factorial analysis of mixed groups allows the data to be reduced and non-collinear principal components to be identified while balancing the influences of the different environmental factors. Such components defined new variables which could then be used in a robust k-means clustering procedure. Then, we identified five clusters that corresponded to our sampling strata and selected sampling sites in each stratum. Results We validated our method by comparing the species overlap of entomological collections from selected sites and the environmental similarities of the same sites. The Morisita index was significantly correlated (Pearson linear correlation) with environmental similarity based on i) the balanced environmental variable groups considered jointly (p = 0.001) and ii) land cover/use (p-value < 0.001). The Jaccard index was significantly correlated with land cover/use-based environmental similarity (p-value = 0.001). Conclusions The results validate our sampling approach. Land cover/use maps (based on high spatial resolution satellite images) were shown to be particularly useful when studying the presence, density and diversity of Anopheles mosquitoes at local scales and in very heterogeneous landscapes. PMID:24289184

  2. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

    PubMed

    Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

    2017-03-15

    Outcome-dependent sampling (ODS) schemes are cost-effective sampling schemes in which one observes the exposure with a probability that depends on the outcome. Well-known examples are the case-control design for a binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate outcomes remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric: all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and establish its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of polychlorinated biphenyl exposure to hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.

  3. "Gap hunting" to characterize clustered probe signals in Illumina methylation array data.

    PubMed

    Andrews, Shan V; Ladd-Acosta, Christine; Feinberg, Andrew P; Hansen, Kasper D; Fallin, M Daniele

    2016-01-01

    The Illumina 450k array has been widely used in epigenetic association studies. Current quality-control (QC) pipelines typically remove certain sets of probes, such as those containing a SNP or with multiple mapping locations. An additional set of potentially problematic probes are those with DNA methylation distributions characterized by two or more distinct clusters separated by gaps. Data-driven identification of such probes may offer additional insights for downstream analyses. We developed a procedure, termed "gap hunting," to identify probes showing clustered distributions. Among 590 peripheral blood samples from the Study to Explore Early Development, we identified 11,007 "gap probes." The vast majority (9199) are likely attributable to an underlying SNP(s) or other variant in the probe, although SNP-affected probes exist that do not produce gap signals. Specific factors predict which SNPs lead to gap signals, including type of nucleotide change, probe type, DNA strand, and overall methylation state. These expected effects are demonstrated in paired genotype and 450k data on the same samples. Gap probes can also serve as a surrogate for the local genetic sequence on a haplotype scale and can be used to adjust for population stratification. The characteristics of gap probes reflect potentially informative biology. QC pipelines may benefit from an efficient data-driven approach that "flags" gap probes, rather than filtering such probes, followed by careful interpretation of downstream association analyses. Our results should translate directly to the recently released Illumina EPIC array given the similar chemistry and content design.
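
    A simple illustration of the gap-hunting idea: flag a probe when its sorted methylation values contain a jump larger than a chosen threshold that splits the samples into sufficiently large groups; the gap threshold and minimum group size below are illustrative, not the published criteria.

        import numpy as np

        def is_gap_probe(betas, gap=0.05, min_group=3):
            """Flag a probe whose values form >= 2 groups separated by gaps."""
            s = np.sort(betas)
            cuts = np.flatnonzero(np.diff(s) > gap)
            if len(cuts) == 0:
                return False
            sizes = np.diff(np.r_[0, cuts + 1, len(s)])   # group sizes between gaps
            return bool(np.all(sizes >= min_group))

        rng = np.random.default_rng(0)
        trimodal = np.concatenate([rng.normal(0.1, 0.01, 300),
                                   rng.normal(0.5, 0.01, 200),
                                   rng.normal(0.9, 0.01, 90)])
        print(is_gap_probe(trimodal))                  # True: three well-separated clusters
        print(is_gap_probe(rng.uniform(0, 1, 590)))    # False: no gaps of this size expected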

  4. CA II TRIPLET SPECTROSCOPY OF SMALL MAGELLANIC CLOUD RED GIANTS. III. ABUNDANCES AND VELOCITIES FOR A SAMPLE OF 14 CLUSTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parisi, M. C.; Clariá, J. J.; Marcionni, N.

    2015-05-15

    We obtained spectra of red giants in 15 Small Magellanic Cloud (SMC) clusters in the region of the Ca ii lines with FORS2 on the Very Large Telescope. We determined the mean metallicity and radial velocity with mean errors of 0.05 dex and 2.6 km s^-1, respectively, from a mean of 6.5 members per cluster. One cluster (B113) was too young for a reliable metallicity determination and was excluded from the sample. We combined the sample studied here with 15 clusters previously studied by us using the same technique, and with 7 clusters whose metallicities determined by other authors are on a scale similar to ours. This compilation of 36 clusters is the largest SMC cluster sample currently available with accurate and homogeneously determined metallicities. We found a high probability that the metallicity distribution is bimodal, with potential peaks at −1.1 and −0.8 dex. Our data show no strong evidence of a metallicity gradient in the SMC clusters, somewhat at odds with recent evidence from Ca ii triplet spectra of a large sample of field stars. This may be revealing possible differences in the chemical history of clusters and field stars. Our clusters show a significant dispersion of metallicities, whatever age is considered, which could be reflecting the lack of a unique age–metallicity relation in this galaxy. None of the chemical evolution models currently available in the literature satisfactorily represents the global chemical enrichment processes of SMC clusters.

  5. Machine learning prediction for classification of outcomes in local minimisation

    NASA Astrophysics Data System (ADS)

    Das, Ritankar; Wales, David J.

    2017-01-01

    Machine learning schemes are employed to predict which local minimum will result from local energy minimisation of random starting configurations for a triatomic cluster. The input data consists of structural information at one or more of the configurations in optimisation sequences that converge to one of four distinct local minima. The ability to make reliable predictions, in terms of the energy or other properties of interest, could save significant computational resources in sampling procedures that involve systematic geometry optimisation. Results are compared for two energy minimisation schemes, and for neural network and quadratic functions of the inputs.

  6. Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

    NASA Astrophysics Data System (ADS)

    Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

    2018-05-01

    We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples, a lower redshift sample of ˜30 clusters ranging in redshift from z ˜ 0.2-0.6 observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes and a z ˜ 1 sample of ˜10 clusters observed with 8-m class telescopes. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σz ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ˜5 per cent of clusters show some (minor) contamination from secondary systems with the same red-sequence intruding into the measurement aperture of the original cluster. At z ˜ 1, the rate rises to ˜20 per cent. Approximately ten per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ˜ 0.05.

  7. Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review.

    PubMed

    Kristunas, Caroline; Morris, Tom; Gray, Laura

    2017-11-15

    To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Trials from any setting, not limited to healthcare, were eligible, covering any SW-CRT published up to March 2016. The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22-0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of that which had been assumed. Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
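
    The coefficient of variation in cluster size used above is simply the standard deviation of the cluster sizes divided by their mean. As a minimal illustration only (the cluster sizes below are hypothetical, not taken from the review), it can be computed as:

        import numpy as np

        def cluster_size_cv(sizes):
            """Coefficient of variation: standard deviation of cluster sizes over their mean."""
            sizes = np.asarray(sizes, dtype=float)
            return sizes.std(ddof=1) / sizes.mean()

        # Hypothetical cluster sizes for a ten-cluster SW-CRT (illustrative values only).
        print(round(cluster_size_cv([12, 30, 18, 45, 22, 60, 15, 28, 33, 40]), 2))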

  8. An imbalance in cluster sizes does not lead to notable loss of power in cross-sectional, stepped-wedge cluster randomised trials with a continuous outcome.

    PubMed

    Kristunas, Caroline A; Smith, Karen L; Gray, Laura J

    2017-03-07

    The current methodology for sample size calculations for stepped-wedge cluster randomised trials (SW-CRTs) is based on the assumption of equal cluster sizes. However, as is often the case in cluster randomised trials (CRTs), the clusters in SW-CRTs are likely to vary in size, which in other designs of CRT leads to a reduction in power. The effect of an imbalance in cluster size on the power of SW-CRTs has not previously been reported, nor what an appropriate adjustment to the sample size calculation should be to allow for any imbalance. We aimed to assess the impact of an imbalance in cluster size on the power of a cross-sectional SW-CRT and recommend a method for calculating the sample size of a SW-CRT when there is an imbalance in cluster size. The effect of varying degrees of imbalance in cluster size on the power of SW-CRTs was investigated using simulations. The sample size was calculated using both the standard method and two proposed adjusted design effects (DEs), based on those suggested for CRTs with unequal cluster sizes. The data were analysed using generalised estimating equations with an exchangeable correlation matrix and robust standard errors. An imbalance in cluster size was not found to have a notable effect on the power of SW-CRTs. The two proposed adjusted DEs resulted in trials that were generally considerably over-powered. We recommend that the standard method of sample size calculation for SW-CRTs be used, provided that the assumptions of the method hold. However, it would be beneficial to investigate, through simulation, what effect the maximum likely amount of inequality in cluster sizes would be on the power of the trial and whether any inflation of the sample size would be required.
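
    The abstract does not spell out the two adjusted design effects that were tested; a commonly cited adjustment for parallel CRTs with unequal cluster sizes inflates the standard design effect using the CV of cluster sizes, so the sketch below shows that form purely for orientation (it is an assumption, not necessarily the formula used in the paper):

        def design_effect_equal(mean_size, icc):
            """Standard design effect for a parallel CRT with equal cluster sizes."""
            return 1 + (mean_size - 1) * icc

        def design_effect_unequal(mean_size, icc, cv):
            """CV-adjusted design effect often quoted for unequal cluster sizes:
            DE = 1 + ((cv**2 + 1) * mean_size - 1) * icc."""
            return 1 + ((cv**2 + 1) * mean_size - 1) * icc

        mean_size, icc, cv = 30, 0.05, 0.4                       # hypothetical values
        print(round(design_effect_equal(mean_size, icc), 2))     # 2.45
        print(round(design_effect_unequal(mean_size, icc, cv), 2))  # 2.69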

  9. Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2008-01-01

    Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise) are included in the model. Furthermore, the distribution of the random noise greatly impacts the…

  10. Twin related domains in 3D microstructures of conventionally processed and grain boundary engineered materials

    DOE PAGES

    Lind, Jonathan; Li, Shiu Fai; Kumar, Mukul

    2016-05-20

    The concept of twin-limited microstructures has been explored in the literature as a crystallographically constrained grain boundary network connected via only coincident site lattice (CSL) boundaries. The advent of orientation imaging has made classification of twin-related domains (TRD) or any other orientation cluster experimentally accessible in 2D using EBSD. With the emergence of 3D orientation mapping, a comparison of TRDs in measured 3D microstructures is performed in this paper and compared against their 2D counterparts. The TRD analysis is performed on a conventionally processed (CP) and a grain boundary engineered (EM) high purity copper sample that have been subjected to successive anneal procedures to promote grain growth. Finally, the EM sample shows extremely large TRDs which begin to approach that of a twin-limited microstructure, while the TRDs in the CP sample remain relatively small and remote.

  11. X-ray and optical substructures of the DAFT/FADA survey clusters

    NASA Astrophysics Data System (ADS)

    Guennou, L.; Durret, F.; Adami, C.; Lima Neto, G. B.

    2013-04-01

    We have undertaken the DAFT/FADA survey with the double aim of setting constraints on dark energy based on weak lensing tomography and of obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range 0.4-0.9 for which there were HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. Out of these, a spatial analysis was possible for 30 clusters, but only 23 had deep enough X-ray data for a really robust analysis. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. Altogether, the X-ray sample of 23 clusters and the optical sample of 26 clusters have 14 clusters in common. We present preliminary results on the coupled X-ray and dynamical analyses of these 14 clusters.

  12. Non-specific filtering of beta-distributed data.

    PubMed

    Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori; Groshen, Susan; Siegmund, Kimberly D

    2014-06-19

    Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis. In DNA methylation studies, DNA methylation is measured as a proportion, bounded between 0 and 1, with variance a function of the mean. Filtering on standard deviation biases the selection of probes to those with mean values near 0.5. We explore the effect this has on clustering, and develop alternate filter methods that utilize a variance stabilizing transformation for Beta distributed data and do not share this bias. We compared results for 11 different non-specific filters on eight Infinium HumanMethylation data sets, selected to span a variety of biological conditions. We found that for data sets having a small fraction of samples showing abnormal methylation of a subset of normally unmethylated CpGs, a characteristic of the CpG island methylator phenotype in cancer, a novel filter statistic that utilized a variance-stabilizing transformation for Beta distributed data outperformed the common filter of using standard deviation of the DNA methylation proportion, or its log-transformed M-value, in its ability to detect the cancer subtype in a cluster analysis. However, the standard deviation filter always performed among the best for distinguishing subgroups of normal tissue. The novel filter and standard deviation filter tended to favour features in different genome contexts; for the same data set, the novel filter always selected more features from CpG island promoters and the standard deviation filter always selected more features from non-CpG island intergenic regions. Interestingly, despite selecting largely non-overlapping sets of features, the two filters did find sample subsets that overlapped for some real data sets. We found two different filter statistics that tended to prioritize features with different characteristics, each performed well for identifying clusters of cancer and non-cancer tissue, and identifying a cancer CpG island hypermethylation phenotype. Since cluster analysis is for discovery, we would suggest trying both filters on any new data sets, evaluating the overlap of features selected and clusters discovered.
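
    The abstract does not name the variance-stabilizing transformation used in the novel filter; the arcsine-square-root transform is the classical variance stabilizer for proportions, so the sketch below uses it as a stand-in to contrast the two filtering strategies on simulated Beta-distributed values:

        import numpy as np

        def top_k_by_sd(values, k, transform=None):
            """Rank features (rows of a features-by-samples matrix) by standard
            deviation, optionally after applying a transformation first."""
            x = values if transform is None else transform(values)
            sd = x.std(axis=1, ddof=1)
            return set(np.argsort(sd)[::-1][:k])

        arcsine_sqrt = lambda b: np.arcsin(np.sqrt(b))      # variance stabilizer for proportions

        rng = np.random.default_rng(0)
        beta = rng.beta(0.5, 0.5, size=(1000, 40))          # hypothetical methylation matrix

        sd_filter = top_k_by_sd(beta, 100)
        vst_filter = top_k_by_sd(beta, 100, transform=arcsine_sqrt)
        print("features selected by both filters:", len(sd_filter & vst_filter))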

  13. Procedure of Partitioning Data Into Number of Data Sets or Data Group - A Review

    NASA Astrophysics Data System (ADS)

    Kim, Tai-Hoon

    The goal of clustering is to decompose a dataset into similar groups based on an objective function. Several well-established clustering algorithms already exist for this task. The objective of these algorithms is to divide the data points of the feature space into a number of groups (or classes) so that a predefined set of criteria is satisfied. The article presents a comparative study of the effectiveness and efficiency of traditional data clustering algorithms, using the Minkowski score to evaluate their performance on different data sets.
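
    The Minkowski score mentioned above is usually defined through pair counting against a reference partition: with n11 the pairs grouped together in both partitions, n01 the pairs together only in the reference, and n10 the pairs together only in the evaluated clustering, the score is sqrt((n01 + n10) / (n11 + n01)), with lower values indicating better agreement. A sketch of that common definition (the article's exact formulation is not given in the abstract):

        from itertools import combinations
        from math import sqrt

        def minkowski_score(reference, clustering):
            """Minkowski score via pair counting; 0 means perfect agreement."""
            n11 = n01 = n10 = 0
            for i, j in combinations(range(len(reference)), 2):
                same_ref = reference[i] == reference[j]
                same_clu = clustering[i] == clustering[j]
                if same_ref and same_clu:
                    n11 += 1
                elif same_ref:
                    n01 += 1
                elif same_clu:
                    n10 += 1
            return sqrt((n01 + n10) / (n11 + n01))

        print(minkowski_score([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]))  # about 0.82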

  14. Multivariate Analysis of the Visual Information Processing of Numbers

    ERIC Educational Resources Information Center

    Levine, David M.

    1977-01-01

    Nonmetric multidimensional scaling and hierarchical clustering procedures are applied to a confusion matrix of numerals. Two dimensions were interpreted: straight versus curved, and locus of curvature. Four major clusters of numerals were developed. (Author/JKS)

  15. An improved initialization center k-means clustering algorithm based on distance and density

    NASA Astrophysics Data System (ADS)

    Duan, Yanling; Liu, Qun; Xia, Shuyin

    2018-04-01

    To address the problem that the random initial cluster centers of the k-means algorithm leave the clustering results sensitive to outlier samples and unstable across repeated runs, an initialization method that selects centers with larger distance and higher density is proposed. The reciprocal of the weighted average distance is used to represent the sample density, and data samples with both larger distance and higher density are selected as the initial cluster centers to optimize the clustering results. A clustering evaluation method based on distance and density is then designed to verify the feasibility and practicality of the algorithm; experimental results on UCI data sets show that the algorithm has a certain stability and practicality.
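
    A rough reading of this initialization is: score every sample by a local density (the reciprocal of an average distance to its neighbours), take the densest sample as the first center, and then repeatedly add the sample that maximizes the product of its density and its distance to the centers already chosen. The exact weighting used by the authors is not given in the abstract, so the sketch below uses a plain nearest-neighbour average as a stand-in:

        import numpy as np

        def density_distance_init(X, k, n_neighbors=10):
            """Pick k initial centers: the densest sample first, then samples that
            maximize (distance to nearest chosen center) * (local density)."""
            d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
            knn = np.sort(d, axis=1)[:, 1:n_neighbors + 1]    # nearest-neighbour distances
            density = 1.0 / knn.mean(axis=1)                  # reciprocal of average distance
            centers = [int(np.argmax(density))]
            while len(centers) < k:
                dist_to_centers = d[:, centers].min(axis=1)
                centers.append(int(np.argmax(dist_to_centers * density)))
            return X[centers]

        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for m in (0.0, 3.0, 6.0)])
        print(density_distance_init(X, 3))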

  16. Effects of rare earth doping on multi-core iron oxide nanoparticles properties

    NASA Astrophysics Data System (ADS)

    Petran, Anca; Radu, Teodora; Borodi, Gheorghe; Nan, Alexandrina; Suciu, Maria; Turcu, Rodica

    2018-01-01

    New multi-core iron oxide magnetic nanoparticles doped with rare earth metals (Gd, Eu) were obtained by a one step synthesis procedure using a solvothermal method for potential biomedical applications. The obtained clusters were characterized by X-ray diffraction (XRD), transmission electron microscopy (TEM), energy-dispersive X-ray microanalysis (EDX), X-ray photoelectron spectroscopy (XPS) and magnetization measurements. They possess high colloidal stability, a saturation magnetization of up to 52 emu/g, and nearly spherical shape. The presence of rare earth ions in the obtained samples was confirmed by EDX and XPS. XRD analysis proved the homogeneous distribution of the trivalent rare earth ions in the inverse-spinel structure of magnetite and the increase of crystal strain upon doping the samples. XPS study reveals the valence state and the cation distribution on the octahedral and tetrahedral sites of the analysed samples. The observed shift of the XPS valence band spectra maximum in the direction of higher binding energies after rare earth doping, as well as theoretical valence band calculations prove the presence of Gd and Eu ions in octahedral sites. The blood protein adsorption ability of the obtained samples surface, the most important factor of the interaction between biomaterials and body fluids, was assessed by interaction with bovine serum albumin (BSA). The surface of the rare earth doped clusters shows higher affinity for binding BSA. In vitro cytotoxicity test results for the studied samples showed no cytotoxicity in low and medium doses, establishing a potential perspective for rare earth doped MNC to facilitate multiple therapies in a single formulation for cancer theranostics.

  17. False Discovery Control in Large-Scale Spatial Multiple Testing

    PubMed Central

    Sun, Wenguang; Reich, Brian J.; Cai, T. Tony; Guindani, Michele; Schwartzman, Armin

    2014-01-01

    Summary This article develops a unified theoretical and computational framework for false discovery control in multiple testing of spatial signals. We consider both point-wise and cluster-wise spatial analyses, and derive oracle procedures which optimally control the false discovery rate, false discovery exceedance and false cluster rate, respectively. A data-driven finite approximation strategy is developed to mimic the oracle procedures on a continuous spatial domain. Our multiple testing procedures are asymptotically valid and can be effectively implemented using Bayesian computational algorithms for analysis of large spatial data sets. Numerical results show that the proposed procedures lead to more accurate error control and better power performance than conventional methods. We demonstrate our methods for analyzing the time trends in tropospheric ozone in eastern US. PMID:25642138
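
    The oracle and data-driven procedures described above cannot be reproduced from the abstract alone; for orientation, the conventional point-wise baseline that false discovery rate methods are usually compared against is the Benjamini-Hochberg step-up procedure, sketched here on hypothetical p-values:

        import numpy as np

        def benjamini_hochberg(p_values, alpha=0.05):
            """Return a boolean mask of hypotheses rejected under BH FDR control."""
            p = np.asarray(p_values, dtype=float)
            m = len(p)
            order = np.argsort(p)
            passes = p[order] <= alpha * np.arange(1, m + 1) / m
            rejected = np.zeros(m, dtype=bool)
            if passes.any():
                k = np.nonzero(passes)[0].max()       # largest index meeting its threshold
                rejected[order[:k + 1]] = True
            return rejected

        print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.27, 0.60]))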

  18. Weak lensing magnification of SpARCS galaxy clusters

    NASA Astrophysics Data System (ADS)

    Tudorica, A.; Hildebrandt, H.; Tewes, M.; Hoekstra, H.; Morrison, C. B.; Muzzin, A.; Wilson, G.; Yee, H. K. C.; Lidman, C.; Hicks, A.; Nantais, J.; Erben, T.; van der Burg, R. F. J.; Demarco, R.

    2017-12-01

    Context. Measuring and calibrating relations between cluster observables is critical for resource-limited studies. The mass-richness relation of clusters offers an observationally inexpensive way of estimating masses. Its calibration is essential for cluster and cosmological studies, especially for high-redshift clusters. Weak gravitational lensing magnification is a promising and complementary method to shear studies, that can be applied at higher redshifts. Aims: We aim to employ the weak lensing magnification method to calibrate the mass-richness relation up to a redshift of 1.4. We used the Spitzer Adaptation of the Red-Sequence Cluster Survey (SpARCS) galaxy cluster candidates (0.2 < z < 1.4) and optical data from the Canada France Hawaii Telescope (CFHT) to test whether magnification can be effectively used to constrain the mass of high-redshift clusters. Methods: Lyman-break galaxies (LBGs) selected using the u-band dropout technique and their colours were used as a background sample of sources. LBG positions were cross-correlated with the centres of the sample of SpARCS clusters to estimate the magnification signal, which was optimally-weighted using an externally-calibrated LBG luminosity function. The signal was measured for cluster sub-samples, binned in both redshift and richness. Results: We measured the cross-correlation between the positions of galaxy cluster candidates and LBGs and detected a weak lensing magnification signal for all bins at a detection significance of 2.6-5.5σ. In particular, the significance of the measurement for clusters with z> 1.0 is 4.1σ; for the entire cluster sample we obtained an average M200 of 1.28 -0.21+0.23 × 1014 M⊙. Conclusions: Our measurements demonstrated the feasibility of using weak lensing magnification as a viable tool for determining the average halo masses for samples of high redshift galaxy clusters. The results also established the success of using galaxy over-densities to select massive clusters at z > 1. Additional studies are necessary for further modelling of the various systematic effects we discussed.

  19. High Prevalence of Intermediate Leptospira spp. DNA in Febrile Humans from Urban and Rural Ecuador.

    PubMed

    Chiriboga, Jorge; Barragan, Verónica; Arroyo, Gabriela; Sosa, Andrea; Birdsell, Dawn N; España, Karool; Mora, Ana; Espín, Emilia; Mejía, María Eugenia; Morales, Melba; Pinargote, Carmina; Gonzalez, Manuel; Hartskeerl, Rudy; Keim, Paul; Bretas, Gustavo; Eisenberg, Joseph N S; Trueba, Gabriel

    2015-12-01

    Leptospira spp., which comprise 3 clusters (pathogenic, saprophytic, and intermediate) that vary in pathogenicity, infect >1 million persons worldwide each year. The disease burden of the intermediate leptospires is unclear. To increase knowledge of this cluster, we used new molecular approaches to characterize Leptospira spp. in 464 samples from febrile patients in rural, semiurban, and urban communities in Ecuador; in 20 samples from nonfebrile persons in the rural community; and in 206 samples from animals in the semiurban community. We observed a higher percentage of leptospiral DNA-positive samples from febrile persons in rural (64%) versus urban (21%) and semiurban (25%) communities; no leptospires were detected in nonfebrile persons. The percentage of intermediate cluster strains in humans (96%) was higher than that of pathogenic cluster strains (4%); strains in animal samples belonged to intermediate (49%) and pathogenic (51%) clusters. Intermediate cluster strains may be causing a substantial amount of fever in coastal Ecuador.

  20. High Prevalence of Intermediate Leptospira spp. DNA in Febrile Humans from Urban and Rural Ecuador

    PubMed Central

    Chiriboga, Jorge; Barragan, Verónica; Arroyo, Gabriela; Sosa, Andrea; Birdsell, Dawn N.; España, Karool; Mora, Ana; Espín, Emilia; Mejía, María Eugenia; Morales, Melba; Pinargote, Carmina; Gonzalez, Manuel; Hartskeerl, Rudy; Keim, Paul; Bretas, Gustavo; Eisenberg, Joseph N.S.

    2015-01-01

    Leptospira spp., which comprise 3 clusters (pathogenic, saprophytic, and intermediate) that vary in pathogenicity, infect >1 million persons worldwide each year. The disease burden of the intermediate leptospires is unclear. To increase knowledge of this cluster, we used new molecular approaches to characterize Leptospira spp. in 464 samples from febrile patients in rural, semiurban, and urban communities in Ecuador; in 20 samples from nonfebrile persons in the rural community; and in 206 samples from animals in the semiurban community. We observed a higher percentage of leptospiral DNA–positive samples from febrile persons in rural (64%) versus urban (21%) and semiurban (25%) communities; no leptospires were detected in nonfebrile persons. The percentage of intermediate cluster strains in humans (96%) was higher than that of pathogenic cluster strains (4%); strains in animal samples belonged to intermediate (49%) and pathogenic (51%) clusters. Intermediate cluster strains may be causing a substantial amount of fever in coastal Ecuador. PMID:26583534

  1. Spectral characteristics and the extent of paleosols of the Palouse formation

    NASA Technical Reports Server (NTRS)

    Frazier, B. E.; Busacca, Alan; Cheng, Yaan; Wherry, David; Hart, Judy; Gill, Steve

    1987-01-01

    Thematic mapping data was analyzed and verified by comparison to previously gathered transect samples and to aerial photographs. A bare-soil field with exposed paleosols characterized by slight enrichment of iron was investigated. Spectral relationships were first investigated statistically by creating a data set with DN values spatially matched as nearly as possible to field sample points. Chemical data for each point included organic carbon, free iron oxide, and amorphous iron content. The chemical data, DN values, and various band ratios were examined with the program package Statistix in order to find the combinations of reflectance data most likely to show a relationship which would dependably separate the exposed paleosols from the other soils. Cluster analysis and Fastclas classification procedures were applied to the most promising of the band ratio combinations.

  2. Active learning for semi-supervised clustering based on locally linear propagation reconstruction.

    PubMed

    Chang, Chin-Chun; Lin, Po-Yi

    2015-03-01

    The success of semi-supervised clustering relies on the effectiveness of side information. To get effective side information, a new active learner learning pairwise constraints known as must-link and cannot-link constraints is proposed in this paper. Three novel techniques are developed for learning effective pairwise constraints. The first technique is used to identify samples less important to cluster structures. This technique makes use of a kernel version of locally linear embedding for manifold learning. Samples neither important to locally linear propagation reconstructions of other samples nor on flat patches in the learned manifold are regarded as unimportant samples. The second is a novel criterion for query selection. This criterion considers not only the importance of a sample to expanding the space coverage of the learned samples but also the expected number of queries needed to learn the sample. To facilitate semi-supervised clustering, the third technique yields inferred must-links for passing information about flat patches in the learned manifold to semi-supervised clustering algorithms. Experimental results have shown that the learned pairwise constraints can capture the underlying cluster structures and proven the feasibility of the proposed approach. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Subtypes of Personality and 'Locus of Control' in Bariatric Patients and their Effect on Weight Loss, Eating Disorder and Depressive Symptoms, and Quality of Life.

    PubMed

    Peterhänsel, Carolin; Linde, Katja; Wagner, Birgit; Dietrich, Arne; Kersting, Anette

    2017-09-01

    The present study subdivided personality types in a bariatric sample and investigated their impact on weight loss and psychopathology 6 and 12 months after surgery. One hundred thirty participants answered questionnaires on personality (NEO-FFI), 'locus of control' (IPC), depression severity (BDI-II), eating disorder psychopathology (EDE-Q), and health-related quality of life (HRQoL; SF-12). K-means cluster analyses were used to identify subtypes. Two subtypes emerged: an 'emotionally dysregulated/undercontrolled' cluster defined by high neuroticism and external orientation and a 'resilient/high functioning' cluster with the reverse pattern. Prior to surgery, the first subtype reported more eating disorder and depressive symptoms and less HRQoL. Differences in depression and mental HRQoL persisted until 12 months after surgery, whereas differences in weight loss and eating disorder symptoms did not. Personality seems to influence the improvement or maintenance of psychiatric symptoms after bariatric surgery. Future research could elucidate whether adapted treatment programmes could have an influence on the improvement of procedure outcomes. Copyright © 2017 John Wiley & Sons, Ltd and Eating Disorders Association.

  4. Multiple-locus variable number of tandem repeat analysis as a tool for molecular epidemiology of botulism: The Italian experience.

    PubMed

    Anniballi, Fabrizio; Fillo, Silvia; Giordani, Francesco; Auricchio, Bruna; Tehran, Domenico Azarnia; di Stefano, Enrica; Mandarino, Giuseppina; De Medici, Dario; Lista, Florigio

    2016-12-01

    Clostridium botulinum is the bacterial agent of botulism, a rare but severe neuro-paralytic disease. Because of its high impact, in Italy botulism is monitored by an ad hoc surveillance system. The National Reference Centre for Botulism, as part of this system, collects and analyzes all demographic, epidemiologic, microbiological, and molecular data recovered during cases and/or outbreaks that occurred in Italy. A panel of 312 C. botulinum strains belonging to group I was submitted to MLVA sub-typing. Strains, isolated from clinical specimens, food and environmental samples collected during the surveillance activities, were representative of all forms of botulism from all Italian regions. Through clustering analysis isolates were grouped into 12 main clusters. No regional or temporal clustering was detected, demonstrating the high heterogeneity of strains circulating in Italy. This study confirmed that MLVA is capable of sub-typing C. botulinum strains. Moreover, MLVA is effective at tracing and tracking the source of contamination and is helpful for the surveillance system in terms of planning and upgrading of procedures, activities and data collection forms. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. The relationships between electricity consumption and GDP in Asian countries, using hierarchical structure methods

    NASA Astrophysics Data System (ADS)

    Kantar, Ersin; Keskin, Mustafa

    2013-11-01

    This study uses hierarchical structure methods (minimal spanning tree (MST) and hierarchical tree (HT)) to examine the relationship between energy consumption and economic growth in a sample of 30 Asian countries covering the period 1971-2008. These countries are categorized into four panels based on the World Bank income classification, namely high, upper middle, lower middle, and low income. In particular, we use the data of electricity consumption and real gross domestic product (GDP) per capita to detect the topological properties of the countries. We show a relationship between electricity consumption and economic growth by using the MST and HT. We also use the bootstrap technique to assess the statistical reliability of the links of the MST. Finally, we use a clustering linkage procedure in order to observe the cluster structure. The results of the structural topologies of these trees are as follows: (i) we identified different clusters of countries according to their geographical location and economic growth, (ii) we found a strong relationship between energy consumption and economic growth for all income groups considered in this study and (iii) the results are in good agreement with the causal relationship between electricity consumption and economic growth.
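
    The MST/HT construction referred to above follows the standard recipe of converting correlations into distances, d = sqrt(2(1 - rho)), building a minimum spanning tree on the resulting distance matrix, and reading the hierarchical tree from single-linkage clustering. A sketch on randomly generated series standing in for the per-country electricity and GDP data:

        import numpy as np
        from scipy.sparse.csgraph import minimum_spanning_tree
        from scipy.cluster.hierarchy import linkage
        from scipy.spatial.distance import squareform

        rng = np.random.default_rng(2)
        series = rng.normal(size=(30, 38))            # 30 hypothetical countries, 38 annual values

        corr = np.corrcoef(series)
        dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))   # correlation-based distance
        np.fill_diagonal(dist, 0.0)

        mst = minimum_spanning_tree(dist)             # sparse matrix holding the MST edges
        tree = linkage(squareform(dist, checks=False), method="single")  # hierarchical tree
        print(mst.nnz, "MST edges;", tree.shape[0], "merge steps")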

  6. "A Richness Study of 14 Distant X-Ray Clusters from the 160 Square Degree Survey"

    NASA Technical Reports Server (NTRS)

    Jones, Christine; West, Donald (Technical Monitor)

    2001-01-01

    We have measured the surface density of galaxies toward 14 X-ray-selected cluster candidates at redshifts z(sub i) 0.46, and we show that they are associated with rich galaxy concentrations. These clusters, having X-ray luminosities of Lx(0.5-2 keV) approx. (0.5 - 2.6) x 10(exp 44) ergs/sec, are among the most distant and luminous in our 160 deg(exp 2) ROSAT Position Sensitive Proportional Counter cluster survey. We find that the clusters range between Abell richness classes 0 and 2 and have a most probable richness class of 1. We compare the richness distribution of our distant clusters to those for three samples of nearby clusters with similar X-ray luminosities. We find that the nearby and distant samples have similar richness distributions, which shows that clusters have apparently not evolved substantially in richness since redshift z=0.5. There is, however, a marginal tendency for the distant clusters to be slightly poorer than nearby clusters, although deeper multicolor data for a large sample would be required to confirm this trend. We compare the distribution of distant X-ray clusters in the L(sub X)-richness plane to the distribution of optically selected clusters from the Palomar Distant Cluster Survey. The optically selected clusters appear overly rich for their X-ray luminosities, when compared to X-ray-selected clusters. Apparently, X-ray and optical surveys do not necessarily sample identical mass concentrations at large redshifts. This may indicate the existence of a population of optically rich clusters with anomalously low X-ray emission. More likely, however, it reflects the tendency for optical surveys to select unvirialized mass concentrations, as might be expected when peering along large-scale filaments.

  7. The XXL survey XV: evidence for dry merger driven BCG growth in XXL-100-GC X-ray clusters

    NASA Astrophysics Data System (ADS)

    Lavoie, S.; Willis, J. P.; Démoclès, J.; Eckert, D.; Gastaldello, F.; Smith, G. P.; Lidman, C.; Adami, C.; Pacaud, F.; Pierre, M.; Clerc, N.; Giles, P.; Lieu, M.; Chiappetti, L.; Altieri, B.; Ardila, F.; Baldry, I.; Bongiorno, A.; Desai, S.; Elyiv, A.; Faccioli, L.; Gardner, B.; Garilli, B.; Groote, M. W.; Guennou, L.; Guzzo, L.; Hopkins, A. M.; Liske, J.; McGee, S.; Melnyk, O.; Owers, M. S.; Poggianti, B.; Ponman, T. J.; Scodeggio, M.; Spitler, L.; Tuffs, R. J.

    2016-11-01

    The growth of brightest cluster galaxies (BCGs) is closely related to the properties of their host cluster. We present evidence for dry mergers as the dominant source of BCG mass growth at z ≲ 1 in the XXL 100 brightest cluster sample. We use the global red sequence, Hα emission and mean star formation history to show that BCGs in the sample possess star formation levels comparable to field ellipticals of similar stellar mass and redshift. XXL 100 brightest clusters are less massive on average than those in other X-ray selected samples such as LoCuSS or HIFLUGCS. Few clusters in the sample display high central gas concentration, rendering inefficient the growth of BCGs via star formation resulting from the accretion of cool gas. Using measures of the relaxation state of their host clusters, we show that BCGs grow as relaxation proceeds. We find that the BCG stellar mass corresponds to a relatively constant fraction 1 per cent of the total cluster mass in relaxed systems. We also show that, following a cluster scale merger event, the BCG stellar mass lags behind the expected value from the Mcluster-MBCG relation but subsequently accretes stellar mass via dry mergers as the BCG and cluster evolve towards a relaxed state.

  8. The Atacama Cosmology Telescope: Cosmology from Galaxy Clusters Detected Via the Sunyaev-Zel'dovich Effect

    NASA Technical Reports Server (NTRS)

    Sehgal, Neelima; Trac, Hy; Acquaviva, Viviana; Ade, Peter A. R.; Aguirre, Paula; Amiri, Mandana; Appel, John W.; Barrientos, L. Felipe; Battistelli, Elia S.; Bond, J. Richard; hide

    2010-01-01

    We present constraints on cosmological parameters based on a sample of Sunyaev-Zel'dovich-selected galaxy clusters detected in a millimeter-wave survey by the Atacama Cosmology Telescope. The cluster sample used in this analysis consists of 9 optically-confirmed high-mass clusters comprising the high-significance end of the total cluster sample identified in 455 square degrees of sky surveyed during 2008 at 148 GHz. We focus on the most massive systems to reduce the degeneracy between unknown cluster astrophysics and cosmology derived from SZ surveys. We describe the scaling relation between cluster mass and SZ signal with a 4-parameter fit. Marginalizing over the values of the parameters in this fit with conservative priors gives (sigma)8 = 0.851 +/- 0.115 and w = -1.14 +/- 0.35 for a spatially-flat wCDM cosmological model with WMAP 7-year priors on cosmological parameters. This gives a modest improvement in statistical uncertainty over WMAP 7-year constraints alone. Fixing the scaling relation between cluster mass and SZ signal to a fiducial relation obtained from numerical simulations and calibrated by X-ray observations, we find (sigma)8 = 0.821 +/- 0.044 and w = -1.05 +/- 0.20. These results are consistent with constraints from WMAP 7 plus baryon acoustic oscillations plus type Ia supernova which give (sigma)8 = 0.802 +/- 0.038 and w = -0.98 +/- 0.053. A stacking analysis of the clusters in this sample compared to clusters simulated assuming the fiducial model also shows good agreement. These results suggest that, given the sample of clusters used here, both the astrophysics of massive clusters and the cosmological parameters derived from them are broadly consistent with current models.

  9. Effects of bursting dynamic features on the generation of multi-clustered structure of neural network with symmetric spike-timing-dependent plasticity learning rule.

    PubMed

    Liu, Hui; Song, Yongduan; Xue, Fangzheng; Li, Xiumin

    2015-11-01

    In this paper, the generation of multi-clustered structure of self-organized neural network with different neuronal firing patterns, i.e., bursting or spiking, has been investigated. The initially all-to-all-connected spiking neural network or bursting neural network can be self-organized into clustered structure through the symmetric spike-timing-dependent plasticity learning for both bursting and spiking neurons. However, the time consumption of this clustering procedure of the burst-based self-organized neural network (BSON) is much shorter than the spike-based self-organized neural network (SSON). Our results show that the BSON network has more obvious small-world properties, i.e., higher clustering coefficient and smaller shortest path length than the SSON network. Also, the results of larger structure entropy and activity entropy of the BSON network demonstrate that this network has higher topological complexity and dynamical diversity, which is beneficial for enhancing information transmission in neural circuits. Hence, we conclude that the burst firing can significantly enhance the efficiency of the clustering procedure and the emergent clustered structure renders the whole network more synchronous and therefore more sensitive to weak input. This result is further confirmed from its improved performance on stochastic resonance. Therefore, we believe that the multi-clustered neural network which self-organized from the bursting dynamics has high efficiency in information processing.

  10. Effects of bursting dynamic features on the generation of multi-clustered structure of neural network with symmetric spike-timing-dependent plasticity learning rule

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Hui; Song, Yongduan; Xue, Fangzheng

    In this paper, the generation of multi-clustered structure of self-organized neural network with different neuronal firing patterns, i.e., bursting or spiking, has been investigated. The initially all-to-all-connected spiking neural network or bursting neural network can be self-organized into clustered structure through the symmetric spike-timing-dependent plasticity learning for both bursting and spiking neurons. However, the time consumption of this clustering procedure of the burst-based self-organized neural network (BSON) is much shorter than the spike-based self-organized neural network (SSON). Our results show that the BSON network has more obvious small-world properties, i.e., higher clustering coefficient and smaller shortest path length than the SSON network. Also, the results of larger structure entropy and activity entropy of the BSON network demonstrate that this network has higher topological complexity and dynamical diversity, which is beneficial for enhancing information transmission in neural circuits. Hence, we conclude that the burst firing can significantly enhance the efficiency of the clustering procedure and the emergent clustered structure renders the whole network more synchronous and therefore more sensitive to weak input. This result is further confirmed from its improved performance on stochastic resonance. Therefore, we believe that the multi-clustered neural network which self-organized from the bursting dynamics has high efficiency in information processing.

  11. Uranium hydrogeochemical and stream sediment reconnaissance of the Arminto NTMS quadrangle, Wyoming, including concentrations of forty-three additional elements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morgan, T.L.

    1979-11-01

    During the summers of 1976 and 1977, 570 water and 1249 sediment samples were collected from 1517 locations within the 18,000-km^2 area of the Arminto NTMS quadrangle of central Wyoming. Water samples were collected from wells, springs, streams, and artificial ponds; sediment samples were collected from wet and dry streams, springs, and wet and dry ponds. All water samples were analyzed for 13 elements, including uranium, and each sediment sample was analyzed for 43 elements, including uranium and thorium. Uranium concentrations in water samples range from below the detection limit to 84.60 parts per billion (ppb) with a mean of 4.32 ppb. All water sample types except pond water samples were considered as a single population in interpreting the data. Pond water samples were excluded due to possible concentration of uranium by evaporation. Most of the water samples containing greater than 20 ppb uranium grouped into six clusters that indicate possible areas of interest for further investigation. One cluster is associated with the Pumpkin Buttes District, and two others are near the Kaycee and Mayoworth areas of uranium mineralization. The largest cluster is located on the west side of the Powder River Basin. One cluster is located in the central Big Horn Basin and another is in the Wind River Basin; both are in areas underlain by favorable host units. Uranium concentrations in sediment samples range from 0.08 parts per million (ppm) to 115.50 ppm with a mean of 3.50 ppm. Two clusters of sediment samples over 7 ppm were delineated. The first, containing the two highest-concentration samples, corresponds with the Copper Mountain District. Many of the high uranium concentrations in samples in this cluster may be due to contamination from mining or prospecting activity upstream from the sample sites. The second cluster encompasses a wide area in the Wind River Basin along the southern boundary of the quadrangle.

  12. Planck/SDSS Cluster Mass and Gas Scaling Relations for a Volume-Complete redMaPPer Sample

    NASA Astrophysics Data System (ADS)

    Jimeno, Pablo; Diego, Jose M.; Broadhurst, Tom; De Martino, I.; Lazkoz, Ruth

    2018-04-01

    Using Planck satellite data, we construct Sunyaev-Zel'dovich (SZ) gas pressure profiles for a large, volume-complete sample of optically selected clusters. We have defined a sample of over 8,000 redMaPPer clusters from the Sloan Digital Sky Survey (SDSS), within the volume-complete redshift region 0.100 < z < 0.325, for which we construct SZ effect maps by stacking Planck data over the full range of richness. Dividing the sample into richness bins we simultaneously solve for the mean cluster mass in each bin together with the corresponding radial pressure profile parameters, employing an MCMC analysis. These profiles are well detected over a much wider range of cluster mass and radius than previous work, showing a clear trend towards larger break radius with increasing cluster mass. Our SZ-based masses fall ˜16% below the mass-richness relations from weak lensing, in a similar fashion as the "hydrostatic bias" related with X-ray derived masses. Finally, we derive a tight Y500-M500 relation over a wide range of cluster mass, with a power law slope equal to 1.70 ± 0.07, that agrees well with the independent slope obtained by the Planck team with an SZ-selected cluster sample, but extends to lower masses with higher precision.

  13. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    PubMed

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P <0.0001) among the four clusters. The four clusters divided the sample into severity levels: Cluster 1 reflects the lowest average levels across all symptoms, and cluster 4 reflects the highest average levels. Clusters 2 and 3 capture moderate symptom levels. Clusters 2 and 3 differed mainly in profiles of anxiety and depression, with Cluster 2 having lower levels of depression and anxiety than Cluster 3, despite higher levels of pain. The results of the cluster analysis of the external sample (n = 478) looked very similar to those found in the original cluster analysis, except for a slight difference in sleep problems. This was despite having patients in the validation sample who were significantly younger (P <0.0001) and had more severe symptoms (higher FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  14. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    ERIC Educational Resources Information Center

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  15. Hausdorff clustering

    NASA Astrophysics Data System (ADS)

    Basalto, Nicolas; Bellotti, Roberto; de Carlo, Francesco; Facchi, Paolo; Pantaleo, Ester; Pascazio, Saverio

    2008-10-01

    A clustering algorithm based on the Hausdorff distance is analyzed and compared to the single, complete, and average linkage algorithms. The four clustering procedures are applied to a toy example and to the time series of financial data. The dendrograms are scrutinized and their features compared. The Hausdorff linkage relies on firm mathematical grounds and turns out to be very effective when one has to discriminate among complex structures.
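
    In this scheme the inter-cluster distance is the Hausdorff distance between the two point sets, and agglomeration repeatedly merges the closest pair of clusters under that metric. A compact sketch of the merging loop on toy two-dimensional points (not the financial series analysed in the paper):

        import numpy as np
        from scipy.spatial.distance import directed_hausdorff

        def hausdorff(a, b):
            """Symmetric Hausdorff distance between two point sets."""
            return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

        def hausdorff_clustering(points, n_clusters):
            """Agglomerative clustering that keeps merging the pair of clusters
            with the smallest Hausdorff distance until n_clusters remain."""
            clusters = [np.atleast_2d(p) for p in points]
            while len(clusters) > n_clusters:
                i, j = min(((a, b) for a in range(len(clusters))
                            for b in range(a + 1, len(clusters))),
                           key=lambda ab: hausdorff(clusters[ab[0]], clusters[ab[1]]))
                clusters[i] = np.vstack([clusters[i], clusters[j]])
                del clusters[j]
            return clusters

        rng = np.random.default_rng(3)
        pts = np.vstack([rng.normal(0, 0.2, (10, 2)), rng.normal(2, 0.2, (10, 2))])
        print([len(c) for c in hausdorff_clustering(pts, 2)])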

  16. Finding Groups Using Model-based Cluster Analysis: Heterogeneous Emotional Self-regulatory Processes and Heavy Alcohol Use Risk

    PubMed Central

    Mun, Eun-Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2010-01-01

    Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that uses the Bayesian Information Criterion (BIC) to compare multiple, possibly non-nested models and identify the optimum number of clusters. The current study clustered 36 young men and women based on their baseline heart rate (HR) and HR variability (HRV), chronic alcohol use, and reasons for drinking. Two cluster groups were identified and labeled High Alcohol Risk and Normative groups. Compared to the Normative group, individuals in the High Alcohol Risk group had higher levels of alcohol use and more strongly endorsed disinhibition and suppression reasons for use. The High Alcohol Risk group showed significant HRV changes in response to positive and negative emotional and appetitive picture cues, compared to neutral cues. In contrast, the Normative group showed a significant HRV change only to negative cues. Findings suggest that the individuals with autonomic self-regulatory difficulties may be more susceptible to heavy alcohol use and use alcohol for emotional regulation. PMID:18331138
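
    The analysis described above was carried out with finite-mixture software of the mclust kind; a rough scikit-learn analogue fits Gaussian mixtures for a range of cluster counts and keeps the one with the lowest BIC (scikit-learn reports BIC on a lower-is-better scale). The data below are simulated stand-ins, not the study's HR/HRV and drinking measures:

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(4)
        X = np.vstack([rng.normal(-0.5, 1.0, (20, 4)),     # simulated stand-in data
                       rng.normal(1.0, 0.7, (16, 4))])

        models = {k: GaussianMixture(n_components=k, covariance_type="full",
                                     random_state=0).fit(X) for k in range(1, 6)}
        bic = {k: m.bic(X) for k, m in models.items()}
        best_k = min(bic, key=bic.get)
        labels = models[best_k].predict(X)
        print("chosen number of clusters:", best_k)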

  17. Data-driven inference for the spatial scan statistic.

    PubMed

    Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

    2011-08-02

    Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is proposed, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under the null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
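
    The modified inference question can be read as conditioning the Monte Carlo p-value on the size k of the most likely cluster: the observed scan statistic is compared only against null replications whose most likely cluster also has size k. A schematic sketch of that comparison, assuming the scan statistics and cluster sizes come from an existing scan implementation:

        def size_conditioned_p_value(t_obs, k_obs, null_stats, null_sizes):
            """Monte Carlo p-value restricted to null replications whose most likely
            cluster has the same size k_obs as the observed most likely cluster."""
            matching = [t for t, k in zip(null_stats, null_sizes) if k == k_obs]
            exceed = sum(t >= t_obs for t in matching)
            return (exceed + 1) / (len(matching) + 1)

        # Hypothetical example: observed statistic 9.2 for a 7-area cluster, with
        # null_stats[i] and null_sizes[i] taken from the i-th null replication.
        # p = size_conditioned_p_value(9.2, 7, null_stats, null_sizes)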

  18. Uncertainties in the cluster-cluster correlation function

    NASA Astrophysics Data System (ADS)

    Ling, E. N.; Frenk, C. S.; Barrow, J. D.

    1986-12-01

    The bootstrap resampling technique is applied to estimate sampling errors and significance levels of the two-point correlation functions determined for a subset of the CfA redshift survey of galaxies and a redshift sample of 104 Abell clusters. The angular correlation function for a sample of 1664 Abell clusters is also calculated. The standard errors in xi(r) for the Abell data are found to be considerably larger than quoted 'Poisson errors'. The best estimate for the ratio of the correlation length of Abell clusters (richness class R greater than or equal to 1, distance class D less than or equal to 4) to that of CfA galaxies is 4.2 + 1.4 or - 1.0 (68 percentile error). The enhancement of cluster clustering over galaxy clustering is statistically significant in the presence of resampling errors. The uncertainties found do not include the effects of possible systematic biases in the galaxy and cluster catalogs and could be regarded as lower bounds on the true uncertainty range.
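
    Bootstrap resampling of this kind recomputes the statistic of interest on catalogues drawn with replacement from the original sample and reads the sampling error off the spread of the resampled values. A generic sketch, with an arbitrary statistic standing in for the correlation function itself:

        import numpy as np

        def bootstrap_error(data, statistic, n_resamples=1000, seed=0):
            """Standard error of a statistic estimated by bootstrap resampling."""
            rng = np.random.default_rng(seed)
            n = len(data)
            values = [statistic(data[rng.integers(0, n, size=n)])
                      for _ in range(n_resamples)]
            return np.std(values, ddof=1)

        positions = np.random.default_rng(5).uniform(0, 100, size=(104, 3))  # toy catalogue
        mean_separation = lambda x: np.linalg.norm(x - x.mean(axis=0), axis=1).mean()
        print(bootstrap_error(positions, mean_separation))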

  19. Health Occupations Cluster.

    ERIC Educational Resources Information Center

    Walraven, Catherine; And Others

    These instructional materials consist of a series of curriculum worksheets that cover tasks to be mastered by students in health occupations cluster programs. Covered in the curriculum worksheets are diagnostic procedures; observing/recording/reporting/planning; safety; nutrition/elimination; hygiene/personal care/comfort;…

  20. Mathematical Geology

    ERIC Educational Resources Information Center

    Merriam, Daniel F.

    1978-01-01

    Geomathematics is a developing field that is being used in practical applications. Classification is an important element and the dynamic-cluster method (DCM), a nonhierarchical procedure, was introduced this past year. A method for testing the degree of cluster distinctness was developed also. (MA)

  1. Biochemical imaging of tissues by SIMS for biomedical applications

    NASA Astrophysics Data System (ADS)

    Lee, Tae Geol; Park, Ji-Won; Shon, Hyun Kyong; Moon, Dae Won; Choi, Won Woo; Li, Kapsok; Chung, Jin Ho

    2008-12-01

    With the development of optimal surface cleaning techniques by cluster ion beam sputtering, certain applications of SIMS for analyzing cells and tissues have been actively investigated. For this report, we collaborated with bio-medical scientists to study bio-SIMS analyses of skin and cancer tissues for biomedical diagnostics. We pay close attention to the setting up of a routine procedure for preparing tissue specimens and treating the surface before obtaining the bio-SIMS data. Bio-SIMS was used to study two biosystems, skin tissues for understanding the effects of photoaging and colon cancer tissues for insight into the development of new cancer diagnostics. Time-of-flight SIMS imaging measurements were taken after surface cleaning with cluster ion bombardment by Bi_n or C_60 under varying conditions. The imaging capability of bio-SIMS with a spatial resolution of a few microns combined with principal component analysis reveals biologically meaningful information, but the lack of high molecular weight peaks even with cluster ion bombardment was a problem. This, among other problems, shows that discourse with biologists and medical doctors is critical to glean any meaningful information from SIMS mass spectrometric and imaging data. For SIMS to be accepted as a routine, daily analysis tool in biomedical laboratories, various practical sample handling methodologies such as surface matrix treatment, including nano-metal particles and metal coating, in addition to cluster sputtering, should be studied.

  2. Dynamics of cD Clusters of Galaxies. 4; Conclusion of a Survey of 25 Abell Clusters

    NASA Technical Reports Server (NTRS)

    Oegerle, William R.; Hill, John M.; Fisher, Richard R. (Technical Monitor)

    2001-01-01

    We present the final results of a spectroscopic study of a sample of cD galaxy clusters. The goal of this program has been to study the dynamics of the clusters, with emphasis on determining the nature and frequency of cD galaxies with peculiar velocities. Redshifts measured with the MX Spectrometer have been combined with those obtained from the literature to obtain typically 50 - 150 observed velocities in each of 25 galaxy clusters containing a central cD galaxy. We present a dynamical analysis of the final 11 clusters to be observed in this sample. All 25 clusters are analyzed in a uniform manner to test for the presence of substructure, and to determine peculiar velocities and their statistical significance for the central cD galaxy. These peculiar velocities were used to determine whether or not the central cD galaxy is at rest in the cluster potential well. We find that 30 - 50% of the clusters in our sample possess significant subclustering (depending on the cluster radius used in the analysis), which is in agreement with other studies of non-cD clusters. Hence, the dynamical state of cD clusters is not different than other present-day clusters. After careful study, four of the clusters appear to have a cD galaxy with a significant peculiar velocity. Dressler-Shectman tests indicate that three of these four clusters have statistically significant substructure within 1.5/h(sub 75) Mpc of the cluster center. The dispersion of the cD peculiar velocities is 164 +41/-34 km/s around the mean cluster velocity. This represents a significant detection of peculiar cD velocities, but at a level which is far below the mean velocity dispersion for this sample of clusters. The picture that emerges is one in which cD galaxies are nearly at rest with respect to the cluster potential well, but have small residual velocities due to subcluster mergers.

  3. Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

    PubMed Central

    Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

    2016-01-01

    Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969

  4. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

    PubMed

    Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

    2015-10-01

    Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
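
    Two of the three methods compared in these simulations are easy to sketch on a toy binary code matrix: hierarchical clustering on a binary dissimilarity and K-means on the raw 0/1 matrix, with agreement between the two partitions summarized by the adjusted Rand index (latent class analysis is omitted here for brevity):

        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import pdist
        from sklearn.cluster import KMeans
        from sklearn.metrics import adjusted_rand_score

        rng = np.random.default_rng(6)
        # 50 hypothetical participants x 20 binary codes drawn from two latent profiles.
        X = np.vstack([rng.binomial(1, 0.8, (25, 20)),
                       rng.binomial(1, 0.2, (25, 20))]).astype(float)

        hier = fcluster(linkage(pdist(X, metric="jaccard"), method="average"),
                        t=2, criterion="maxclust")
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print("agreement between the two partitions (ARI):", adjusted_rand_score(hier, km))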

  5. X-Ray Temperatures, Luminosities, and Masses from XMM-Newton Follow-up of the First Shear-selected Galaxy Cluster Sample

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deshpande, Amruta J.; Hughes, John P.; Wittman, David, E-mail: amrejd@physics.rutgers.edu, E-mail: jph@physics.rutgers.edu, E-mail: dwittman@physics.ucdavis.edu

    We continue the study of the first sample of shear-selected clusters from the initial 8.6 square degrees of the Deep Lens Survey (DLS); a sample with well-defined selection criteria corresponding to the highest ranked shear peaks in the survey area. We aim to characterize the weak lensing selection by examining the sample’s X-ray properties. There are multiple X-ray clusters associated with nearly all the shear peaks: 14 X-ray clusters corresponding to seven DLS shear peaks. An additional three X-ray clusters cannot be definitively associated with shear peaks, mainly due to large positional offsets between the X-ray centroid and the shear peak. Here we report on the XMM-Newton properties of the 17 X-ray clusters. The X-ray clusters display a wide range of luminosities and temperatures; the L_X − T_X relation we determine for the shear-associated X-ray clusters is consistent with X-ray cluster samples selected without regard to dynamical state, while it is inconsistent with self-similarity. For a subset of the sample, we measure X-ray masses using temperature as a proxy, and compare to weak lensing masses determined by the DLS team. The resulting mass comparison is consistent with equality. The X-ray and weak lensing masses show considerable intrinsic scatter (∼48%), which is consistent with X-ray selected samples when their X-ray and weak lensing masses are independently determined.

  6. 75 FR 16424 - Proposed Information Collection; Comment Request; Census Coverage Measurement Final Housing Unit...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-01

    ... unit is a block cluster, which consists of one or more geographically contiguous census blocks. As in... a number of distinct processes, ranging from forming block clusters, selecting the block clusters... sample of block clusters, while the E Sample is the census of housing units and enumerations in the same...

  7. Group sequential designs for stepped-wedge cluster randomised trials

    PubMed Central

    Grayling, Michael J; Wason, James MS; Mander, Adrian P

    2017-01-01

    Background/Aims: The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Methods: Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. Results: We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial’s type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. Conclusion: The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial. PMID:28653550
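
    For readers unfamiliar with the error spending approach mentioned above, the sketch below shows how a rho-family spending function allocates the overall type-I error across interim analyses at given information fractions. The functional form, alpha, and fractions are illustrative assumptions, not the authors' stepped-wedge design.

    ```python
    # Minimal sketch of error spending across interim looks (illustrative values).
    import numpy as np

    def error_spent(info_fractions, alpha=0.05, rho=2.0):
        """Cumulative and per-look alpha spent under f(t) = alpha * t**rho."""
        t = np.asarray(info_fractions, dtype=float)
        cumulative = alpha * t**rho
        per_look = np.diff(np.concatenate(([0.0], cumulative)))
        return cumulative, per_look

    cumulative, per_look = error_spent([0.25, 0.5, 0.75, 1.0])
    print("cumulative alpha spent:", np.round(cumulative, 4))
    print("alpha spent per look:  ", np.round(per_look, 4))
    ```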

  8. Group sequential designs for stepped-wedge cluster randomised trials.

    PubMed

    Grayling, Michael J; Wason, James Ms; Mander, Adrian P

    2017-10-01

    The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial's type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial.

  9. The Effectiveness of Two Grammar Treatment Procedures for Children with SLI: A Randomized Clinical Trial

    ERIC Educational Resources Information Center

    Smith-Lock, Karen M.; Leitão, Suze; Prior, Polly; Nickels, Lyndsey

    2015-01-01

    Purpose: This study compared the effectiveness of two grammar treatment procedures for children with specific language impairment. Method: A double-blind superiority trial with cluster randomization was used to compare a cueing procedure, designed to elicit a correct production following an initial error, to a recasting procedure, which required…

  10. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali.

    PubMed

    Minetti, Andrea; Riera-Montes, Margarita; Nackers, Fabienne; Roederer, Thomas; Koudika, Marie Hortense; Sekkenes, Johanne; Taconet, Aurore; Fermon, Florence; Touré, Albouhary; Grais, Rebecca F; Checchi, Francesco

    2012-10-12

    Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local VC, using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes.
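
    A minimal sketch of the kind of bootstrap comparison described above, shrinking the per-cluster sample from 15 towards 3 and watching the standard error of the coverage estimate grow, is given below. The cluster count, coverage values, and resampling scheme are hypothetical, not the Mali survey data or the study code.

    ```python
    # Sketch: how the SE of a vaccination-coverage (VC) estimate behaves as the
    # per-cluster sample shrinks (hypothetical 10-cluster survey).
    import numpy as np

    rng = np.random.default_rng(1)
    n_clusters, full_size = 10, 15
    cluster_vc = rng.uniform(0.6, 0.95, size=n_clusters)                 # true coverage per cluster
    clusters = [rng.binomial(1, p, size=full_size) for p in cluster_vc]  # 0/1 vaccination status

    def bootstrap_se(clusters, per_cluster, n_boot=2000):
        """SE of the overall VC estimate when only `per_cluster` children are kept."""
        estimates = []
        for _ in range(n_boot):
            chosen = rng.integers(0, len(clusters), size=len(clusters))  # resample clusters
            sub = [rng.choice(clusters[c], size=per_cluster, replace=False) for c in chosen]
            estimates.append(np.mean(np.concatenate(sub)))
        return np.std(estimates, ddof=1)

    for m in (15, 9, 6, 3):
        print(f"10 x {m:2d} design: bootstrap SE of VC = {bootstrap_se(clusters, m):.3f}")
    ```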

  11. Changes to Serum Sample Tube and Processing Methodology Does Not Cause Inter-Individual Variation in Automated Whole Serum N-Glycan Profiling in Health and Disease

    PubMed Central

    Shubhakar, Archana; Kalla, Rahul; Nimmo, Elaine R.; Fernandes, Daryl L.; Satsangi, Jack; Spencer, Daniel I. R.

    2015-01-01

    Introduction Serum N-glycans have been identified as putative biomarkers for numerous diseases. The impact of different serum sample tubes and processing methods on N-glycan analysis has received relatively little attention. This study aimed to determine the effect of different sample tubes and processing methods on the whole serum N-glycan profile in both health and disease. A secondary objective was to describe a robot automated N-glycan release, labeling and cleanup process for use in a biomarker discovery system. Methods 25 patients with active and quiescent inflammatory bowel disease and controls had three different serum sample tubes taken at the same draw. Two different processing methods were used for three types of tube (with and without gel-separation medium). Samples were randomised and processed in a blinded fashion. Whole serum N-glycan release, 2-aminobenzamide labeling and cleanup was automated using a Hamilton Microlab STARlet Liquid Handling robot. Samples were analysed using a hydrophilic interaction liquid chromatography/ethylene bridged hybrid(BEH) column on an ultra-high performance liquid chromatography instrument. Data were analysed quantitatively by pairwise correlation and hierarchical clustering using the area under each chromatogram peak. Qualitatively, a blinded assessor attempted to match chromatograms to each individual. Results There was small intra-individual variation in serum N-glycan profiles from samples collected using different sample processing methods. Intra-individual correlation coefficients were between 0.99 and 1. Unsupervised hierarchical clustering and principal coordinate analyses accurately matched samples from the same individual. Qualitative analysis demonstrated good chromatogram overlay and a blinded assessor was able to accurately match individuals based on chromatogram profile, regardless of disease status. Conclusions The three different serum sample tubes processed using the described methods cause minimal inter-individual variation in serum whole N-glycan profile when processed using an automated workstream. This has important implications for N-glycan biomarker discovery studies using different serum processing standard operating procedures. PMID:25831126

  12. Changes to serum sample tube and processing methodology does not cause Intra-Individual [corrected] variation in automated whole serum N-glycan profiling in health and disease.

    PubMed

    Ventham, Nicholas T; Gardner, Richard A; Kennedy, Nicholas A; Shubhakar, Archana; Kalla, Rahul; Nimmo, Elaine R; Fernandes, Daryl L; Satsangi, Jack; Spencer, Daniel I R

    2015-01-01

    Serum N-glycans have been identified as putative biomarkers for numerous diseases. The impact of different serum sample tubes and processing methods on N-glycan analysis has received relatively little attention. This study aimed to determine the effect of different sample tubes and processing methods on the whole serum N-glycan profile in both health and disease. A secondary objective was to describe a robot automated N-glycan release, labeling and cleanup process for use in a biomarker discovery system. 25 patients with active and quiescent inflammatory bowel disease and controls had three different serum sample tubes taken at the same draw. Two different processing methods were used for three types of tube (with and without gel-separation medium). Samples were randomised and processed in a blinded fashion. Whole serum N-glycan release, 2-aminobenzamide labeling and cleanup was automated using a Hamilton Microlab STARlet Liquid Handling robot. Samples were analysed using a hydrophilic interaction liquid chromatography/ethylene bridged hybrid(BEH) column on an ultra-high performance liquid chromatography instrument. Data were analysed quantitatively by pairwise correlation and hierarchical clustering using the area under each chromatogram peak. Qualitatively, a blinded assessor attempted to match chromatograms to each individual. There was small intra-individual variation in serum N-glycan profiles from samples collected using different sample processing methods. Intra-individual correlation coefficients were between 0.99 and 1. Unsupervised hierarchical clustering and principal coordinate analyses accurately matched samples from the same individual. Qualitative analysis demonstrated good chromatogram overlay and a blinded assessor was able to accurately match individuals based on chromatogram profile, regardless of disease status. The three different serum sample tubes processed using the described methods cause minimal inter-individual variation in serum whole N-glycan profile when processed using an automated workstream. This has important implications for N-glycan biomarker discovery studies using different serum processing standard operating procedures.
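
    The quantitative step described above, pairwise Pearson correlation of peak-area profiles followed by hierarchical clustering, can be sketched as below on hypothetical replicate profiles; this is not the study's analysis pipeline.

    ```python
    # Sketch: correlate and cluster chromatogram peak-area profiles (hypothetical data).
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    rng = np.random.default_rng(2)
    n_individuals, n_peaks, n_tubes = 5, 40, 3
    base = rng.gamma(2.0, 1.0, size=(n_individuals, n_peaks))        # per-person glycan profile
    profiles = np.vstack([base + rng.normal(0, 0.05, base.shape) for _ in range(n_tubes)])
    person = np.tile(np.arange(n_individuals), n_tubes)              # person index of each row

    corr = np.corrcoef(profiles)                                     # pairwise Pearson correlation
    dist = 1.0 - corr
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    groups = fcluster(Z, t=n_individuals, criterion="maxclust")      # cut the tree into 5 groups

    # Do tube replicates of the same person end up in the same group?
    matched = all(len(set(groups[person == i])) == 1 for i in range(n_individuals))
    print("replicates cluster together:", matched)
    print("lowest pairwise correlation:", round(corr[np.triu_indices_from(corr, k=1)].min(), 3))
    ```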

  13. An agglomerative hierarchical clustering approach to visualisation in Bayesian clustering problems

    PubMed Central

    Dawson, Kevin J.; Belkhir, Khalid

    2009-01-01

    Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals, the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. Since the number of possible partitions grows very rapidly with the sample size, we cannot visualise this probability distribution in its entirety, unless the sample is very small. As a solution to this visualisation problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package Partition View. The exact linkage algorithm takes the posterior co-assignment probabilities as input, and yields as output a rooted binary tree or, more generally, a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306
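
    As a rough stand-in for the exact linkage algorithm (implemented by the authors in Partition View), the sketch below feeds one minus a hypothetical posterior co-assignment matrix into ordinary average-linkage agglomerative clustering and reads node heights back as co-assignment levels.

    ```python
    # Sketch: tree from posterior co-assignment probabilities via average linkage
    # (a stand-in, not the exact linkage algorithm; the matrix is hypothetical).
    import numpy as np
    from scipy.cluster.hierarchy import linkage
    from scipy.spatial.distance import squareform

    # Posterior probability that individuals i and j belong to the same cluster.
    coassign = np.array([
        [1.00, 0.95, 0.90, 0.10, 0.05],
        [0.95, 1.00, 0.88, 0.12, 0.06],
        [0.90, 0.88, 1.00, 0.15, 0.08],
        [0.10, 0.12, 0.15, 1.00, 0.93],
        [0.05, 0.06, 0.08, 0.93, 1.00],
    ])
    dist = 1.0 - coassign
    np.fill_diagonal(dist, 0.0)

    Z = linkage(squareform(dist, checks=False), method="average")
    for left, right, height, size in Z:
        print(f"merge of nodes {int(left)} and {int(right)} at "
              f"co-assignment ≈ {1 - height:.2f} (cluster size {int(size)})")
    ```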

  14. Using data mining to segment healthcare markets from patients' preference perspectives.

    PubMed

    Liu, Sandra S; Chen, Jie

    2009-01-01

    This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.

  15. Empirical Identification of Hierarchies.

    ERIC Educational Resources Information Center

    McCormick, Douglas; And Others

    Outlining a cluster procedure which maximizes specific criteria while building scales from binary measures using a sequential, agglomerative, overlapping, non-hierarchic method results in indices giving truer results than exploratory factor analyses or multidimensional scaling. In a series of eleven figures, patterns within cluster histories…

  16. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches.

    PubMed

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2014-07-10

    In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters is usually not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.
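
    The budget-constrained logic is easiest to see in the textbook special case of a simple cluster randomised trial with a continuous outcome, sketched below. This is not the cost-effectiveness model of the paper, and the costs, ICC, and budget are assumed values.

    ```python
    # Sketch: variance-minimising cluster size under a budget for a simple CRT.
    # With cluster-level cost c, person-level cost s and intra-cluster correlation rho,
    # the optimal cluster size is m* = sqrt(c * (1 - rho) / (s * rho)) (standard result).
    import math

    def optimal_crt_design(budget, cluster_cost, person_cost, icc):
        m = math.sqrt(cluster_cost * (1 - icc) / (person_cost * icc))
        m = max(1, round(m))                                  # whole persons per cluster
        k = int(budget // (cluster_cost + m * person_cost))   # clusters the budget allows
        design_effect = 1 + (m - 1) * icc
        return m, k, design_effect

    m, k, deff = optimal_crt_design(budget=100_000, cluster_cost=500, person_cost=50, icc=0.05)
    print(f"persons per cluster: {m}, clusters: {k}, design effect: {deff:.2f}")
    ```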

  17. VizieR Online Data Catalog: LAMOST survey of star clusters in M31. II. (Chen+, 2016)

    NASA Astrophysics Data System (ADS)

    Chen, B.; Liu, X.; Xiang, M.; Yuan, H.; Huang, Y.; Shi, J.; Fan, Z.; Huo, Z.; Wang, C.; Ren, J.; Tian, Z.; Zhang, H.; Liu, G.; Cao, Z.; Zhang, Y.; Hou, Y.; Wang, Y.

    2016-09-01

    We select a sample of 306 massive star clusters observed with the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) in the vicinity fields of M31 and M33. Massive clusters in our sample are all selected from the catalog presented in Paper I (Chen et al. 2015, Cat. J/other/RAA/15.1392), including five newly discovered clusters selected with the SDSS photometry, three newly confirmed, and 298 previously known clusters from Revised Bologna Catalogue (RBC; Galleti et al. 2012, Cat. V/143; http://www.bo.astro.it/M31/). Since then another two objects, B341 and B207, have also been observed with LAMOST, and they are included in the current analysis. The current sample does not include those listed in Paper I but is selected from Johnson et al. 2012 (Cat. J/ApJ/752/95) since most of them are young but not so massive. All objects are observed with LAMOST between 2011 September and 2014 June. Table1 lists the name, position, and radial velocity of all sample clusters analyzed in the current work. The LAMOST spectra cover the wavelength range 3700-9000Å at a resolving power of R~1800. Details about the observations and data reduction can be found in Paper I. The median signal-to-noise ratio (S/N) per pixel at 4750 and 7450Å of spectra of all clusters in the current sample are, respectively, 14 and 37. Essentially all spectra have S/N(4750Å)>5 except for the spectra of 18 clusters. The latter have S/N(7540Å)>10. Peacock et al. 2010 (Cat. J/MNRAS/402/803) retrieved images of M31 star clusters and candidates from the SDSS archive and extracted ugriz aperture photometric magnitudes from those objects using the SExtractor. They present a catalog containing homogeneous ugriz photometry of 572 star clusters and 373 candidates. Among them, 299 clusters are in our sample. (2 data files).

  18. CHEERS: The chemical evolution RGS sample

    NASA Astrophysics Data System (ADS)

    de Plaa, J.; Kaastra, J. S.; Werner, N.; Pinto, C.; Kosec, P.; Zhang, Y.-Y.; Mernier, F.; Lovisari, L.; Akamatsu, H.; Schellenberger, G.; Hofmann, F.; Reiprich, T. H.; Finoguenov, A.; Ahoranta, J.; Sanders, J. S.; Fabian, A. C.; Pols, O.; Simionescu, A.; Vink, J.; Böhringer, H.

    2017-11-01

    Context. The chemical yields of supernovae and the metal enrichment of the intra-cluster medium (ICM) are not well understood. The hot gas in clusters of galaxies has been enriched with metals originating from billions of supernovae and provides a fair sample of large-scale metal enrichment in the Universe. High-resolution X-ray spectra of clusters of galaxies provide a unique way of measuring abundances in the hot intracluster medium (ICM). The abundance measurements can provide constraints on the supernova explosion mechanism and the initial-mass function of the stellar population. This paper introduces the CHEmical Enrichment RGS Sample (CHEERS), which is a sample of 44 bright local giant ellipticals, groups, and clusters of galaxies observed with XMM-Newton. Aims: The CHEERS project aims to provide the most accurate set of cluster abundances measured in X-rays using this sample. This paper focuses specifically on the abundance measurements of O and Fe using the reflection grating spectrometer (RGS) on board XMM-Newton. We aim to thoroughly discuss the cluster to cluster abundance variations and the robustness of the measurements. Methods: We have selected the CHEERS sample such that the oxygen abundance in each cluster is detected at a level of at least 5σ in the RGS. The dispersive nature of the RGS limits the sample to clusters with sharp surface brightness peaks. The deep exposures and the size of the sample allow us to quantify the intrinsic scatter and the systematic uncertainties in the abundances using spectral modeling techniques. Results: We report the oxygen and iron abundances as measured with RGS in the core regions of all 44 clusters in the sample. We do not find a significant trend of O/Fe as a function of cluster temperature, but we do find an intrinsic scatter in the O and Fe abundances from cluster to cluster. The level of systematic uncertainties in the O/Fe ratio is estimated to be around 20-30%, while the systematic uncertainties in the absolute O and Fe abundances can be as high as 50% in extreme cases. Thanks to the high statistics of the observations, we were able to identify and correct a systematic bias in the oxygen abundance determination that was due to an inaccuracy in the spectral model. Conclusions: The lack of dependence of O/Fe on temperature suggests that the enrichment of the ICM does not depend on cluster mass and that most of the enrichment likely took place before the ICM was formed. We find that the observed scatter in the O/Fe ratio is due to a combination of intrinsic scatter in the source and systematic uncertainties in the spectral fitting, which we are unable to separate. The astrophysical source of intrinsic scatter could be due to differences in active galactic nucleus activity and ongoing star formation in the brightest cluster galaxy. The systematic scatter is due to uncertainties in the spatial line broadening, absorption column, multi-temperature structure, and the thermal plasma models.

  19. Cluster Masses Derived from X-ray and Sunyaev-Zeldovich Effect Measurements

    NASA Technical Reports Server (NTRS)

    Laroque, S.; Joy, Marshall; Bonamente, M.; Carlstrom, J.; Dawson, K.

    2003-01-01

    We infer the gas mass and total gravitational mass of 11 clusters using two different methods: analysis of X-ray data from the Chandra X-ray Observatory and analysis of centimeter-wave Sunyaev-Zel'dovich Effect (SZE) data from the BIMA and OVRO interferometers. This flux-limited sample of clusters from the BCS cluster catalogue was chosen so as to be well above the surface brightness limit of the ROSAT All Sky Survey; this is therefore an orientation unbiased sample. The gas mass fraction, f_g, is calculated for each cluster using both X-ray and SZE data, and the results are compared at a fiducial radius of r_500. Comparison of the X-ray and SZE results for this orientation unbiased sample allows us to constrain cluster systematics, such as clumping of the intracluster medium. We derive an upper limit on Omega_M by assuming that the mass composition of clusters within r_500 reflects the universal mass composition: Omega_M h_100 is no greater than Omega_B / f_g. We also demonstrate how the mean f_g derived from the sample can be used to estimate the masses of clusters discovered by upcoming deep SZE surveys.
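
    The upper-limit argument is a one-line calculation once a mean gas fraction is adopted; the sketch below uses assumed values for Omega_B, h, and f_g, not the paper's measurements.

    ```python
    # Sketch: baryon-fraction upper limit on Omega_M (illustrative numbers only).
    omega_b_h2 = 0.0224      # assumed Omega_B * h^2
    h = 0.70                 # assumed dimensionless Hubble parameter
    f_gas = 0.11             # assumed mean cluster gas mass fraction at r_500

    omega_b = omega_b_h2 / h**2
    omega_m_upper = omega_b / f_gas   # the gas fraction cannot exceed the universal baryon fraction
    print(f"Omega_B = {omega_b:.3f}  ->  Omega_M <= {omega_m_upper:.2f}")
    ```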

  20. Advances in Significance Testing for Cluster Detection

    NASA Astrophysics Data System (ADS)

    Coleman, Deidra Andrea

    Over the past two decades, much attention has been given to data-driven project goals such as the Human Genome Project and the development of syndromic surveillance systems. A major component of these types of projects is analyzing the abundance of data. Detecting clusters within the data can be beneficial as it can lead to the identification of specified sequences of DNA nucleotides that are related to important biological functions or the locations of epidemics such as disease outbreaks or bioterrorism attacks. Cluster detection techniques require efficient and accurate hypothesis testing procedures. In this dissertation, we improve upon the hypothesis testing procedures for cluster detection by enhancing distributional theory and providing an alternative method for spatial cluster detection using syndromic surveillance data. In Chapter 2, we provide an efficient method to compute the exact distribution of the number and coverage of h-clumps of a collection of words. This method involves defining a Markov chain using a minimal deterministic automaton to reduce the number of states needed for computation. We allow words of the collection to contain other words of the collection, making the method more general. We use our method to compute the distributions of the number and coverage of h-clumps in the Chi motif of H. influenzae. In Chapter 3, we provide an efficient algorithm to compute the exact distribution of multiple window discrete scan statistics for higher-order, multi-state Markovian sequences. This algorithm involves defining a Markov chain to efficiently keep track of probabilities needed to compute p-values of the statistic. We use our algorithm to identify cases where the available approximation does not perform well. We also use our algorithm to detect unusual clusters of made free throw shots by National Basketball Association players during the 2009-2010 regular season. In Chapter 4, we give a procedure to detect outbreaks using syndromic surveillance data while controlling the Bayesian False Discovery Rate (BFDR). The procedure entails choosing an appropriate Bayesian model that captures the spatial dependency inherent in epidemiological data and considers all days of interest, selecting a test statistic based on a chosen measure that provides the magnitude of the maximal spatial cluster for each day, and identifying a cutoff value that controls the BFDR for rejecting the collective null hypothesis of no outbreak over a collection of days for a specified region. We use our procedure to analyze botulism-like syndrome data collected by the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT).

  1. Physical properties of star clusters in the outer LMC as observed by the DES

    DOE PAGES

    Pieres, A.; Santiago, B.; Balbinot, E.; ...

    2016-05-26

    The Large Magellanic Cloud (LMC) harbors a rich and diverse system of star clusters, whose ages, chemical abundances, and positions provide information about the LMC history of star formation. We use Science Verification imaging data from the Dark Energy Survey to increase the census of known star clusters in the outer LMC and to derive physical parameters for a large sample of such objects using a spatially and photometrically homogeneous data set. Our sample contains 255 visually identified cluster candidates, of which 109 were not listed in any previous catalog. We quantify the crowding effect for the stellar sample produced by the DES Data Management pipeline and conclude that the stellar completeness is < 10% inside typical LMC cluster cores. We therefore develop a pipeline to sample and measure stellar magnitudes and positions around the cluster candidates using DAOPHOT. We also implement a maximum-likelihood method to fit individual density profiles and colour-magnitude diagrams. For 117 (from a total of 255) of the cluster candidates (28 uncatalogued clusters), we obtain reliable ages, metallicities, distance moduli and structural parameters, confirming their nature as physical systems. The distribution of cluster metallicities shows a radial dependence, with no clusters more metal-rich than [Fe/H] ~ -0.7 beyond 8 kpc from the LMC center. Furthermore, the age distribution has two peaks at ≃ 1.2 Gyr and ≃ 2.7 Gyr.

  2. Physical properties of star clusters in the outer LMC as observed by the DES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pieres, A.; Santiago, B.; Balbinot, E.

    The Large Magellanic Cloud (LMC) harbors a rich and diverse system of star clusters, whose ages, chemical abundances, and positions provide information about the LMC history of star formation. We use Science Verification imaging data from the Dark Energy Survey to increase the census of known star clusters in the outer LMC and to derive physical parameters for a large sample of such objects using a spatially and photometrically homogeneous data set. Our sample contains 255 visually identified cluster candidates, of which 109 were not listed in any previous catalog. We quantify the crowding effect for the stellar sample produced by the DES Data Management pipeline and conclude that the stellar completeness is < 10% inside typical LMC cluster cores. We therefore develop a pipeline to sample and measure stellar magnitudes and positions around the cluster candidates using DAOPHOT. We also implement a maximum-likelihood method to fit individual density profiles and colour-magnitude diagrams. For 117 (from a total of 255) of the cluster candidates (28 uncatalogued clusters), we obtain reliable ages, metallicities, distance moduli and structural parameters, confirming their nature as physical systems. The distribution of cluster metallicities shows a radial dependence, with no clusters more metal-rich than [Fe/H] ~ -0.7 beyond 8 kpc from the LMC center. Furthermore, the age distribution has two peaks at ≃ 1.2 Gyr and ≃ 2.7 Gyr.

  3. Detailed analysis of CAMS procedures for phase 3 using ground truth inventories

    NASA Technical Reports Server (NTRS)

    Carnes, J. G.

    1979-01-01

    The results of a study of Procedure 1 as used during LACIE Phase 3 are presented. The study was performed by comparing the Procedure 1 classification results with digitized ground-truth inventories. The proportion estimation accuracy, dot labeling accuracy, and clustering effectiveness are discussed.

  4. A local search for a graph clustering problem

    NASA Astrophysics Data System (ADS)

    Navrotskaya, Anna; Il'ev, Victor

    2016-10-01

    In clustering problems, one has to partition a given set of objects (a data set) into subsets (called clusters) taking into consideration only the similarity of the objects. One of the most visual formalizations of clustering is graph clustering, that is, grouping the vertices of a graph into clusters while taking into consideration the edge structure of the graph, whose vertices are objects and whose edges represent similarities between the objects. In the graph k-clustering problem, the number of clusters does not exceed k and the goal is to minimize the number of edges between clusters and the number of missing edges within clusters. This problem is NP-hard for any k ≥ 2. We propose a polynomial-time (2k-1)-approximation algorithm for graph k-clustering. We then apply a local search procedure to the feasible solution found by this algorithm and report an experimental study of the resulting heuristics.
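
    A minimal sketch of the objective (edges between clusters plus missing edges within clusters) and of a naive vertex-move local search is given below; this is illustrative only, not the authors' (2k-1)-approximation algorithm.

    ```python
    # Sketch: graph k-clustering cost and a simple vertex-move local search (toy example).

    def cost(adj, labels):
        """Edges between clusters + non-edges within clusters (each pair counted once)."""
        nodes = list(adj)
        total = 0
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                same = labels[u] == labels[v]
                edge = v in adj[u]
                total += int(edge != same)   # penalty iff the pair disagrees with the partition
        return total

    def local_search(adj, labels, k):
        """Repeatedly move single vertices to other clusters while the cost drops."""
        improved = True
        while improved:
            improved = False
            for u in adj:
                original, best = labels[u], labels[u]
                for c in range(k):
                    labels[u] = c
                    if cost(adj, labels) < cost(adj, {**labels, u: best}):
                        best = c
                labels[u] = best
                improved |= best != original
        return labels

    # Toy graph: two triangles joined by one edge; k = 2 should separate them.
    adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
    labels = local_search(adj, {v: v % 2 for v in adj}, k=2)
    print(labels, "cost =", cost(adj, labels))
    ```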

  5. X-ray morphological study of galaxy cluster catalogues

    NASA Astrophysics Data System (ADS)

    Democles, Jessica; Pierre, Marguerite; Arnaud, Monique

    2016-07-01

    Context: The intra-cluster medium distribution, as probed by X-ray morphology-based analysis, gives a good indication of a system's dynamical state. In the race to determine precise scaling relations and to understand their scatter, the dynamical state offers valuable information. Method: We develop the centroid-shift analysis so that it can be applied to characterize galaxy cluster surveys such as the XXL survey or high-redshift cluster samples. We use it together with the surface brightness concentration parameter and the offset between the X-ray peak and the brightest cluster galaxy in the context of the XXL bright cluster sample (Pacaud et al. 2015) and a set of high-redshift massive clusters detected by Planck and SPT and observed by both the XMM-Newton and Chandra observatories. Results: Using the wide redshift coverage of the XXL sample, we see no trend between the dynamical state of the systems and redshift.

  6. Lower likelihood of cardiac procedures after acute coronary syndrome in patients with human immunodeficiency virus/acquired immunodeficiency syndrome.

    PubMed

    Clement, Meredith E; Lin, Li; Navar, Ann Marie; Okeke, Nwora Lance; Naggie, Susanna; Douglas, Pamela S

    2018-02-01

    Cardiovascular disease (CVD) is an increasing cause of morbidity and mortality in human immunodeficiency virus (HIV)-infected adults; however, this population may be less likely to receive interventions during hospitalization for acute coronary syndrome (ACS). The degree to which this disparity can be attributed to poorly controlled HIV infection is unknown.In this large cohort study, we used the National Inpatient Sample (NIS) to compare rates of cardiac procedures among patients with asymptomatic HIV-infection, symptomatic acquired immunodeficiency syndrome (AIDS), and uninfected adults hospitalized with ACS from 2009 to 2012. Multivariable analysis was used to compare procedure rates by HIV status, with appropriate weighting to account for NIS sampling design including stratification and hospital clustering.The dataset included 1,091,759 ACS hospitalizations, 0.35% of which (n = 3783) were in HIV-infected patients. Patients with symptomatic AIDS, asymptomatic HIV, and uninfected patients differed by sex, race, and income status. Overall rates of cardiac catheterization and revascularization were 53.3% and 37.4%, respectively. In multivariable regression, we found that relative to uninfected patients, those with symptomatic AIDS were less likely to undergo catheterization (odds ratio [OR] 0.48, confidence interval [CI] 0.43-0.55), percutaneous coronary intervention (OR 0.69, CI 0.59-0.79), and coronary artery bypass grafting (0.75, CI 0.61-0.93). No difference was seen for those with asymptomatic HIV relative to uninfected patients (OR 0.93, CI 0.81-1.07; OR 1.06, CI 0.93-1.21; OR 0.88, CI 0.72-1.06, respectively).We found that lower rates of cardiovascular procedures in HIV-infected patients were primarily driven by less frequent procedures in those with AIDS.

  7. Lower likelihood of cardiac procedures after acute coronary syndrome in patients with human immunodeficiency virus/acquired immunodeficiency syndrome

    PubMed Central

    Clement, Meredith E.; Lin, Li; Navar, Ann Marie; Okeke, Nwora Lance; Naggie, Susanna; Douglas, Pamela S.

    2018-01-01

    Abstract Cardiovascular disease (CVD) is an increasing cause of morbidity and mortality in human immunodeficiency virus (HIV)-infected adults; however, this population may be less likely to receive interventions during hospitalization for acute coronary syndrome (ACS). The degree to which this disparity can be attributed to poorly controlled HIV infection is unknown. In this large cohort study, we used the National Inpatient Sample (NIS) to compare rates of cardiac procedures among patients with asymptomatic HIV-infection, symptomatic acquired immunodeficiency syndrome (AIDS), and uninfected adults hospitalized with ACS from 2009 to 2012. Multivariable analysis was used to compare procedure rates by HIV status, with appropriate weighting to account for NIS sampling design including stratification and hospital clustering. The dataset included 1,091,759 ACS hospitalizations, 0.35% of which (n = 3783) were in HIV-infected patients. Patients with symptomatic AIDS, asymptomatic HIV, and uninfected patients differed by sex, race, and income status. Overall rates of cardiac catheterization and revascularization were 53.3% and 37.4%, respectively. In multivariable regression, we found that relative to uninfected patients, those with symptomatic AIDS were less likely to undergo catheterization (odds ratio [OR] 0.48, confidence interval [CI] 0.43–0.55), percutaneous coronary intervention (OR 0.69, CI 0.59–0.79), and coronary artery bypass grafting (0.75, CI 0.61–0.93). No difference was seen for those with asymptomatic HIV relative to uninfected patients (OR 0.93, CI 0.81–1.07; OR 1.06, CI 0.93–1.21; OR 0.88, CI 0.72–1.06, respectively). We found that lower rates of cardiovascular procedures in HIV-infected patients were primarily driven by less frequent procedures in those with AIDS. PMID:29419696

  8. Effect of centrifugation on dynamic susceptibility of magnetic fluids

    NASA Astrophysics Data System (ADS)

    Pshenichnikov, Alexander; Lebedev, Alexander; Lakhtina, Ekaterina; Kuznetsov, Andrey

    2017-06-01

    The dispersive composition, dynamic susceptibility, and spectrum of magnetization relaxation times of six magnetic fluid samples, obtained by centrifuging two base colloidal solutions of magnetite in kerosene, were investigated experimentally. The base solutions differed in the concentration of the magnetic phase and the width of the particle size distribution. A cluster analysis procedure allowing one to estimate the characteristic sizes of aggregates with uncompensated magnetic moments is described. The results of the magnetogranulometric and cluster analyses are discussed. Centrifugation was shown to have a strong effect on the physical properties of the separated fractions, which is related to the spatial redistribution of particles and multi-particle aggregates. The presence of aggregates in magnetic fluids is interpreted as the main reason for the low-frequency (0.1-10 kHz) dispersion of the dynamic susceptibility. The results favor using centrifugation as an effective means of changing the dynamic susceptibility over wide limits and of obtaining fluids with a specified type of susceptibility dispersion.

  9. HIV risk reduction intervention among traditionally circumcised young men in South Africa: a cluster randomized control trial.

    PubMed

    Peltzer, Karl; Simbayi, Leickness; Banyini, Mercy; Kekana, Queen

    2011-01-01

    The aim of this study was to test a 180-minute group HIV risk-reduction counseling intervention trial with men undergoing traditional circumcision in South Africa to reduce behavioral disinhibition (false security) as a result of the procedure. A cluster randomized controlled trial design was employed using a sample of 160 men, 80 in the experimental group and 80 in the control group. Comparisons between baseline and 3-month follow-up assessments on key behavioral outcomes were completed. We found that behavioral intentions, risk-reduction skills, and male role norms did not change in the experimental compared to the control condition. However, HIV-related stigma beliefs were significantly reduced in both conditions over time. These findings show that one small-group HIV risk-reduction intervention did not reduce sexual risk behaviors in recently traditionally circumcised men at high risk for behavioral disinhibition. Copyright © 2011 Association of Nurses in AIDS Care. Published by Elsevier Inc. All rights reserved.

  10. Planck/SDSS cluster mass and gas scaling relations for a volume-complete redMaPPer sample

    NASA Astrophysics Data System (ADS)

    Jimeno, Pablo; Diego, Jose M.; Broadhurst, Tom; De Martino, I.; Lazkoz, Ruth

    2018-07-01

    Using Planck satellite data, we construct Sunyaev-Zel'dovich (SZ) gas pressure profiles for a large, volume-complete sample of optically selected clusters. We have defined a sample of over 8000 redMaPPer clusters from the Sloan Digital Sky Survey, within the volume-complete redshift region 0.100

  11. Toward An Understanding of Cluster Evolution: A Deep X-Ray Selected Cluster Catalog from ROSAT

    NASA Technical Reports Server (NTRS)

    Jones, Christine; Oliversen, Ronald (Technical Monitor)

    2002-01-01

    In the past year, we have focussed on studying individual clusters found in this sample with Chandra, as well as using Chandra to measure the luminosity-temperature relation for a sample of distant clusters identified through the ROSAT study, and finally we are continuing our study of fossil groups. For the luminosity-temperature study, we compared a sample of nearby clusters with a sample of distant clusters and, for the first time, measured a significant change in the relation as a function of redshift (Vikhlinin et al. in final preparation for submission to Cape). We also used our ROSAT analysis to select and propose for Chandra observations of individual clusters. We are now analyzing the Chandra observations of the distant cluster A520, which appears to have undergone a recent merger. Finally, we have completed the analysis of the fossil groups identified in ROM observations. In the past few months, we have derived X-ray fluxes and luminosities as well as X-ray extents for an initial sample of 89 objects. Based on the X-ray extents and the lack of bright galaxies, we have identified 16 fossil groups. We are comparing their X-ray and optical properties with those of optically rich groups. A paper is being readied for submission (Jones, Forman, and Vikhlinin in preparation).

  12. Ecological tolerances of Miocene larger benthic foraminifera from Indonesia

    NASA Astrophysics Data System (ADS)

    Novak, Vibor; Renema, Willem

    2018-01-01

    To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis, which includes the environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples: clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 are dominated by BW samples, and cluster 5 shows a mixed composition of both MCS and BW samples. The cluster analysis results were then subjected to indicator species analysis, which yielded three groups of LBF taxa: typical assemblage indicators, regularly occurring taxa, and rare taxa. By interpreting the results of the detrended correspondence analysis, cluster analysis, and indicator species analysis, along with the environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.

  13. X-Ray Temperatures, Luminosities, and Masses from XMM-Newton Follow-up of the First Shear-selected Galaxy Cluster Sample

    NASA Astrophysics Data System (ADS)

    Deshpande, Amruta J.; Hughes, John P.; Wittman, David

    2017-04-01

    We continue the study of the first sample of shear-selected clusters from the initial 8.6 square degrees of the Deep Lens Survey (DLS); a sample with well-defined selection criteria corresponding to the highest ranked shear peaks in the survey area. We aim to characterize the weak lensing selection by examining the sample’s X-ray properties. There are multiple X-ray clusters associated with nearly all the shear peaks: 14 X-ray clusters corresponding to seven DLS shear peaks. An additional three X-ray clusters cannot be definitively associated with shear peaks, mainly due to large positional offsets between the X-ray centroid and the shear peak. Here we report on the XMM-Newton properties of the 17 X-ray clusters. The X-ray clusters display a wide range of luminosities and temperatures; the L_X − T_X relation we determine for the shear-associated X-ray clusters is consistent with X-ray cluster samples selected without regard to dynamical state, while it is inconsistent with self-similarity. For a subset of the sample, we measure X-ray masses using temperature as a proxy, and compare to weak lensing masses determined by the DLS team. The resulting mass comparison is consistent with equality. The X-ray and weak lensing masses show considerable intrinsic scatter (˜48%), which is consistent with X-ray selected samples when their X-ray and weak lensing masses are independently determined. Some of the data presented herein were obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California, and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation.
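
    Determining an L_X − T_X relation of this kind typically reduces to a power-law fit in log space; the sketch below does so on synthetic data (the sample size, pivot temperature, slope, and scatter are assumptions, not the paper's measurements).

    ```python
    # Sketch: power-law scaling-relation fit, L_X = A * (T_X / 5 keV)**B (synthetic data).
    import numpy as np

    rng = np.random.default_rng(3)
    T = rng.uniform(2.0, 9.0, size=17)                     # keV, 17 hypothetical clusters
    log_L = 44.0 + 2.7 * np.log10(T / 5.0)                 # assumed slope; self-similar would be ~2
    log_L += rng.normal(0, 0.15, size=T.size)              # intrinsic + measurement scatter

    slope, intercept = np.polyfit(np.log10(T / 5.0), log_L, 1)
    print(f"best-fit slope B = {slope:.2f}, normalisation log10(A) = {intercept:.2f}")
    ```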

  14. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali

    PubMed Central

    2012-01-01

    Background Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. Methods We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local VC, using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. Results VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Conclusions Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes. PMID:23057445

  15. WHISPER or SHOUT study: protocol of a cluster-randomised controlled trial assessing mHealth sexual reproductive health and nutrition interventions among female sex workers in Mombasa, Kenya

    PubMed Central

    Ampt, Frances H; Mudogo, Collins; Gichangi, Peter; Lim, Megan S C; Manguro, Griffins; Chersich, Matthew; Jaoko, Walter; Temmerman, Marleen; Laini, Marilyn; Comrie-Thomson, Liz; Stoové, Mark; Agius, Paul A; Hellard, Margaret; L’Engle, Kelly; Luchters, Stanley

    2017-01-01

    Introduction New interventions are required to reduce unintended pregnancies among female sex workers (FSWs) in low- and middle-income countries and to improve their nutritional health. Given sex workers’ high mobile phone usage, repeated exposure to short messaging service (SMS) messages could address individual and interpersonal barriers to contraceptive uptake and better nutrition. Methods In this two-arm cluster randomised trial, each arm constitutes an equal-attention control group for the other. SMS messages were developed systematically through a participatory, theory-driven process and cover either sexual and reproductive health (WHISPER) or nutrition (SHOUT). Messages are sent to participants 2–3 times/week for 12 months and include fact-based and motivational content as well as role model stories. Participants can send reply texts to obtain additional information. Sex work venues (clusters) in Mombasa, Kenya, were randomly sampled with a probability proportionate to venue size. Up to 10 women were recruited from each venue to enrol 860 women. FSWs aged 16–35 years who owned a mobile phone and were not pregnant at enrolment were eligible. Structured questionnaires, pregnancy tests, HIV and syphilis rapid tests and full blood counts were performed at enrolment, with subsequent visits at 6 and 12 months. Analysis The primary outcomes of WHISPER and SHOUT are unintended pregnancy incidence and prevalence of anaemia at 12 months, respectively. Each will be compared between study groups using discrete-time survival analysis. Potential limitations Contamination may occur if participants discuss their intervention with those in the other trial arm. This is mitigated by cluster recruitment and only sampling a small proportion of sex work venues from the sampling frame. Conclusions The design allows for the simultaneous testing of two independent mHealth interventions for which messaging frequency and study procedures are identical. This trial may guide future mHealth initiatives and provide methodological insights into use of reciprocal control groups. Trial registration number ACTRN12616000852459; Pre-results. PMID:28821530

  16. Clustering behavior in microbial communities from acute endodontic infections.

    PubMed

    Montagner, Francisco; Jacinto, Rogério C; Signoretti, Fernanda G C; Sanches, Paula F; Gomes, Brenda P F A

    2012-02-01

    Acute endodontic infections harbor heterogeneous microbial communities in both the root canal (RC) system and apical tissues. Data comparing the microbial structure and diversity in endodontic infections in related ecosystems, such as RC with necrotic pulp and acute apical abscess (AAA), are scarce in the literature. The aim of this study was to examine the presence of selected endodontic pathogens in paired samples from necrotic RC and AAA using polymerase chain reaction (PCR) followed by the construction of cluster profiles. Paired samples of RC and AAA exudates were collected from 20 subjects and analyzed by PCR for the presence of selected strict and facultative anaerobic strains. The frequency of species was compared between the RC and the AAA samples. A stringent neighboring clustering algorithm was applied to investigate the existence of similar high-order groups of samples. A dendrogram was constructed to show the arrangement of the sample groups produced by the hierarchical clustering. All samples harbored bacterial DNA. Porphyromonas endodontalis, Prevotella nigrescens, Filifactor alocis, and Tannerella forsythia were frequently detected in both RC and AAA samples. The selected anaerobic species were distributed in diverse small bacteria consortia. The samples of RC and AAA that presented at least one of the targeted microorganisms were grouped in small clusters. Anaerobic species were frequently detected in acute endodontic infections and heterogeneous microbial communities with low clustering behavior were observed in paired samples of RC and AAA. Copyright © 2012. Published by Elsevier Inc.

  17. Tobacco, Marijuana, and Alcohol Use in University Students: A Cluster Analysis

    PubMed Central

    Primack, Brian A.; Kim, Kevin H.; Shensa, Ariel; Sidani, Jaime E.; Barnett, Tracey E.; Switzer, Galen E.

    2012-01-01

    Objective Segmentation of populations may facilitate development of targeted substance abuse prevention programs. We aimed to partition a national sample of university students according to profiles based on substance use. Participants We used 2008–2009 data from the National College Health Assessment from the American College Health Association. Our sample consisted of 111,245 individuals from 158 institutions. Method We partitioned the sample using cluster analysis according to current substance use behaviors. We examined the association of cluster membership with individual and institutional characteristics. Results Cluster analysis yielded six distinct clusters. Three individual factors—gender, year in school, and fraternity/sorority membership—were the most strongly associated with cluster membership. Conclusions In a large sample of university students, we were able to identify six distinct patterns of substance abuse. It may be valuable to target specific populations of college-aged substance users based on individual factors. However, comprehensive intervention will require a multifaceted approach. PMID:22686360

  18. Nonlinear dimension reduction and clustering by Minimum Curvilinearity unfold neuropathic pain and tissue embryological classes.

    PubMed

    Cannistraci, Carlo Vittorio; Ravasi, Timothy; Montevecchi, Franco Maria; Ideker, Trey; Alessio, Massimo

    2010-09-15

    Nonlinear small datasets, which are characterized by low numbers of samples and very high numbers of measures, occur frequently in computational biology, and pose problems in their investigation. Unsupervised hybrid-two-phase (H2P) procedures, specifically dimension reduction (DR) coupled with clustering, provide valuable assistance, not only for unsupervised data classification, but also for visualization of the patterns hidden in high-dimensional feature space. 'Minimum Curvilinearity' (MC) is a principle that, for small datasets, suggests the approximation of curvilinear sample distances in the feature space by pair-wise distances over their minimum spanning tree (MST), and thus avoids the introduction of any tuning parameter. MC is used to design two novel forms of nonlinear machine learning (NML): Minimum Curvilinear embedding (MCE) for DR, and Minimum Curvilinear affinity propagation (MCAP) for clustering. Compared with several other unsupervised and supervised algorithms, MCE and MCAP, whether individually or combined in H2P, overcome the limits of classical approaches. High performance was attained in the visualization and classification of: (i) pain patients (proteomic measurements) in peripheral neuropathy; (ii) human organ tissues (genomic transcription factor measurements) on the basis of their embryological origin. MC provides a valuable framework to estimate nonlinear distances in small datasets. Its extension to large datasets is prefigured for novel NMLs. Classification of neuropathic pain by proteomic profiles offers new insights for future molecular and systems biology characterization of pain. Improvements in tissue embryological classification refine results obtained in an earlier study, and suggest a possible reinterpretation of skin attribution as mesodermal. https://sites.google.com/site/carlovittoriocannistraci/home.
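
    The MC principle itself is easy to prototype: build the minimum spanning tree of the samples and replace Euclidean distances by path lengths over that tree. The sketch below does this for a small synthetic arc-shaped dataset; it is not the MCE/MCAP implementation.

    ```python
    # Sketch: curvilinear distances over the minimum spanning tree (synthetic data).
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
    from scipy.spatial.distance import pdist, squareform

    rng = np.random.default_rng(4)
    t = np.sort(rng.uniform(0, np.pi, 30))                 # 30 samples along a noisy arc
    X = np.column_stack([np.cos(t), np.sin(t)]) + rng.normal(0, 0.02, (30, 2))

    D = squareform(pdist(X))                               # Euclidean distances
    mst = minimum_spanning_tree(D)                         # sparse tree with n-1 edges
    curvilinear = shortest_path(mst, directed=False)       # pairwise path lengths on the MST

    # The arc's endpoints are closer in Euclidean space than along the data manifold.
    print("Euclidean d(first, last):   ", round(D[0, -1], 3))
    print("curvilinear d(first, last): ", round(curvilinear[0, -1], 3))
    ```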

  19. Globular Cluster Star Classification: Application to M13

    NASA Astrophysics Data System (ADS)

    Caimmi, R.

    2013-06-01

    Starting from recent determination of Fe, O, Na abundances on a restricted sample (N=67) of halo and thick disk stars, a natural and well motivated selection criterion is defined for the classification of globular cluster stars. An application is performed to M13 using a sample (N=113) for which Fe, O, Na abundances have been recently inferred from observations. A comparison is made between the current and earlier M13 star classifications. Both O and Na empirical differential abundance distributions are determined for each class and for the whole sample (with the addition of Fe in the last case) and compared with their theoretical counterparts due to cosmic scatter obeying a Gaussian distribution whose parameters are inferred from related subsamples. The occurrence of an agreement between the empirical and theoretical distributions is interpreted as absence of significant chemical evolution and vice versa. The procedure is repeated with regard to four additional classes depending on whether oxygen and sodium abundance is above (stage CE) or below (stage AF) a selected threshold. Both O and Na empirical differential abundance distributions, related to the whole sample, exhibit a linear fit for the AF and CE stages. Within the errors, the oxygen slope for the CE stage is equal and of opposite sign with respect to the sodium slope for the AF stage, while the contrary holds when dealing with the oxygen slope for the AF stage with respect to the sodium slope for the CE stage. In the light of simple models of chemical evolution applied to M13, oxygen depletion appears to be mainly turned into sodium enrichment for [O/H]≥-1.35 and [Na/H]≤-1.45, while one or more largely preferred channels occur for [O/H]<-1.35 and [Na/H]>-1.45. In addition, the primordial to the current M13 mass ratio can be inferred from the true sodium yield in units of the sodium solar abundance. Though the above results are mainly qualitative due to large (±0.5 dex) uncertainties in abundance determination, the exhibited trend is nevertheless expected to be real. The proposed classification of globular cluster stars may be extended in a twofold manner, namely to: (i) elements other than Na and Fe and (ii) globular clusters other than M13.

  20. Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review.

    PubMed

    Rutterford, Clare; Taljaard, Monica; Dixon, Stephanie; Copas, Andrew; Eldridge, Sandra

    2015-06-01

    To assess the quality of reporting and accuracy of a priori estimates used in sample size calculations for cluster randomized trials (CRTs). We reviewed 300 CRTs published between 2000 and 2008. The prevalence of reporting sample size elements from the 2004 CONSORT recommendations was evaluated and a priori estimates compared with those observed in the trial. Of the 300 trials, 166 (55%) reported a sample size calculation. Only 36 of 166 (22%) reported all recommended descriptive elements. Elements specific to CRTs were the worst reported: a measure of within-cluster correlation was specified in only 58 of 166 (35%). Only 18 of 166 articles (11%) reported both a priori and observed within-cluster correlation values. Except in two cases, observed within-cluster correlation values were either close to or less than a priori values. Even with the CONSORT extension for cluster randomization, the reporting of sample size elements specific to these trials remains below that necessary for transparent reporting. Journal editors and peer reviewers should implement stricter requirements for authors to follow CONSORT recommendations. Authors should report observed and a priori within-cluster correlation values to enable comparisons between these over a wider range of trials. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  1. Effective implementation of hierarchical clustering

    NASA Astrophysics Data System (ADS)

    Verma, Mudita; Vijayarajan, V.; Sivashanmugam, G.; Bessie Amali, D. Geraldine

    2017-11-01

    Hierarchical clustering is generally used for cluster analysis, in which a hierarchy of clusters is built up; deciding which cluster should be split requires a large number of observations to be examined. Here a data set of US-based personalities is considered for clustering. After applying hierarchical clustering to the data set, we group it into three different clusters: politicians, sports persons, and musicians. In classification, the training set is the main parameter that decides which category is assigned to the observations being collected, and the categories of these observations must be known in advance; recognition follows from this formulation, and classification is the main instance of supervised learning. Clustering, on the other hand, is an instance of an unsupervised procedure: it consists of grouping data that have similar properties, whether their own or inherited from other sources.
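    For reference, hierarchical clustering of a small feature matrix can be carried out with off-the-shelf tools; the sketch below uses SciPy on synthetic data (the three-group cut and the random features are assumptions, not the US-personality data set used above).

        # Minimal hierarchical clustering sketch on synthetic data.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(c, 0.5, size=(10, 4)) for c in (0, 3, 6)])

        Z = linkage(X, method="ward")                     # build the hierarchy
        labels = fcluster(Z, t=3, criterion="maxclust")   # cut into 3 clusters
        print(labels)
        # dendrogram(Z) visualises where the hierarchy should be split.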

  2. X-ray emission from a complete sample of Abell clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Briel, Ulrich G.; Henry, J. Patrick

    1993-11-01

    The ROSAT All-Sky Survey (RASS) is used to investigate the X-ray properties of a complete sample of Abell clusters with measured redshifts and accurate positions. The sample comprises the 145 clusters within a 561 square degree region at high galactic latitude. The mean redshift is 0.17. This sample is especially well suited to be studied within the RASS since the mean exposure time is higher than average and the mean galactic column density is very low. These together produce a flux limit of about 4.2 x 10^-13 erg/sq cm/s in the 0.5 to 2.5 keV energy band. Sixty-six (46%) individual clusters are detected at a significance level higher than 99.7%, of which 7 could be chance coincidences of background or foreground sources. At redshifts greater than 0.3, six clusters out of seven (86%) are detected at the same significance level. The detected objects show a clear X-ray luminosity -- galaxy count relation with a dispersion consistent with other external estimates of the error in the counts. By analyzing the excess of positive fluctuations of the X-ray flux at the cluster positions, compared with the fluctuations of randomly drawn background fields, it is possible to extend these results below the nominal flux limit. We find 80% of richness R >= 0 and 86% of R >= 1 clusters are X-ray emitters with fluxes above 1 x 10^-13 erg/sq cm/s. Nearly 90% of the clusters meeting the requirements to be in Abell's statistical sample emit above the same level. We therefore conclude that almost all Abell clusters are real clusters and the Abell catalog is not strongly contaminated by projection effects. We use the Kaplan-Meier product limit estimator to calculate the cumulative X-ray luminosity function. We show that the shapes of the luminosity functions are similar for different richness classes, but the characteristic luminosities of richness 2 clusters are about twice those of richness 1 clusters, which are in turn about twice those of richness 0 clusters. This result is another manifestation of the luminosity -- richness relation for Abell clusters.

  3. Study of Colour Model for Segmenting Mycobacterium Tuberculosis in Sputum Images

    NASA Astrophysics Data System (ADS)

    Kurniawardhani, A.; Kurniawan, R.; Muhimmah, I.; Kusumadewi, S.

    2018-03-01

    One method of diagnosing Tuberculosis (TB) is the sputum test, in which the presence and number of Mycobacterium tuberculosis (MTB) bacilli in sputum are identified. The presence of MTB can be seen under a light microscope after the sputum samples have been stained using the Ziehl-Neelsen (ZN) technique. Because there is no standard staining procedure, the appearance of sputum samples may vary in background colour and contrast level, which increases the difficulty of the segmentation stage of automatic MTB identification. This study therefore investigated which colour channels of several colour models can segment MTB well under different staining conditions. The channels investigated are those of the RGB, HSV, CIELAB, YCbCr, and C-Y colour models, and the clustering algorithm used is k-means. The sputum image dataset used in this study was obtained from a community health clinic in a district in Indonesia. Each image was 1600x1200 pixels, and the images vary in the number of MTB, background colour, and contrast level. The experimental results indicate that, in all image conditions, the blue, hue, Cr, and R-Y colour channels can be used to segment MTB well into one cluster.
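    As a rough illustration of the approach, k-means on a single colour channel can be written in a few lines; the image path, the choice of the hue channel, and k=2 below are assumptions for the sketch rather than the study's exact pipeline.

        # k-means segmentation on the hue channel of a ZN-stained sputum image.
        import numpy as np
        from skimage import io, color
        from sklearn.cluster import KMeans

        img = io.imread("sputum_sample.png")            # assumed RGB image file
        hue = color.rgb2hsv(img)[:, :, 0].reshape(-1, 1)

        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(hue)
        mask = labels.reshape(img.shape[:2])            # per-pixel cluster map
        # The cluster whose mean hue matches the carbol-fuchsin stain would be
        # taken as the candidate MTB foreground.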

  4. Training a Network of Electronic Neurons for Control of a Mobile Robot

    NASA Astrophysics Data System (ADS)

    Vromen, T. G. M.; Steur, E.; Nijmeijer, H.

    An adaptive training procedure is developed for a network of electronic neurons, which controls a mobile robot driving around in an unknown environment while avoiding obstacles. The neuronal network controls the angular velocity of the wheels of the robot based on the sensor readings. The nodes in the neuronal network controller are clusters of neurons rather than single neurons. The adaptive training procedure ensures that the input-output behavior of the clusters is identical, even though the constituting neurons are nonidentical and have, in isolation, nonidentical responses to the same input. In particular, we let the neurons interact via a diffusive coupling, and the proposed training procedure modifies the diffusion interaction weights such that the neurons behave synchronously with a predefined response. The working principle of the training procedure is experimentally validated and results of an experiment with a mobile robot that is completely autonomously driving in an unknown environment with obstacles are presented.

  5. Methods for sample size determination in cluster randomized trials

    PubMed Central

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-01-01

    Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515
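    The simplest approach described above, inflating an individually randomized sample size by the design effect 1 + (m - 1) x ICC, can be sketched as follows; the effect size, power, cluster size, and ICC values are illustrative assumptions.

        # Design-effect-based sample size for a parallel-group CRT (sketch).
        from math import ceil
        from scipy.stats import norm

        def n_individual(delta, sd, alpha=0.05, power=0.8):
            """Per-arm n for a two-sample comparison of means."""
            z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
            return 2 * (z * sd / delta) ** 2

        def n_cluster_randomized(delta, sd, m, icc, alpha=0.05, power=0.8):
            """Inflate the individually randomized n by 1 + (m - 1) * ICC."""
            deff = 1 + (m - 1) * icc
            return ceil(n_individual(delta, sd, alpha, power) * deff)

        print(n_cluster_randomized(delta=0.5, sd=1.0, m=20, icc=0.05))  # per arm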

  6. Paleomagnetically inferred ages of a cluster of Holocene monogenetic eruptions in the Tacámbaro-Puruarán area (Michoacán, México): Implications for volcanic hazards

    NASA Astrophysics Data System (ADS)

    Mahgoub, Ahmed Nasser; Böhnel, Harald; Siebe, Claus; Salinas, Sergio; Guilbaud, Marie-Noëlle

    2017-11-01

    The paleomagnetic dating procedure was applied to a cluster of four partly overlapping monogenetic Holocene volcanoes and associated lava flows, namely La Tinaja, La Palma, Mesa La Muerta, and Malpaís de Cutzaróndiro, located in the Tacámbaro-Puruarán area, at the southeastern margin of the Michoacán-Guanajuato volcanic field. For this purpose, 21 sites distributed as far apart as possible from each other were sampled to obtain a well-averaged mean paleomagnetic direction for each single lava flow. For intensity determinations, double-heating Thellier experiments using the IZZI protocol were conducted on 55 selected samples. La Tinaja is the oldest of these flows and was dated by the 14C method at 5115 ± 130 years BP (cal 4184-3655 BCE). It is stratigraphically underneath the other three flows, with the Malpaís de Cutzaróndiro lava flow being the youngest. The paleomagnetic dating procedure was applied using the Matlab archaeo-dating tool coupled with the geomagnetic field model SHA.DIF.14k. Accordingly, for La Tinaja several possible age ranges were obtained, of which the range 3650-3480 BCE is closest to the 14C age. Paleomagnetic dating on La Palma produced a unique age range of 3220-2880 BCE. Two age ranges of 2240-2070 BCE and 760-630 BCE were obtained for Mesa La Muerta and a well-constrained age of 420-320 BCE for Malpaís de Cutzaróndiro. Although systematic archaeological excavations have so far not been carried out in this area, it is possible that the younger eruptions were contemporaneous with local human occupation. Our paleomagnetic dates indicate that all four eruptions, although closely clustered in space, occurred separately in time with varying recurrence intervals ranging between 300 and 2300 years. This finding should be considered when constraining the nature of the magmatic plumbing system and developing a strategy aimed at reducing risk in the volcanically active Michoacán-Guanajuato volcanic field, where several young monogenetic volcano clusters have been identified recently. These enigmatic small "flare-ups" (outbursts of small pods of magma in geologically short periods of time within a small area) have also been encountered in other subduction-related volcanic fields around the globe (e.g. Cascades arc in the western U.S.A.) and still need to be investigated by geophysical and petrological means in order to understand their origin.

  7. Spatially explicit population estimates for black bears based on cluster sampling

    USGS Publications Warehouse

    Humm, J.; McCown, J. Walter; Scheick, B.K.; Clark, Joseph D.

    2017-01-01

    We estimated abundance and density of the 5 major black bear (Ursus americanus) subpopulations (i.e., Eglin, Apalachicola, Osceola, Ocala-St. Johns, Big Cypress) in Florida, USA with spatially explicit capture-mark-recapture (SCR) by extracting DNA from hair samples collected at barbed-wire hair sampling sites. We employed a clustered sampling configuration with sampling sites arranged in 3 × 3 clusters spaced 2 km apart within each cluster and cluster centers spaced 16 km apart (center to center). We surveyed all 5 subpopulations encompassing 38,960 km2 during 2014 and 2015. Several landscape variables, most associated with forest cover, helped refine density estimates for the 5 subpopulations we sampled. Detection probabilities were affected by site-specific behavioral responses coupled with individual capture heterogeneity associated with sex. Model-averaged bear population estimates ranged from 120 (95% CI = 59–276) bears or a mean 0.025 bears/km2 (95% CI = 0.011–0.44) for the Eglin subpopulation to 1,198 bears (95% CI = 949–1,537) or 0.127 bears/km2 (95% CI = 0.101–0.163) for the Ocala-St. Johns subpopulation. The total population estimate for our 5 study areas was 3,916 bears (95% CI = 2,914–5,451). The clustered sampling method coupled with information on land cover was efficient and allowed us to estimate abundance across extensive areas that would not have been possible otherwise. Clustered sampling combined with spatially explicit capture-recapture methods has the potential to provide rigorous population estimates for a wide array of species that are extensive and heterogeneous in their distribution.
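    The clustered detector layout described above is straightforward to reproduce in code; the sketch below lays out cluster centres on a 16-km grid, each holding a 3 x 3 array of sites spaced 2 km apart, over an assumed rectangular study area (coordinates in km).

        # Generate a clustered hair-snare sampling design (sketch, assumed extent).
        import numpy as np

        def cluster_sites(cx, cy, n_side=3, spacing=2.0):
            offs = (np.arange(n_side) - (n_side - 1) / 2) * spacing
            return [(cx + dx, cy + dy) for dx in offs for dy in offs]

        centers = [(x, y) for x in np.arange(0, 64, 16) for y in np.arange(0, 64, 16)]
        sites = [s for c in centers for s in cluster_sites(*c)]
        print(len(centers), "clusters,", len(sites), "sampling sites")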

  8. Tracing Large Scale Structure with a Redshift Survey of Rich Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Batuski, D.; Slinglend, K.; Haase, S.; Hill, J. M.

    1993-12-01

    Rich clusters of galaxies from Abell's catalog show evidence of structure on scales of 100 Mpc and hold promise of confirming the existence of structure in the more immediate universe on scales corresponding to COBE results (i.e., on the order of 10% or more of the horizon size of the universe). However, most Abell clusters do not as yet have measured redshifts (or, in the case of most low redshift clusters, have only one or two galaxies measured), so present knowledge of their three dimensional distribution has quite large uncertainties. The shortage of measured redshifts for these clusters may also mask a problem of projection effects corrupting the membership counts for the clusters, perhaps even to the point of spurious identifications of some of the clusters themselves. Our approach in this effort has been to use the MX multifiber spectrometer to measure redshifts of at least ten galaxies in each of about 80 Abell cluster fields with richness class R>= 1 and mag10 <= 16.8. This work will result in a somewhat deeper, much more complete (and reliable) sample of positions of rich clusters. Our primary use for the sample is for two-point correlation and other studies of the large scale structure traced by these clusters. We are also obtaining enough redshifts per cluster so that a much better sample of reliable cluster velocity dispersions will be available for other studies of cluster properties. To date, we have collected such data for 40 clusters, and for most of them, we have seven or more cluster members with redshifts, allowing for reliable velocity dispersion calculations. Velocity histograms for several interesting cluster fields are presented, along with summary tables of cluster redshift results. Also, with 10 or more redshifts in most of our cluster fields (30 arcmin on a side, just about an 'Abell diameter' at z ~ 0.1), we have investigated the extent of projection effects within the Abell catalog in an effort to quantify and understand how this may affect the Abell sample.

  9. The Mass Function in h+χ Persei

    NASA Astrophysics Data System (ADS)

    Bragg, Ann; Kenyon, Scott

    2000-08-01

    Knowledge of the stellar initial mass function (IMF) is critical to understanding star formation and galaxy evolution. Past studies of the IMF in open clusters have primarily used luminosity functions to determine mass functions, frequently in relatively sparse clusters. Our goal with this project is to derive a reliable, well-sampled IMF for a pair of very dense young clusters (h+χ Persei) with ages of 1-2 × 10^7 yr (e.g., Vogt A&A 11:359), where stellar evolution theory is robust. We will construct the HR diagram using both photometry and spectral types to derive more accurate stellar masses and ages than are possible using photometry alone. Results from the two clusters will be compared to examine the universality of the IMF. We currently have a spectroscopic sample covering an area within 9 arc-minutes of the center of each cluster taken with the FAST Spectrograph. The sample is complete to V=15.4 and contains ~ 1000 stars. We request 2 nights at WIYN/HYDRA to extend this sample to deeper magnitudes, allowing us to determine the IMF of the clusters to a lower limiting mass and to search for a pre-main sequence, theoretically predicted to be present for clusters of this age. Note that both clusters are contained within a single HYDRA field.

  10. Recognition of genetically modified product based on affinity propagation clustering and terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Jianjun; Kan, Jianquan

    2018-04-01

    In this paper, a new method for identifying genetically modified material from terahertz spectra is proposed, combining a support vector machine (SVM) with affinity propagation clustering. The algorithm uses affinity propagation to cluster and label unlabeled training samples, and the SVM training data are continuously updated during the iterative process. Because the identification model does not require manually labeled training samples, the error introduced by human labeling is reduced and the identification accuracy of the model is greatly improved.
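    The two-step idea (affinity propagation supplies labels, an SVM learns from them) can be sketched as below; the simulated spectra stand in for real terahertz data, and this one-pass version omits the paper's iterative update of the SVM training set.

        # Affinity propagation pseudo-labelling followed by SVM training (sketch).
        import numpy as np
        from sklearn.cluster import AffinityPropagation
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        spectra = np.vstack([rng.normal(0, 1, (40, 120)),    # e.g. non-GM samples
                             rng.normal(2, 1, (40, 120))])   # e.g. GM samples

        ap = AffinityPropagation(random_state=0).fit(spectra)
        pseudo_labels = ap.labels_          # cluster labels, no manual labelling

        clf = SVC(kernel="rbf").fit(spectra, pseudo_labels)
        print(clf.predict(spectra[:5]))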

  11. Open star clusters and Galactic structure

    NASA Astrophysics Data System (ADS)

    Joshi, Yogesh C.

    2018-04-01

    In order to understand the Galactic structure, we perform a statistical analysis of the distribution of various cluster parameters based on the most nearly complete sample of Galactic open clusters available to date. The geometrical and physical characteristics of a large number of open clusters given in the MWSC catalogue are used to study the spatial distribution of clusters in the Galaxy and to determine the scale height, solar offset, local mass density, and distribution of reddening material in the solar neighbourhood. We also explore the mass-radius and mass-age relations in Galactic open star clusters. We find that the estimated parameters of the Galactic disk are largely influenced by the choice of cluster sample.

  12. Declustering of clustered preferential sampling for histogram and semivariogram inference

    USGS Publications Warehouse

    Olea, R.A.

    2007-01-01

    Measurements of attributes obtained more as a consequence of business ventures than sampling design frequently result in samplings that are preferential both in location and value, typically in the form of clusters along the pay. Preferential sampling requires preprocessing for the purpose of properly inferring characteristics of the parent population, such as the cumulative distribution and the semivariogram. Consideration of the distance to the nearest neighbor allows preparation of resampled sets that produce comparable results to those from previously proposed methods. Clustered sampling of size 140, taken from an exhaustive sampling, is employed to illustrate this approach. ?? International Association for Mathematical Geology 2007.
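    One simple way to act on the nearest-neighbour idea mentioned above is to weight each sample by its distance to its closest neighbour, so that tightly clustered observations contribute less to histogram inference; the synthetic preferentially clustered data below are an assumption for illustration, not the author's resampling procedure.

        # Nearest-neighbour declustering weights (sketch on synthetic data).
        import numpy as np
        from scipy.spatial import cKDTree

        rng = np.random.default_rng(0)
        background = rng.uniform(0, 100, size=(60, 2))
        cluster = rng.normal(70, 2, size=(80, 2))     # dense cluster along the "pay"
        xy = np.vstack([background, cluster])
        values = rng.normal(size=len(xy))

        d_nn, _ = cKDTree(xy).query(xy, k=2)          # column 1: nearest other point
        weights = d_nn[:, 1] / d_nn[:, 1].sum()       # isolated samples weigh more

        print("declustered mean:", np.sum(weights * values))
        print("naive mean:      ", values.mean())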

  13. ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations

    PubMed Central

    Wright, Mark H.; Tung, Chih-Wei; Zhao, Keyan; Reynolds, Andy; McCouch, Susan R.; Bustamante, Carlos D.

    2010-01-01

    Motivation: The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is current methods for automated calling of genotypes are based on clustering approaches which require a large number of samples to be analyzed simultaneously, or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly due to the inability to clearly identify a heterozygote cluster. Results: As part of the development of two custom single nucleotide polymorphism genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called ‘ALCHEMY’ based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per sample basis allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously. Since clustering is not used explicitly, ALCHEMY performs well on small sample sizes with accuracy exceeding 99% with as few as 18 samples. Availability: ALCHEMY is available for both commercial and academic use free of charge and distributed under the GNU General Public License at http://alchemy.sourceforge.net/ Contact: mhw6@cornell.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20926420

  14. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  15. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  16. A Typology of Burnout in Professional Counselors

    ERIC Educational Resources Information Center

    Lee, Sang Min; Cho, Seong Ho; Kissinger, Daniel; Ogle, Nick T.

    2010-01-01

    The authors used a cluster analysis procedure and the Counselor Burnout Inventory (S. M. Lee et al., 2007) to identify professional counselors' burnout types. Three clusters were identified: well-adjusted, persevering, and disconnected counselors. The results also indicated that counselors' job satisfaction and self-esteem were good discriminators…

  17. Procedural Guide for Designation Surveys of Ocean Dredged Material Disposal Sites. Revision

    DTIC Science & Technology

    1990-04-01

    ...data standardization." One of the most frequently used clustering strategies is called UPGMA (unweighted pair-group method using arithmetic averages...Sneath and Sokal 1973). Romesburg (1984) evaluated many possible methods and concluded that UPGMA is appropriate for most types of cluster analysis.
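    UPGMA corresponds to average linkage on a (dis)similarity matrix, so a minimal modern equivalent can be written with SciPy; the standardized abundance matrix below is an assumption echoing the data standardization mentioned in the fragment above.

        # UPGMA (average linkage) clustering of standardized station data (sketch).
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.stats import zscore

        rng = np.random.default_rng(0)
        abundance = rng.poisson(5, size=(25, 12)).astype(float)   # stations x taxa

        Z = linkage(zscore(abundance, axis=0), method="average")  # UPGMA
        groups = fcluster(Z, t=4, criterion="maxclust")
        print(groups)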

  18. Mail merge can be used to create personalized questionnaires in complex surveys.

    PubMed

    Taljaard, Monica; Chaudhry, Shazia Hira; Brehaut, Jamie C; Weijer, Charles; Grimshaw, Jeremy M

    2015-10-16

    Low response rates and inadequate question comprehension threaten the validity of survey results. We describe a simple procedure to implement personalized-as opposed to generically worded-questionnaires in the context of a complex web-based survey of corresponding authors of a random sample of 300 published cluster randomized trials. The purpose of the survey was to gather more detailed information about informed consent procedures used in the trial, over and above basic information provided in the trial report. We describe our approach-which allowed extensive personalization without the need for specialized computer technology-and discuss its potential application in similar settings. The mail merge feature of standard word processing software was used to generate unique, personalized questionnaires for each author by incorporating specific information from the article, including naming the randomization unit (e.g., family practice, school, worksite), and identifying specific individuals who may have been considered research participants at the cluster level (family doctors, teachers, employers) and individual level (patients, students, employees) in questions regarding informed consent procedures in the trial. The response rate was relatively high (64%, 182/285) and did not vary significantly by author, publication, or study characteristics. The refusal rate was low (7%). While controlled studies are required to examine the specific effects of our approach on comprehension, quality of responses, and response rates, we showed how mail merge can be used as a simple but useful tool to add personalized fields to complex survey questionnaires, or to request additional information required from study authors. One potential application is in eliciting specific information about published articles from study authors when conducting systematic reviews and meta-analyses.
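    A minimal stand-in for the mail-merge step is to fill a questionnaire template from a spreadsheet of trial-specific fields; the CSV columns and wording below are illustrative assumptions, not the authors' instrument or the Word mail-merge feature itself.

        # Fill a personalized question template from extracted trial data (sketch).
        import csv
        from string import Template

        question = Template(
            "Dear $author, regarding your trial randomized by $unit: was informed "
            "consent sought from the $cluster_member(s), and separately from the "
            "individual $individual(s)?"
        )

        with open("trials.csv", newline="") as fh:    # assumed extraction file
            for row in csv.DictReader(fh):
                print(question.substitute(row))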

  19. Review of Instructional Approaches in Ethics Education.

    PubMed

    Mulhearn, Tyler J; Steele, Logan M; Watts, Logan L; Medeiros, Kelsey E; Mumford, Michael D; Connelly, Shane

    2017-06-01

    Increased investment in ethics education has prompted a variety of instructional objectives and frameworks. Yet, no systematic procedure to classify these varying instructional approaches has been attempted. In the present study, a quantitative clustering procedure was conducted to derive a typology of instruction in ethics education. In total, 330 ethics training programs were included in the cluster analysis. The training programs were appraised with respect to four instructional categories including instructional content, processes, delivery methods, and activities. Eight instructional approaches were identified through this clustering procedure, and these instructional approaches showed different levels of effectiveness. Instructional effectiveness was assessed based on one of nine commonly used ethics criteria. With respect to specific training types, Professional Decision Processes Training (d = 0.50) and Field-Specific Compliance Training (d = 0.46) appear to be viable approaches to ethics training based on Cohen's d effect size estimates. By contrast, two commonly used approaches, General Discussion Training (d = 0.31) and Norm Adherence Training (d = 0.37), were found to be considerably less effective. The implications for instruction in ethics training are discussed.

  20. Rapid quality assessment of Radix Aconiti Preparata using direct analysis in real time mass spectrometry.

    PubMed

    Zhu, Hongbin; Wang, Chunyan; Qi, Yao; Song, Fengrui; Liu, Zhiqiang; Liu, Shuying

    2012-11-08

    This study presents a novel and rapid method to identify chemical markers for the quality control of Radix Aconiti Preparata, a widely used traditional herbal medicine. In the method, samples prepared with a fast extraction procedure were analyzed using direct analysis in real time mass spectrometry (DART MS) combined with multivariate data analysis. At present, the quality assessment of Radix Aconiti Preparata is based on the two processing methods recorded in the Chinese Pharmacopoeia for the purpose of reducing the toxicity of Radix Aconiti and ensuring its clinical therapeutic efficacy. To ensure safety and efficacy in clinical use, the degree of processing of Radix Aconiti should be well controlled and assessed. In this paper, hierarchical cluster analysis and principal component analysis were performed to evaluate the DART MS data of Radix Aconiti Preparata samples at different processing times. The results showed that properly processed Radix Aconiti Preparata, improperly processed samples, and raw Radix Aconiti could be clustered reasonably according to their constituents. The loading plot shows that the chemical markers having the most influence on the discrimination between qualified and unqualified samples were mainly monoester diterpenoid aconitines and diester diterpenoid aconitines, i.e. benzoylmesaconine, hypaconitine, mesaconitine, neoline, benzoylhypaconine, benzoylaconine, fuziline, aconitine and 10-OH-mesaconitine. The established DART MS approach in combination with multivariate data analysis provides a flexible and reliable method for the quality assessment of toxic herbal medicines. Copyright © 2012 Elsevier B.V. All rights reserved.

  1. Spectroscopic studies of clusterization of methanol molecules isolated in a nitrogen matrix

    NASA Astrophysics Data System (ADS)

    Vaskivskyi, Ye.; Doroshenko, I.; Chernolevska, Ye.; Pogorelov, V.; Pitsevich, G.

    2017-12-01

    IR absorption spectra of methanol isolated in a nitrogen matrix are recorded at temperatures ranging from 9 to 34 K. The changes in the spectra with increasing matrix temperature are analyzed. Based on quantum-chemical calculations of the geometric and spectral parameters of different methanol clusters, the observed absorption bands are identified. The cluster composition of the sample is determined at each temperature. It is shown that as the matrix is heated there is a redistribution among the different cluster structures in the sample, from smaller to larger clusters.

  2. Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

    PubMed

    Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

    2017-11-01

    The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets apprehended by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help interpret the clustering results. We found that two clusters best describe the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using leave-one-out cross-validation. The model used only the Nd, Ni, and Pb concentration values in classifying the samples. © 2017 American Academy of Forensic Sciences.
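    The two-stage analysis (k-means with two clusters, then a decision tree checked by leave-one-out cross-validation) can be sketched as follows; scikit-learn's CART tree stands in for C4.5, and the simulated element table is an assumption.

        # k-means clusters plus a decision tree evaluated with leave-one-out CV.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import cross_val_score, LeaveOneOut

        rng = np.random.default_rng(0)
        elements = np.vstack([rng.lognormal(0, 1, (30, 25)),
                              rng.lognormal(1, 1, (30, 25))])  # tablets x elements

        clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(elements)

        tree = DecisionTreeClassifier(random_state=0)
        acc = cross_val_score(tree, elements, clusters, cv=LeaveOneOut())
        print("leave-one-out accuracy:", acc.mean())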

  3. EVIDENCE FOR THE UNIVERSALITY OF PROPERTIES OF RED-SEQUENCE GALAXIES IN X-RAY- AND RED-SEQUENCE-SELECTED CLUSTERS AT z ∼ 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Foltz, R.; Wilson, G.; DeGroot, A.

    We study the slope, intercept, and scatter of the color–magnitude and color–mass relations for a sample of 10 infrared red-sequence-selected clusters at z ∼ 1. The quiescent galaxies in these clusters formed the bulk of their stars above z ≳ 3 with an age spread Δt ≳ 1 Gyr. We compare UVJ color–color and spectroscopic-based galaxy selection techniques, and find a 15% difference in the galaxy populations classified as quiescent by these methods. We compare the color–magnitude relations from our red-sequence selected sample with X-ray- and photometric-redshift-selected cluster samples of similar mass and redshift. Within uncertainties, we are unable to detect any difference in the ages and star formation histories of quiescent cluster members in clusters selected by different methods, suggesting that the dominant quenching mechanism is insensitive to cluster baryon partitioning at z ∼ 1.

  4. Measuring consistent masses for 25 Milky Way globular clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kimmig, Brian; Seth, Anil; Ivans, Inese I.

    2015-02-01

    We present central velocity dispersions, masses, mass-to-light ratios (M/Ls), and rotation strengths for 25 Galactic globular clusters (GCs). We derive radial velocities of 1951 stars in 12 GCs from single order spectra taken with Hectochelle on the MMT telescope. To this sample we add an analysis of available archival data of individual stars. For the full set of data we fit King models to derive consistent dynamical parameters for the clusters. We find good agreement between single-mass King models and the observed radial dispersion profiles. The large, uniform sample of dynamical masses we derive enables us to examine trends of M/L with cluster mass and metallicity. The overall values of M/L and the trends with mass and metallicity are consistent with existing measurements from a large sample of M31 clusters. This includes a clear trend of increasing M/L with cluster mass and lower than expected M/Ls for the metal-rich clusters. We find no clear trend of increasing rotation with increasing cluster metallicity suggested in previous work.

  5. Nonlinear dimensionality reduction of data lying on the multicluster manifold.

    PubMed

    Meng, Deyu; Leung, Yee; Fung, Tung; Xu, Zongben

    2008-08-01

    A new method, which is called decomposition-composition (D-C) method, is proposed for the nonlinear dimensionality reduction (NLDR) of data lying on the multicluster manifold. The main idea is first to decompose a given data set into clusters and independently calculate the low-dimensional embeddings of each cluster by the decomposition procedure. Based on the intercluster connections, the embeddings of all clusters are then composed into their proper positions and orientations by the composition procedure. Different from other NLDR methods for multicluster data, which consider associatively the intracluster and intercluster information, the D-C method capitalizes on the separate employment of the intracluster neighborhood structures and the intercluster topologies for effective dimensionality reduction. This, on one hand, isometrically preserves the rigid-body shapes of the clusters in the embedding process and, on the other hand, guarantees the proper locations and orientations of all clusters. The theoretical arguments are supported by a series of experiments performed on the synthetic and real-life data sets. In addition, the computational complexity of the proposed method is analyzed, and its efficiency is theoretically analyzed and experimentally demonstrated. Related strategies for automatic parameter selection are also examined.

  6. Galaxy Cluster Mass Reconstruction Project – III. The impact of dynamical substructure on cluster mass estimates

    DOE PAGES

    Old, L.; Wojtak, R.; Pearce, F. R.; ...

    2017-12-20

    With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an ‘unrelaxed’ state. This issue affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower mass clusters, by ~10 percent at 10^14 and ≳20 percent for ≲10^13.5. Finally, the use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters up to a level comparable to the typical uncertainties in current cosmological studies.

  7. Galaxy Cluster Mass Reconstruction Project – III. The impact of dynamical substructure on cluster mass estimates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Old, L.; Wojtak, R.; Pearce, F. R.

    With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an ‘unrelaxed’ state. This issue affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower mass clusters, by ~10 percent at 10^14 and ≳20 percent for ≲10^13.5. Finally, the use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters up to a level comparable to the typical uncertainties in current cosmological studies.

  8. Using the XMM-Newton Optical Monitor to Study Cluster Galaxy Evolution

    NASA Technical Reports Server (NTRS)

    Miller, Neal A.; O'Steen, Richard; Yen, Steffi; Kuntz, K. D.; Hammer, Derek

    2012-01-01

    We explore the application of XMM Newton Optical Monitor (XMM-OM) ultraviolet (UV) data to study galaxy evolution. Our sample is constructed as the intersection of all Abell clusters with z < 0.05 and having archival XMM-OM data in either the UVM2 or UVW1 filters, plus optical and UV photometry from the Sloan Digital Sky Survey and GALEX, respectively. The 11 resulting clusters include 726 galaxies with measured redshifts, 520 of which have redshifts placing them within their parent Abell clusters. We develop procedures for manipulating the XMM-OM images and measuring galaxy photometry from them, and we confirm our results via comparison with published catalogs. Color-magnitude diagrams (CMDs) constructed using the XMM-OM data along with SDSS optical data show promise for evolutionary studies, with good separation between red and blue sequences and real variation in the width of the red sequence that is likely indicative of differences in star formation history. This is particularly true for UVW1 data, as the relative abundance of data collected using this filter and its depth make it an attractive choice. Available tools that use stellar synthesis libraries to fit the UV and optical photometric data may also be used, thereby better describing star formation history within the past billion years and providing estimates of total stellar mass that include contributions from young stars. Finally, color-color diagrams that include XMM-OM UV data appear useful to the photometric identification of both extragalactic and stellar sources.

  9. Using the XMM-Newton Optical Monitor to Study Cluster Galaxy Evolution

    NASA Astrophysics Data System (ADS)

    Miller, Neal A.; O'Steen, Richard; Yen, Steffi; Kuntz, K. D.; Hammer, Derek

    2012-02-01

    We explore the application of XMM-Newton Optical Monitor (XMM-OM) ultraviolet (UV) data to study galaxy evolution. Our sample is constructed as the intersection of all Abell clusters with z < 0.05 and having archival XMM-OM data in either the UVM2 or UVW1 filters, plus optical and UV photometry from the Sloan Digital Sky Survey and GALEX, respectively. The 11 resulting clusters include 726 galaxies with measured redshifts, 520 of which have redshifts placing them within their parent Abell clusters. We develop procedures for manipulating the XMM-OM images and measuring galaxy photometry from them, and we confirm our results via comparison with published catalogs. Color-magnitude diagrams (CMDs) constructed using the XMM-OM data along with SDSS optical data show promise for evolutionary studies, with good separation between red and blue sequences and real variation in the width of the red sequence that is likely indicative of differences in star formation history. This is particularly true for UVW1 data, as the relative abundance of data collected using this filter and its depth make it an attractive choice. Available tools that use stellar synthesis libraries to fit the UV and optical photometric data may also be used, thereby better describing star formation history within the past billion years and providing estimates of total stellar mass that include contributions from young stars. Finally, color-color diagrams that include XMM-OM UV data appear useful to the photometric identification of both extragalactic and stellar sources.

  10. Does reflective functioning mediate the relationship between attachment and personality?

    PubMed

    Nazzaro, Maria Paola; Boldrini, Tommaso; Tanzilli, Annalisa; Muzi, Laura; Giovanardi, Guido; Lingiardi, Vittorio

    2017-10-01

    Mentalization, operationalized as reflective functioning (RF), can play a crucial role in the psychological mechanisms underlying personality functioning. This study aimed to: (a) study the association between RF, personality disorders (cluster level) and functioning; (b) investigate whether RF and personality functioning are influenced by (secure vs. insecure) attachment; and (c) explore the potential mediating effect of RF on the relationship between attachment and personality functioning. The Shedler-Westen Assessment Procedure (SWAP-200) was used to assess personality disorders and levels of psychological functioning in a clinical sample (N = 88). Attachment and RF were evaluated with the Adult Attachment Interview (AAI) and Reflective Functioning Scale (RFS). Findings showed that RF had significant negative associations with cluster A and B personality disorders, and a significant positive association with psychological functioning. Moreover, levels of RF and personality functioning were influenced by attachment patterns. Finally, RF completely mediated the relationship between (secure/insecure) attachment and adaptive psychological features, and thus accounted for differences in overall personality functioning. Lack of mentalization seemed strongly associated with vulnerabilities in personality functioning, especially in patients with cluster A and B personality disorders. These findings provide support for the development of therapeutic interventions to improve patients' RF. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Analyzing coastal environments by means of functional data analysis

    NASA Astrophysics Data System (ADS)

    Sierra, Carlos; Flor-Blanco, Germán; Ordoñez, Celestino; Flor, Germán; Gallego, José R.

    2017-07-01

    Here we used Functional Data Analysis (FDA) to examine particle-size distributions (PSDs) in a beach/shallow marine sedimentary environment in Gijón Bay (NW Spain). The work involved both Functional Principal Components Analysis (FPCA) and Functional Cluster Analysis (FCA). The grain size of the sand samples was characterized by means of laser dispersion spectroscopy. Within this framework, FPCA was used as a dimension reduction technique to explore and uncover patterns in grain-size frequency curves. This procedure proved useful to describe variability in the structure of the data set. Moreover, an alternative approach, FCA, was applied to identify clusters and to interpret their spatial distribution. Results obtained with this latter technique were compared with those obtained by means of two vector approaches that combine PCA with CA (Cluster Analysis). The first method, the point density function (PDF), was employed after fitting a log-normal distribution to each PSD and summarizing each of the density functions by its mean, sorting, skewness and kurtosis. The second applied a centered log-ratio (clr) transform to the original data. PCA was then applied to the transformed data, and finally CA to the retained principal component scores. The study revealed functional data analysis, specifically FPCA and FCA, as a suitable alternative with considerable advantages over traditional vector analysis techniques in sedimentary geology studies.
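    The second vector approach mentioned above (clr transform, then PCA, then clustering of the retained scores) can be sketched directly; the simulated compositional matrix below is an assumption standing in for the measured grain-size distributions.

        # clr transform -> PCA -> clustering of the scores (sketch on assumed data).
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        psd = rng.dirichlet(np.ones(20) * 2, size=50)   # 50 samples x 20 size bins

        clr = np.log(psd) - np.log(psd).mean(axis=1, keepdims=True)
        scores = PCA(n_components=3).fit_transform(clr)
        labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
        print(labels)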

  12. Segmenting Student Markets with a Student Satisfaction and Priorities Survey.

    ERIC Educational Resources Information Center

    Borden, Victor M. H.

    1995-01-01

    A market segmentation analysis of 872 university students compared 2 hierarchical clustering procedures for deriving market segments: 1 using matching-type measures and an agglomerative clustering algorithm, and 1 using the chi-square based automatic interaction detection. Results and implications for planning, evaluating, and improving academic…

  13. The Equivalence of Three Statistical Packages for Performing Hierarchical Cluster Analysis

    ERIC Educational Resources Information Center

    Blashfield, Roger

    1977-01-01

    Three different software programs which contain hierarchical agglomerative cluster analysis procedures were shown to generate different solutions on the same data set using apparently the same options. The basis for the differences in the solutions was the formulae used to calculate Euclidean distance. (Author/JKS)

  14. Identifying Peer Institutions Using Cluster Analysis

    ERIC Educational Resources Information Center

    Boronico, Jess; Choksi, Shail S.

    2012-01-01

    The New York Institute of Technology's (NYIT) School of Management (SOM) wishes to develop a list of peer institutions for the purpose of benchmarking and monitoring/improving performance against other business schools. The procedure utilizes relevant criteria for the purpose of establishing this peer group by way of a cluster analysis. The…

  15. Fully Automated Single-Zone Elliptic Grid Generation for Mars Science Laboratory (MSL) Aeroshell and Canopy Geometries

    NASA Technical Reports Server (NTRS)

    kaul, Upender K.

    2008-01-01

    A procedure for generating smooth uniformly clustered single-zone grids using enhanced elliptic grid generation has been demonstrated here for the Mars Science Laboratory (MSL) geometries such as aeroshell and canopy. The procedure obviates the need for generating multizone grids for such geometries, as reported in the literature. This has been possible because the enhanced elliptic grid generator automatically generates clustered grids without manual prescription of decay parameters needed with the conventional approach. In fact, these decay parameters are calculated as decay functions as part of the solution, and they are not constant over a given boundary. Since these decay functions vary over a given boundary, orthogonal grids near any arbitrary boundary can be clustered automatically without having to break up the boundaries and the corresponding interior domains into various zones for grid generation.

  16. VizieR Online Data Catalog: 44 SZ-selected galaxy clusters ACT observations (Sifon+, 2016)

    NASA Astrophysics Data System (ADS)

    Sifon, C.; Battaglia, N.; Hasselfield, M.; Menanteau, F.; Barrientos, L. F.; Bond, J. R.; Crichton, D.; Devlin, M. J.; Dunner, R.; Hilton, M.; Hincks, A. D.; Hlozek, R.; Huffenberger, K. M.; Hughes, J. P.; Infante, L.; Kosowsky, A.; Marsden, D.; Marriage, T. A.; Moodley, K.; Niemack, M. D.; Page, L. A.; Spergel, D. N.; Staggs, S. T.; Trac, H.; Wollack, E. J.

    2017-11-01

    ACT is a 6-metre off-axis Gregorian telescope located at an altitude of 5200 m in the Atacama desert in Chile, designed to observe the CMB at arcminute resolution. Galaxy clusters were detected in the 148 GHz band by matched-filtering the maps with the pressure profile suggested by Arnaud et al. (2010A&A...517A..92A), fit to X-ray selected local (z<0.2) clusters, with varying cluster sizes, θ500, from 1.18 to 27 arcmin. Because of the complete overlap of ACT equatorial observations with Sloan Digital Sky Survey Data Release 8 (SDSS DR8; Aihara et al., 2011ApJS..193...29A) imaging, all cluster candidates were assessed with optical data (Menanteau et al., 2013ApJ...765...67M). We observed 20 clusters from the equatorial sample with the Gemini Multi-Object Spectrograph (GMOS) on the Gemini-South telescope, split between semesters 2011B (ObsID:GS-2011B-C-1, PI:Barrientos/Menanteau) and 2012A (ObsID:GS-2012A-C-1, PI:Menanteau), prioritizing clusters in the cosmological sample at 0.3

  17. Qualitative mechanism models and the rationalization of procedures

    NASA Technical Reports Server (NTRS)

    Farley, Arthur M.

    1989-01-01

    A qualitative, cluster-based approach to the representation of hydraulic systems is described and its potential for generating and explaining procedures is demonstrated. Many ideas are formalized and implemented as part of an interactive, computer-based system. The system allows for designing, displaying, and reasoning about hydraulic systems. The interactive system has an interface consisting of three windows: a design/control window, a cluster window, and a diagnosis/plan window. A qualitative mechanism model for the ORS (Orbital Refueling System) is presented to coordinate with ongoing research on this system being conducted at NASA Ames Research Center.

  18. Evaluation of large area crop estimation techniques using LANDSAT and ground-derived data. [Missouri

    NASA Technical Reports Server (NTRS)

    Amis, M. L.; Lennington, R. K.; Martin, M. V.; Mcguire, W. G.; Shen, S. S. (Principal Investigator)

    1981-01-01

    The results of the Domestic Crops and Land Cover Classification and Clustering study on large area crop estimation using LANDSAT and ground truth data are reported. The current crop area estimation approach of the Economics and Statistics Service of the U.S. Department of Agriculture was evaluated in terms of the factors that are likely to influence the bias and variance of the estimator. Also, alternative procedures involving replacements for the clustering algorithm, the classifier, or the regression model used in the original U.S. Department of Agriculture procedures were investigated.

  19. An Analysis of Rich Cluster Redshift Survey Data for Large Scale Structure Studies

    NASA Astrophysics Data System (ADS)

    Slinglend, K.; Batuski, D.; Haase, S.; Hill, J.

    1994-12-01

    The results from the COBE satellite show the existence of structure on scales on the order of 10% or more of the horizon scale of the universe. Rich clusters of galaxies from Abell's catalog show evidence of structure on scales of 100 Mpc and may hold the promise of confirming structure on the scale of the COBE result. However, many Abell clusters have zero or only one measured redshift, so present knowledge of their three-dimensional distribution has quite large uncertainties. The shortage of measured redshifts for these clusters may also mask a problem of projection effects corrupting the membership counts for the clusters. Our approach in this effort has been to use the MX multifiber spectrometer on the Steward 2.3m to measure redshifts of at least ten galaxies in each of 80 Abell cluster fields with richness class R >= 1 and mag10 <= 16.8 (estimated z <= 0.12) and zero or one measured redshifts. This work will result in a deeper, more complete (and reliable) sample of positions of rich clusters. Our primary intent for the sample is for two-point correlation and other studies of the large scale structure traced by these clusters in an effort to constrain theoretical models for structure formation. We are also obtaining enough redshifts per cluster so that a much better sample of reliable cluster velocity dispersions will be available for other studies of cluster properties. To date, we have collected such data for 64 clusters, and for most of them, we have seven or more cluster members with redshifts, allowing for reliable velocity dispersion calculations. Velocity histograms and stripe density plots for several interesting cluster fields are presented, along with summary tables of cluster redshift results. Also, with 10 or more redshifts in most of our cluster fields (30' square, just about an `Abell diameter' at z ~ 0.1) we have investigated the extent of projection effects within the Abell catalog in an effort to quantify and understand how these may affect the Abell sample.

  20. THE SWIFT AGN AND CLUSTER SURVEY. II. CLUSTER CONFIRMATION WITH SDSS DATA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griffin, Rhiannon D.; Dai, Xinyu; Kochanek, Christopher S.

    2016-01-15

    We study 203 (of 442) Swift AGN and Cluster Survey extended X-ray sources located in the SDSS DR8 footprint to search for galaxy over-densities in three-dimensional space using SDSS galaxy photometric redshifts and positions near the Swift cluster candidates. We find 104 Swift clusters with a >3σ galaxy over-density. The remaining targets are potentially located at higher redshifts and require deeper optical follow-up observations for confirmation as galaxy clusters. We present a series of cluster properties including the redshift, brightest cluster galaxy (BCG) magnitude, BCG-to-X-ray center offset, optical richness, and X-ray luminosity. We also detect red sequences in ∼85% of the 104 confirmed clusters. The X-ray luminosity and optical richness for the SDSS confirmed Swift clusters are correlated and follow previously established relations. The distribution of the separations between the X-ray centroids and the most likely BCG is also consistent with expectation. We compare the observed redshift distribution of the sample with a theoretical model, and find that our sample is complete for z ≲ 0.3 and is still 80% complete up to z ≃ 0.4, consistent with the SDSS survey depth. These analysis results suggest that our Swift cluster selection algorithm has yielded a statistically well-defined cluster sample for further study of cluster evolution and cosmology. We also match our SDSS confirmed Swift clusters to existing cluster catalogs, and find 42, 23, and 1 matches in optical, X-ray, and Sunyaev–Zel'dovich catalogs, respectively, and so the majority of these clusters are new detections.

  1. Cluster lot quality assurance sampling: effect of increasing the number of clusters on classification precision and operational feasibility.

    PubMed

    Okayasu, Hiromasa; Brown, Alexandra E; Nzioki, Michael M; Gasasira, Alex N; Takane, Marina; Mkanda, Pascal; Wassilak, Steven G F; Sutter, Roland W

    2014-11-01

    To assess the quality of supplementary immunization activities (SIAs), the Global Polio Eradication Initiative (GPEI) has used cluster lot quality assurance sampling (C-LQAS) methods since 2009. However, since the inception of C-LQAS, questions have been raised about the optimal balance between operational feasibility and precision of classification of lots to identify areas with low SIA quality that require corrective programmatic action. To determine if an increased precision in classification would result in differential programmatic decision making, we conducted a pilot evaluation in 4 local government areas (LGAs) in Nigeria with an expanded LQAS sample size of 16 clusters (instead of the standard 6 clusters) of 10 subjects each. The results showed greater heterogeneity between clusters than the assumed standard deviation of 10%, ranging from 12% to 23%. Comparing the distribution of 4-outcome classifications obtained from all possible combinations of 6-cluster subsamples to the observed classification of the 16-cluster sample, we obtained an exact match in classification in 56% to 85% of instances. We concluded that the 6-cluster C-LQAS provides acceptable classification precision for programmatic action. Considering the greater resources required to implement an expanded C-LQAS, the improvement in precision was deemed insufficient to warrant the effort. Published by Oxford University Press on behalf of the Infectious Diseases Society of America 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
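
    The subsample comparison can be illustrated with a short Python sketch; the cluster counts and the pass/intermediate/fail thresholds below are hypothetical placeholders, not the GPEI's actual C-LQAS decision rules (which use a 4-outcome classification).

      from itertools import combinations

      def classify(clusters, n_per_cluster=10, upper=0.90, lower=0.80):
          """Toy 3-outcome classification of SIA quality from vaccinated counts."""
          coverage = sum(clusters) / (n_per_cluster * len(clusters))
          if coverage >= upper:
              return "pass"
          if coverage >= lower:
              return "intermediate"
          return "fail"

      # Hypothetical numbers of vaccinated children (out of 10) in each of 16 clusters.
      observed = [9, 10, 8, 7, 10, 9, 6, 8, 9, 10, 7, 9, 8, 10, 9, 7]
      full_call = classify(observed)

      subsamples = list(combinations(observed, 6))
      matches = sum(classify(list(sub)) == full_call for sub in subsamples)
      print(f"16-cluster classification: {full_call}")
      print(f"exact match in {matches / len(subsamples):.0%} of 6-cluster subsamples")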

  2. Clustering on very small scales from a large sample of confirmed quasar pairs: does quasar clustering track from Mpc to kpc scales?

    NASA Astrophysics Data System (ADS)

    Eftekharzadeh, S.; Myers, A. D.; Hennawi, J. F.; Djorgovski, S. G.; Richards, G. T.; Mahabal, A. A.; Graham, M. J.

    2017-06-01

    We present the most precise estimate to date of the clustering of quasars on very small scales, based on a sample of 47 binary quasars with magnitudes of g < 20.85 and proper transverse separations of ~25 h⁻¹ kpc. Our sample of binary quasars, which is about six times larger than any previous spectroscopically confirmed sample on these scales, is targeted using a kernel density estimation (KDE) technique applied to Sloan Digital Sky Survey (SDSS) imaging over most of the SDSS area. Our sample is 'complete' in that all of the KDE target pairs with 17.0 ≲ R ≲ 36.2 h⁻¹ kpc in our area of interest have been spectroscopically confirmed from a combination of previous surveys and our own long-slit observational campaign. We catalogue 230 candidate quasar pairs with angular separations of <8 arcsec, from which our binary quasars were identified. We determine the projected correlation function of quasars (W̄p) in four bins of proper transverse scale over the range 17.0 ≲ R ≲ 36.2 h⁻¹ kpc. The implied small-scale quasar clustering amplitude from the projected correlation function, integrated across our entire redshift range, is A = 24.1 ± 3.6 at ~26.6 h⁻¹ kpc. Our sample is the first spectroscopically confirmed sample of quasar pairs that is sufficiently large to study how quasar clustering evolves with redshift at ~25 h⁻¹ kpc. We find that empirical descriptions of how quasar clustering evolves with redshift at ~25 h⁻¹ Mpc also adequately describe the evolution of quasar clustering at ~25 h⁻¹ kpc.

  3. Online clustering algorithms for radar emitter classification.

    PubMed

    Liu, Jun; Lee, Jim P Y; Senior; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max

    2005-08-01

    Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.

  4. Estimating regression coefficients from clustered samples: Sampling errors and optimum sample allocation

    NASA Technical Reports Server (NTRS)

    Kalton, G.

    1983-01-01

    A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design, which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratios of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
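
    For orientation, two standard two-stage survey-sampling results (stated here in generic notation, not necessarily in the report's exact form) are the variance inflation due to clustering and the cost-optimal subsample size under a simple linear cost model C = c_1 a + c_2 a b, where a is the number of clusters and b the number of interviews per cluster:

      \[
        \mathrm{deff} = 1 + (b - 1)\,\rho ,
        \qquad
        b_{\mathrm{opt}} = \sqrt{\frac{c_1}{c_2}\cdot\frac{1-\rho}{\rho}} ,
      \]

    where ρ is the intra-cluster correlation of the quantity being estimated; the standard error of an estimate from the clustered sample is then approximately √deff times its simple-random-sampling standard error.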

  5. Magnetic signature of overbank sediment in industry impacted floodplains identified by data mining methods

    NASA Astrophysics Data System (ADS)

    Chudaničová, Monika; Hutchinson, Simon M.

    2016-11-01

    Our study attempts to identify a characteristic magnetic signature of overbank sediments exhibiting anthropogenically induced magnetic enhancement and thereby to distinguish them from unenhanced sediments with weak magnetic background values, using a novel approach based on data mining methods, thus providing a means of rapid pollution determination. Data were obtained from 539 bulk samples from vertical profiles through overbank sediment, collected on seven rivers in the eastern Czech Republic and three rivers in northwest England. k-Means clustering and hierarchical clustering methods, paired group (UPGMA) and Ward's method, were used to divide the samples into natural groups according to their attributes. The interparametric ratios SIRM/χ, SIRM/ARM and S-0.1T were chosen as attributes for the analyses, making the resultant model more widely applicable, as magnetic concentration values can differ by two orders of magnitude. Division into three clusters appeared to be optimal and corresponded to inherent clusters in the data scatter. Clustering managed to separate samples with relatively weak anthropogenically induced enhancement, relatively strong anthropogenically induced enhancement and samples lacking enhancement. To describe the clusters explicitly and thus obtain a discrete magnetic signature, classification rules (JRip method) and decision trees (J4.8 and Simple Cart methods) were used. Samples lacking anthropogenic enhancement typically exhibited an S-0.1T < c. 0.5, SIRM/ARM < c. 150 and SIRM/χ < c. 6000 A m⁻¹. Samples with magnetic enhancement all exhibited an S-0.1T > 0.5. Samples with relatively stronger anthropogenic enhancement were unequivocally distinguished from the samples with weaker enhancement by an SIRM/ARM > c. 150. Samples with SIRM/ARM in the range c. 126-150 were classified as relatively strongly enhanced when their SIRM/χ > 18 000 A m⁻¹ and relatively less enhanced when their SIRM/χ < 18 000 A m⁻¹. An additional rule was arbitrarily added to exclude samples with χfd% > 6 per cent from the anthropogenically enhanced clusters, as samples with natural magnetic enhancement. The characteristics of the clusters resulted mainly from the relationship between SIRM/ARM and S-0.1T, and SIRM/χ and S-0.1T. Both SIRM/ARM and SIRM/χ increase with increasing S-0.1T values, reflecting a greater level of anthropogenic magnetic particles. Overall, data mining methods demonstrated good potential for utilization in environmental magnetism.
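
    The quoted cut-offs translate directly into a small rule-based classifier; the Python sketch below hard-codes the approximate thresholds from the abstract and is only an illustration of the rules, not the fitted JRip/decision-tree models themselves.

      def classify_sample(s_ratio, sirm_arm, sirm_chi, chi_fd_pct):
          """Assign an overbank-sediment sample to a magnetic group using the
          approximate 'c.' thresholds quoted in the abstract."""
          if chi_fd_pct > 6.0:
              return "natural magnetic enhancement (excluded)"
          if s_ratio < 0.5:
              return "no anthropogenic enhancement"
          if sirm_arm > 150 or (126 <= sirm_arm <= 150 and sirm_chi > 18000):
              return "relatively strong anthropogenic enhancement"
          return "relatively weak anthropogenic enhancement"

      print(classify_sample(s_ratio=0.8, sirm_arm=200, sirm_chi=25000, chi_fd_pct=3.0))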

  6. Quantifying the impact of fixed effects modeling of clusters in multiple imputation for cluster randomized trials

    PubMed Central

    Andridge, Rebecca R.

    2011-01-01

    In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller ICCs lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random (MCAR), and cases in which data are missing at random (MAR) are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared. PMID:21259309

  7. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    PubMed

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.
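
    The framework itself combines dominant-set and spectral clustering hierarchically; as a hedged, small-scale illustration of the spectral stage only (not the authors' implementation), scikit-learn's SpectralClustering can partition an unlabelled sample:

      import numpy as np
      from sklearn.cluster import SpectralClustering

      rng = np.random.default_rng(2)
      # three well-separated synthetic groups standing in for unlabelled samples
      X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0.0, 3.0, 6.0)])

      labels = SpectralClustering(n_clusters=3, affinity="nearest_neighbors",
                                  n_neighbors=10, random_state=0).fit_predict(X)
      print(np.bincount(labels))   # cluster sizes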

  8. Self-similarity of temperature profiles in distant galaxy clusters: the quest for a universal law

    NASA Astrophysics Data System (ADS)

    Baldi, A.; Ettori, S.; Molendi, S.; Gastaldello, F.

    2012-09-01

    Context. We present the XMM-Newton temperature profiles of 12 bright (L_X > 4 × 10⁴⁴ erg s⁻¹) clusters of galaxies at 0.4 < z < 0.9, having an average temperature in the range 5 ≲ kT ≲ 11 keV. Aims: The main goal of this paper is to study for the first time the temperature profiles of a sample of high-redshift clusters, to investigate their properties, and to define a universal law to describe the temperature radial profiles in galaxy clusters as a function of both cosmic time and their state of relaxation. Methods: We performed a spatially resolved spectral analysis, using Cash statistics, to measure the temperature in the intracluster medium at different radii. Results: We extracted temperature profiles for the clusters in our sample, finding that all profiles are declining toward larger radii. The normalized temperature profiles (normalized by the mean temperature T500) are found to be generally self-similar. The sample was subdivided into five cool-core (CC) and seven non cool-core (NCC) clusters by introducing a pseudo-entropy ratio σ = (T_IN/T_OUT) × (EM_IN/EM_OUT)^(-1/3) and defining the objects with σ < 0.6 as CC clusters and those with σ ≥ 0.6 as NCC clusters. The profiles of CC and NCC clusters differ mainly in the central regions, with the latter exhibiting a slightly flatter central profile. A significant dependence of the temperature profiles on the pseudo-entropy ratio σ is detected by fitting a function of r and σ, showing an indication that the outer part of the profiles becomes steeper for higher values of σ (i.e. transitioning toward the NCC clusters). No significant evidence of redshift evolution could be found within the redshift range sampled by our clusters (0.4 < z < 0.9). A comparison of our high-z sample with intermediate clusters at 0.1 < z < 0.3 showed how the CC and NCC cluster temperature profiles have experienced some sort of evolution. This can happen because higher z clusters are at a less advanced stage of their formation and did not have enough time to create a relaxed structure, which is characterized by a central temperature dip in CC clusters and by flatter profiles in NCC clusters. Conclusions: This is the first time that a systematic study of the temperature profiles of galaxy clusters at z > 0.4 has been attempted. We were able to define the closest possible relation to a universal law for the temperature profiles of galaxy clusters at 0.1 < z < 0.9, showing a dependence on both the relaxation state of the clusters and the redshift. Appendix A is only available in electronic form at http://www.aanda.org
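
    The CC/NCC split reduces to a one-line computation; the sketch below uses the σ definition quoted above with hypothetical input values.

      def pseudo_entropy_ratio(t_in, t_out, em_in, em_out):
          """sigma = (T_IN / T_OUT) * (EM_IN / EM_OUT)**(-1/3)."""
          return (t_in / t_out) * (em_in / em_out) ** (-1.0 / 3.0)

      # hypothetical inner/outer temperatures (keV) and emission measures
      sigma = pseudo_entropy_ratio(t_in=4.2, t_out=6.0, em_in=8.0e-3, em_out=1.0e-3)
      label = "cool-core (CC)" if sigma < 0.6 else "non cool-core (NCC)"
      print(f"sigma = {sigma:.2f} -> {label}")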

  9. U.S. consumer demand for restaurant calorie information: targeting demographic and behavioral segments in labeling initiatives.

    PubMed

    Kolodinsky, Jane; Reynolds, Travis William; Cannella, Mark; Timmons, David; Bromberg, Daniel

    2009-01-01

    To identify different segments of U.S. consumers based on food choices, exercise patterns, and desire for restaurant calorie labeling. Using a stratified (by region) random sample of the U.S. population, trained interviewers collected data for this cross-sectional study through telephone surveys. Center for Rural Studies U.S. national health survey. The final sample included 580 responses (22% response rate); data were weighted to be representative of age and gender characteristics of the U.S. population. Self-reported behaviors related to food choices, exercise patterns, desire for calorie information in restaurants, and sample demographics. Clusters were identified using the Schwarz Bayesian criterion. Impacts of demographic characteristics on cluster membership were analyzed using bivariate tests of association and multinomial logit regression. Cluster analysis revealed three clusters based on respondents' food choices, activity levels, and desire for restaurant labeling. Two clusters, comprising three quarters of the sample, desired calorie labeling in restaurants. The remaining cluster opposed restaurant labeling. Demographic variables significantly predicting cluster membership included region of residence (p < .10), income (p < .05), gender (p < .01), and age (p < .10). Though limited by a low response rate and potential self-reporting bias in the phone survey, this study suggests that several groups are likely to benefit from restaurant calorie labeling. Specific demographic clusters could be targeted through labeling initiatives.

  10. Social Network Clustering and the Spread of HIV/AIDS Among Persons Who Inject Drugs in 2 Cities in the Philippines.

    PubMed

    Verdery, Ashton M; Siripong, Nalyn; Pence, Brian W

    2017-09-01

    The Philippines has seen rapid increases in HIV prevalence among people who inject drugs. We study 2 neighboring cities where a linked HIV epidemic differed in timing of onset and levels of prevalence. In Cebu, prevalence rose rapidly from below 1% to 54% between 2009 and 2011 and remained high through 2013. In nearby Mandaue, HIV remained below 4% through 2011 then rose rapidly to 38% by 2013. We hypothesize that infection prevalence differences in these cities may owe to aspects of social network structure, specifically levels of network clustering. Building on previous research, we hypothesize that higher levels of network clustering are associated with greater epidemic potential. Data were collected with respondent-driven sampling among men who inject drugs in Cebu and Mandaue in 2013. We first examine sample composition using estimators for population means. We then apply new estimators of network clustering in respondent-driven sampling data to examine associations with HIV prevalence. Samples in both cities were comparable in composition by age, education, and injection locations. Dyadic needle-sharing levels were also similar between the 2 cities, but network clustering in the needle-sharing network differed dramatically. We found higher clustering in Cebu than Mandaue, consistent with expectations that higher clustering is associated with faster epidemic spread. This article is the first to apply estimators of network clustering to empirical respondent-driven samples, and it offers suggestive evidence that researchers should pay greater attention to network structure's role in HIV transmission dynamics.

  11. A clustering algorithm for sample data based on environmental pollution characteristics

    NASA Astrophysics Data System (ADS)

    Chen, Mei; Wang, Pengfei; Chen, Qiang; Wu, Jiadong; Chen, Xiaoyun

    2015-04-01

    Environmental pollution has become an issue of serious international concern in recent years. Among the receptor-oriented pollution models, CMB, PMF, UNMIX, and PCA are widely used as source apportionment models. To improve the accuracy of source apportionment and classify the sample data for these models, this study proposes an easy-to-use, high-dimensional EPC algorithm that not only organizes all of the sample data into different groups according to the similarities in pollution characteristics such as pollution sources and concentrations but also simultaneously detects outliers. The main clustering process consists of selecting the first unlabelled point as the cluster centre, then assigning each data point in the sample dataset to its most similar cluster centre according to both the user-defined threshold and the value of similarity function in each iteration, and finally modifying the clusters using a method similar to k-Means. The validity and accuracy of the algorithm are tested using both real and synthetic datasets, which makes the EPC algorithm practical and effective for appropriately classifying sample data for source apportionment models and helpful for better understanding and interpreting the sources of pollution.
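
    A rough Python rendering of the clustering loop as described (the first unlabelled point becomes a centre, points above a similarity threshold join it, and centres are then refined k-means style) is sketched below; the similarity function and threshold are illustrative choices rather than the paper's exact EPC definitions.

      import numpy as np

      def epc_like(X, threshold=0.5, refine_iters=5):
          """Toy threshold-based clustering with outlier detection and
          k-means-style centre refinement."""
          X = np.asarray(X, dtype=float)
          labels = np.full(len(X), -1)          # -1 = unlabelled / outlier
          centres = []
          for i in range(len(X)):
              if labels[i] != -1:
                  continue
              centre = X[i]
              sim = 1.0 / (1.0 + np.linalg.norm(X - centre, axis=1))  # toy similarity
              members = (sim >= threshold) & (labels == -1)
              if members.sum() > 1:              # lone points remain outliers
                  labels[members] = len(centres)
                  centres.append(X[members].mean(axis=0))
          for _ in range(refine_iters):          # k-means-style centre update
              for k in range(len(centres)):
                  pts = X[labels == k]
                  if len(pts):
                      centres[k] = pts.mean(axis=0)
          return labels, np.array(centres)

      labels, centres = epc_like(np.random.default_rng(0).normal(size=(50, 3)))
      print("clusters:", len(centres), " outliers:", int((labels == -1).sum()))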

  12. Cosmology with XMM galaxy clusters: the X-CLASS/GROND catalogue and photometric redshifts

    NASA Astrophysics Data System (ADS)

    Ridl, J.; Clerc, N.; Sadibekova, T.; Faccioli, L.; Pacaud, F.; Greiner, J.; Krühler, T.; Rau, A.; Salvato, M.; Menzel, M.-L.; Steinle, H.; Wiseman, P.; Nandra, K.; Sanders, J.

    2017-06-01

    The XMM Cluster Archive Super Survey (X-CLASS) is a serendipitously detected X-ray-selected sample of 845 galaxy clusters based on 2774 XMM archival observations and covering approximately 90 deg² spread across the high-Galactic latitude (|b| > 20°) sky. The primary goal of this survey is to produce a well-selected sample of galaxy clusters on which cosmological analyses can be performed. This paper presents the photometric redshift follow-up of a high signal-to-noise ratio subset of 265 of these clusters with declination δ < +20° with Gamma-Ray Burst Optical and Near-Infrared Detector (GROND), a 7-channel (grizJHK) simultaneous imager on the MPG 2.2-m telescope at the ESO La Silla Observatory. We use a newly developed technique based on the red sequence colour-redshift relation, enhanced with information coming from the X-ray detection to provide photometric redshifts for this sample. We determine photometric redshifts for 232 clusters, finding a median redshift of z = 0.39 with an accuracy of Δz = 0.02(1 + z) when compared to a sample of 76 spectroscopically confirmed clusters. We also compute X-ray luminosities for the entire sample and find a median bolometric luminosity of 7.2 × 10⁴³ erg s⁻¹ and a median temperature of 2.9 keV. We compare our results to those of the XMM-XCS and XMM-XXL surveys, finding good agreement in both samples. The X-CLASS catalogue is available online at http://xmm-lss.in2p3.fr:8080/l4sdb/.

  13. Changes in cluster magnetism and suppression of local superconductivity in amorphous FeCrB alloy irradiated by Ar+ ions

    NASA Astrophysics Data System (ADS)

    Okunev, V. D.; Samoilenko, Z. A.; Szymczak, H.; Szewczyk, A.; Szymczak, R.; Lewandowski, S. J.; Aleshkevych, P.; Malinowski, A.; Gierłowski, P.; Więckowski, J.; Wolny-Marszałek, M.; Jeżabek, M.; Varyukhin, V. N.; Antoshina, I. A.

    2016-02-01

    We show that cluster magnetism in the ferromagnetic amorphous Fe67Cr18B15 alloy is related to the presence of large (D = 150-250 Å) α-(Fe, Cr) clusters responsible for the basic changes in cluster magnetism, small (D = 30-100 Å) α-(Fe, Cr) and Fe3B clusters, and subcluster atomic α-(Fe, Cr, B) groupings (D = 10-20 Å) in the disordered intercluster medium. For the initial sample and the irradiated one (Φ = 1.5×10¹⁸ ions/cm²), superconductivity exists in the cluster shells of the metallic α-(Fe, Cr) phase, where the ferromagnetism of iron is counterbalanced by the antiferromagnetism of chromium. At Φ = 3×10¹⁸ ions/cm², the internal stresses intensify and the process of iron and chromium phase separation, favorable for mesoscopic superconductivity, changes to the inverse one, promoting a more homogeneous distribution of iron and chromium in the clusters as well as a large (roughly twofold) increase in the density of the samples. As a result, ferromagnetism is restored in the cluster shells, leading to an increase in the magnetization of the sample and suppression of local superconductivity. For the initial samples, the temperature dependence of the resistivity, ρ(T) ∝ T², is determined by electron scattering on quantum defects. In strongly inhomogeneous samples, after irradiation with a fluence of Φ = 1.5×10¹⁸ ions/cm², the transition to a ρ(T) ∝ T^(1/2) dependence is caused by weak-localization effects. In more homogeneous samples, at Φ = 3×10¹⁸ ions/cm², a return to the ρ(T) ∝ T² dependence is observed.

  14. [Indicators and factors of influence on the long-term follow-up of psychogenic diseases--a comparison of extreme groups].

    PubMed

    Franz, M; Schellberg, D; Schepank, H

    1995-02-01

    The present investigation aimed at identifying possible indicators of course, predictors, and etiologically relevant factors of psychogenic diseases. According to their complaints, a sample of probands suffering from psychogenic impairment of medium degree (n = 240) was chosen from a representative sample of an urban adult population (n = 528). This procedure was intended to ensure a relatively high intraindividual variance in the course criterion, since sufficient variability of course seems improbable among chronically and severely psychogenically impaired or stably healthy probands. Within 10 years the sample was investigated three times by psychodynamically trained physicians and psychologists. By means of cluster analysis the sample was subdivided into different types of course of psychogenic impairment. Both extreme types of course, i.e. the probands who showed the most positive and the most negative spontaneous long-term course, were investigated univariately and by means of a multivariate discriminant analysis with regard to potentially course-determining variables. It became obvious that personality variables and conditions of early childhood considerably influenced the spontaneous long-term course of psychogenic impairment.

  15. Fast structure similarity searches among protein models: efficient clustering of protein fragments

    PubMed Central

    2012-01-01

    Background: For many predictive applications a large number of models is generated and later clustered in subsets based on structure similarity. In most clustering algorithms an all-vs-all root mean square deviation (RMSD) comparison is performed. Most of the time is typically spent on comparison of non-similar structures. For sets with more than, say, 10,000 models this procedure is very time-consuming, and alternative, faster algorithms restricting comparisons to the most similar structures would be useful. Results: We exploit the inverse triangle inequality on the RMSD between two structures given their RMSDs with a third structure. The lower bound on RMSD may be used, when restricting the search of similarity to a reasonably low RMSD threshold value, to speed up similarity searches significantly. Tests are performed on large sets of decoys which are widely used as test cases for predictive methods, with a speed-up of up to 100 times with respect to all-vs-all comparison, depending on the set and parameters used. Sample applications are shown. Conclusions: The algorithm presented here allows fast comparison of large data sets of structures with limited memory requirements. As an example of application we present the clustering of more than 100,000 fragments of length 5 from the top500H dataset into a few hundred representative fragments. A more realistic scenario is provided by the search for similarity within the very large decoy sets used for the tests. Other applications include filtering nearly-identical conformations in selected CASP9 datasets and clustering molecular dynamics snapshots. Availability: A linux executable and a Perl script with examples are given in the supplementary material (Additional file 1). The source code is available upon request from the authors. PMID:22642815
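
    In symbols (a restatement of the pruning idea, assuming RMSD behaves as a metric as exploited above): for structures A and B and a reference structure C,

      \[
        \mathrm{RMSD}(A,B) \;\ge\; \bigl|\,\mathrm{RMSD}(A,C) - \mathrm{RMSD}(B,C)\,\bigr| ,
      \]

    so whenever the right-hand side already exceeds the chosen similarity threshold, the expensive superposition of A and B can be skipped without changing the clustering result.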

  16. Complex regional pain syndrome: evidence for warm and cold subtypes in a large prospective clinical sample.

    PubMed

    Bruehl, Stephen; Maihöfner, Christian; Stanton-Hicks, Michael; Perez, Roberto S G M; Vatine, Jean-Jacques; Brunner, Florian; Birklein, Frank; Schlereth, Tanja; Mackey, Sean; Mailis-Gagnon, Angela; Livshitz, Anatoly; Harden, R Norman

    2016-08-01

    Limited research suggests that there may be Warm complex regional pain syndrome (CRPS) and Cold CRPS subtypes, with inflammatory mechanisms contributing most strongly to the former. This study for the first time used an unbiased statistical pattern recognition technique to evaluate whether distinct Warm vs Cold CRPS subtypes can be discerned in the clinical population. An international, multisite study was conducted using standardized procedures to evaluate signs and symptoms in 152 patients with clinical CRPS at baseline, with 3-month follow-up evaluations in 112 of these patients. Two-step cluster analysis using automated cluster selection identified a 2-cluster solution as optimal. Results revealed a Warm CRPS patient cluster characterized by a warm, red, edematous, and sweaty extremity and a Cold CRPS patient cluster characterized by a cold, blue, and less edematous extremity. Median pain duration was significantly (P < 0.001) shorter in the Warm CRPS (4.7 months) than in the Cold CRPS subtype (20 months), with pain intensity comparable. A derived total inflammatory score was significantly (P < 0.001) elevated in the Warm CRPS group (compared with Cold CRPS) at baseline but diminished significantly (P < 0.001) over the follow-up period, whereas this score did not diminish in the Cold CRPS group (time × subtype interaction: P < 0.001). Results support the existence of a Warm CRPS subtype common in patients with acute (<6 months) CRPS and a relatively distinct Cold CRPS subtype most common in chronic CRPS. The pattern of clinical features suggests that inflammatory mechanisms contribute most prominently to the Warm CRPS subtype but that these mechanisms diminish substantially during the first year postinjury.

  17. Density-based clustering of small peptide conformations sampled from a molecular dynamics simulation.

    PubMed

    Kim, Minkyoung; Choi, Seung-Hoon; Kim, Junhyoung; Choi, Kihang; Shin, Jae-Min; Kang, Sang-Kee; Choi, Yun-Jaie; Jung, Dong Hyun

    2009-11-01

    This study describes the application of a density-based algorithm to clustering small peptide conformations after a molecular dynamics simulation. We propose a clustering method for small peptide conformations that enables adjacent clusters to be separated more clearly on the basis of neighbor density. Neighbor density means the number of neighboring conformations, so if a conformation has too few neighboring conformations, then it is considered as noise or an outlier and is excluded from the list of cluster members. With this approach, we can easily identify clusters in which the members are densely crowded in the conformational space, and we can safely avoid misclustering individual clusters linked by noise or outliers. Consideration of neighbor density significantly improves the efficiency of clustering of small peptide conformations sampled from molecular dynamics simulations and can be used for predicting peptide structures.
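
    The neighbour-density criterion is close in spirit to DBSCAN, so a hedged illustration with scikit-learn's DBSCAN (not the authors' own code) is given below: conformations with too few neighbours within eps are labelled -1 and treated as noise or outliers.

      import numpy as np
      from sklearn.cluster import DBSCAN

      rng = np.random.default_rng(1)
      # stand-in features for conformations (e.g. backbone dihedrals): two dense
      # groups plus a handful of scattered "noise" points
      features = np.vstack([rng.normal(0.0, 0.2, size=(40, 4)),
                            rng.normal(2.0, 0.2, size=(40, 4)),
                            rng.uniform(-3, 5, size=(10, 4))])

      labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(features)
      print("clusters:", sorted(set(labels) - {-1}), " noise points:", int((labels == -1).sum()))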

  18. A Weight-Adaptive Laplacian Embedding for Graph-Based Clustering.

    PubMed

    Cheng, De; Nie, Feiping; Sun, Jiande; Gong, Yihong

    2017-07-01

    Graph-based clustering methods perform clustering on a fixed input data graph. Thus such clustering results are sensitive to the particular graph construction. If this initial construction is of low quality, the resulting clustering may also be of low quality. We address this drawback by allowing the data graph itself to be adaptively adjusted in the clustering procedure. In particular, our proposed weight adaptive Laplacian (WAL) method learns a new data similarity matrix that can adaptively adjust the initial graph according to the similarity weight in the input data graph. We develop three versions of these methods based on the L2-norm, fuzzy entropy regularizer, and another exponential-based weight strategy, that yield three new graph-based clustering objectives. We derive optimization algorithms to solve these objectives. Experimental results on synthetic data sets and real-world benchmark data sets exhibit the effectiveness of these new graph-based clustering methods.

  19. Regression analysis of clustered failure time data with informative cluster size under the additive transformation models.

    PubMed

    Chen, Ling; Feng, Yanqin; Sun, Jianguo

    2017-10-01

    This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.

  20. Accounting for twin births in sample size calculations for randomised trials.

    PubMed

    Yelland, Lisa N; Sullivan, Thomas R; Collins, Carmel T; Price, David J; McPhee, Andrew J; Lee, Katherine J

    2018-05-04

    Including twins in randomised trials leads to non-independence or clustering in the data. Clustering has important implications for sample size calculations, yet few trials take this into account. Estimates of the intracluster correlation coefficient (ICC), or the correlation between outcomes of twins, are needed to assist with sample size planning. Our aims were to provide ICC estimates for infant outcomes, describe the information that must be specified in order to account for clustering due to twins in sample size calculations, and develop a simple tool for performing sample size calculations for trials including twins. ICCs were estimated for infant outcomes collected in four randomised trials that included twins. The information required to account for clustering due to twins in sample size calculations is described. A tool that calculates the sample size based on this information was developed in Microsoft Excel and in R as a Shiny web app. ICC estimates ranged between -0.12, indicating a weak negative relationship, and 0.98, indicating a strong positive relationship between outcomes of twins. Example calculations illustrate how the ICC estimates and sample size calculator can be used to determine the target sample size for trials including twins. Clustering among outcomes measured on twins should be taken into account in sample size calculations to obtain the desired power. Our ICC estimates and sample size calculator will be useful for designing future trials that include twins. Publication of additional ICCs is needed to further assist with sample size planning for future trials. © 2018 John Wiley & Sons Ltd.

  1. Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials.

    PubMed

    Kasza, J; Hemming, K; Hooper, R; Matthews, Jns; Forbes, A B

    2017-01-01

    Stepped wedge and cluster randomised crossover trials are examples of cluster randomised designs conducted over multiple time periods that are being used with increasing frequency in health research. Recent systematic reviews of both of these designs indicate that the within-cluster correlation is typically taken account of in the analysis of data using a random intercept mixed model, implying a constant correlation between any two individuals in the same cluster no matter how far apart in time they are measured: within-period and between-period intra-cluster correlations are assumed to be identical. Recently proposed extensions allow the within- and between-period intra-cluster correlations to differ, although these methods require that all between-period intra-cluster correlations are identical, which may not be appropriate in all situations. Motivated by a proposed intensive care cluster randomised trial, we propose an alternative correlation structure for repeated cross-sectional multiple-period cluster randomised trials in which the between-period intra-cluster correlation is allowed to decay depending on the distance between measurements. We present results for the variance of treatment effect estimators for varying amounts of decay, investigating the consequences of the variation in decay on sample size planning for stepped wedge, cluster crossover and multiple-period parallel-arm cluster randomised trials. We also investigate the impact of assuming constant between-period intra-cluster correlations instead of decaying between-period intra-cluster correlations. Our results indicate that in certain design configurations, including the one corresponding to the proposed trial, a correlation decay can have an important impact on variances of treatment effect estimators, and hence on sample size and power. An R Shiny app allows readers to interactively explore the impact of correlation decay.
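
    In generic notation (a hedged sketch of the structure described, not necessarily the paper's exact parametrization), the correlation between two measurements from the same cluster taken in periods s and t can be written as

      \[
        \mathrm{corr} =
        \begin{cases}
          \rho, & s = t \quad \text{(within-period intra-cluster correlation)},\\
          \rho\, r^{|s-t|}, & s \neq t,
        \end{cases}
        \qquad 0 < r \le 1 ,
      \]

    so that r = 1 recovers the usual constant between-period intra-cluster correlation, while smaller r makes the correlation decay with the number of periods separating the measurements.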

  2. A Systematic Approach for Determining Vertical Pile Depth of Embedment in Cohensionless Soils to Withstand Lateral Barge Train Impact Loads

    DTIC Science & Technology

    2017-01-30

    Previous research (Ebeling et al. 2012) has developed simplified analysis procedures for flexible approach wall systems founded on clustered groups of vertical piles. Dynamic structural time-history response analysis of flexible approach walls founded on clustered pile groups is performed using Impact_Deck (in preparation, ERDC/ITL TR-16-X, Vicksburg, MS).

  3. Effect of W self-implantation and He plasma exposure on early-stage defect and bubble formation in tungsten

    NASA Astrophysics Data System (ADS)

    Thompson, M.; Drummond, D.; Sullivan, J.; Elliman, R.; Kluth, P.; Kirby, N.; Riley, D.; Corr, C. S.

    2018-06-01

    To determine the effect of pre-existing defects on helium-vacancy cluster nucleation and growth, tungsten samples were self-implanted with 1 MeV tungsten ions at varying fluences to induce radiation damage, then subsequently exposed to helium plasma in the MAGPIE linear plasma device. Positron annihilation lifetime spectroscopy was performed both immediately after self-implantation, and again after plasma exposure. After self-implantation, vacancy clusters were not observed near the sample surface (<30 nm). At greater depths (30–150 nm) vacancy clusters formed, and were found to increase in size with increasing W-ion fluence. After helium plasma exposure in the MAGPIE linear plasma device at ~300 K with a fluence of 10²³ He m⁻², deep (30–150 nm) vacancy clusters showed similar positron lifetimes, while shallow (<30 nm) clusters were not observed. The intensity of positron lifetime signals fell for most samples after plasma exposure, indicating that defects were filling with helium. The absence of shallow clusters indicates that helium requires pre-existing defects in order to drive vacancy cluster growth at 300 K. Further samples that had not been pre-damaged with W-ions were also exposed to helium plasma in MAGPIE across fluences from 1 × 10²² to 1.2 × 10²⁴ He m⁻². Samples exposed to fluences up to 1 × 10²³ He m⁻² showed no signs of damage. Fluences of 5 × 10²³ He m⁻² and higher showed significant helium-cluster formation within the first 30 nm, with positron lifetimes in the vicinity of 0.5–0.6 ns. The sample temperature was significantly higher for these higher-fluence exposures (~400 K) due to plasma heating. This higher temperature likely enhanced bubble formation by significantly increasing the rate at which interstitial helium clusters generate vacancies, which we suspect is the rate-limiting step for helium-vacancy cluster/bubble nucleation in the absence of pre-existing defects.

  4. Parallel Clustering Algorithm for Large-Scale Biological Data Sets

    PubMed Central

    Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

    2014-01-01

    Background: The recent explosion of biological data brings a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods: Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. A memory-shared architecture is used to construct the similarity matrix, and a distributed system is used for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate scheme of data partition and reduction is designed in our method in order to minimize the global communication cost among processes. Results: A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246
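
    The parallel architectures themselves are not reproduced here; as a small-scale, hedged reference point, scikit-learn's (serial) affinity propagation clusters a toy data set from a precomputed similarity matrix.

      import numpy as np
      from sklearn.cluster import AffinityPropagation

      rng = np.random.default_rng(3)
      # two synthetic groups standing in for expression profiles or sequences
      X = np.vstack([rng.normal(0, 0.5, size=(30, 5)), rng.normal(4, 0.5, size=(30, 5))])

      # negative squared Euclidean distance is the usual similarity for this algorithm
      S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
      labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S)
      print("number of exemplars found:", len(set(labels)))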

  5. Formation of metallic clusters in oxide insulators by means of ion beam mixing

    NASA Astrophysics Data System (ADS)

    Talut, G.; Potzger, K.; Mücklich, A.; Zhou, Shengqiang

    2008-04-01

    The intermixing and near-interface cluster formation of Pt and FePt thin films deposited on different oxide surfaces by means of Pt+ ion irradiation and subsequent annealing was investigated. Irradiated as well as postannealed samples were investigated using high resolution transmission electron microscopy. In MgO and Y:ZrO2 covered with Pt, crystalline clusters with mean sizes of 2 and 3.5 nm were found after the Pt+ irradiations with 8×10¹⁵ and 2×10¹⁶ cm⁻² and subsequent annealing, respectively. In MgO samples covered with FePt, clusters with mean sizes of 1 and 2 nm were found after the Pt+ irradiations with 8×10¹⁵ and 2×10¹⁶ cm⁻² and subsequent annealing, respectively. In Y:ZrO2 samples covered with FePt, clusters up to 5 nm in size were found after the Pt+ irradiation with 2×10¹⁶ cm⁻² and subsequent annealing. In LaAlO3 the irradiation was accompanied by a full amorphization of the host matrix and appearance of embedded clusters of different sizes. The determination of the lattice constant and thus the kind of the clusters in samples covered by FePt was hindered due to strong deviation of the electron beam by the ferromagnetic FePt.

  6. the-wizz: clustering redshift estimation for everyone

    NASA Astrophysics Data System (ADS)

    Morrison, C. B.; Hildebrandt, H.; Schmidt, S. J.; Baldry, I. K.; Bilicki, M.; Choi, A.; Erben, T.; Schneider, P.

    2017-05-01

    We present the-wizz, an open-source and user-friendly software package for estimating the redshift distributions of photometric galaxies with unknown redshifts by spatially cross-correlating them against a reference sample with known redshifts. The main benefit of the-wizz is in separating the angular pair finding and correlation estimation from the computation of the output clustering redshifts, allowing anyone to create a clustering redshift for their sample without the intervention of an 'expert'. It allows the end user of a given survey to select any subsample of photometric galaxies with unknown redshifts, match this sample's catalogue indices into a value-added data file and produce a clustering redshift estimate for this sample in a fraction of the time it would take to run all the angular correlations needed to produce a clustering redshift. We show results with this software using photometric data from the Kilo-Degree Survey (KiDS) and spectroscopic redshifts from the Galaxy and Mass Assembly survey and the Sloan Digital Sky Survey. The results we present for KiDS are consistent with the redshift distributions used in a recent cosmic shear analysis from the survey. We also present results using a hybrid machine learning-clustering redshift analysis that enables the estimation of clustering redshifts for individual galaxies. the-wizz can be downloaded at http://github.com/morriscb/The-wiZZ/.

  7. Terahertz imaging applied to cancer diagnosis.

    PubMed

    Brun, M-A; Formanek, F; Yasuda, A; Sekine, M; Ando, N; Eishii, Y

    2010-08-21

    We report on terahertz (THz) time-domain spectroscopy imaging of 10 μm thick histological sections. The sections are prepared according to standard pathological procedures and deposited on a quartz window for measurements in reflection geometry. Simultaneous acquisition of visible images enables registration of THz images and thus the use of digital pathology tools to investigate the links between the underlying cellular structure and specific THz information. An analytic model taking into account the polarization of the THz beam, its incidence angle, the beam shift between the reference and sample pulses as well as multiple reflections within the sample is employed to determine the frequency-dependent complex refractive index. Spectral images are produced through segmentation of the extracted refractive index data using clustering methods. Comparisons of visible and THz images demonstrate spectral differences not only between tumor and healthy tissues but also within tumors. Further visualization using principal component analysis suggests different mechanisms as to the origin of image contrast.

  8. Multivariate Cluster Analysis.

    ERIC Educational Resources Information Center

    McRae, Douglas J.

    Procedures for grouping students into homogeneous subsets have long interested educational researchers. The research reported in this paper is an investigation of a set of objective grouping procedures based on multivariate analysis considerations. Four multivariate functions that might serve as criteria for adequate grouping are given and…

  9. Bimetallic clustered thin films with variable electro-optical properties

    NASA Astrophysics Data System (ADS)

    Antipov, A.; Bukharov, D.; Arakelyan, S.; Osipov, A.; Lelekova, A.

    2018-01-01

    The drop deposition of colloidal nanoparticles was performed from water-based colloidal solutions. The proposed procedure is based on the agglomeration of colloidal particles during laser-assisted evaporation. The evaporation process resulted in the formation of clustered thin films on a glass substrate. In experiments with bimetallic Au:Ag solutions, clustered films with an average height of 100 nm were grown. Optical properties of the deposited structures were investigated experimentally. It is shown that the obtained films may become transparent and that their properties are defined by their morphology.

  10. Mapping Dark Matter in Simulated Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Bowyer, Rachel

    2018-01-01

    Galaxy clusters are the most massive bound objects in the Universe with most of their mass being dark matter. Cosmological simulations of structure formation show that clusters are embedded in a cosmic web of dark matter filaments and large scale structure. It is thought that these filaments are found preferentially close to the long axes of clusters. We extract galaxy clusters from the simulations "cosmo-OWLS" in order to study their properties directly and also to infer their properties from weak gravitational lensing signatures. We investigate various stacking procedures to enhance the signal of the filaments and large scale structure surrounding the clusters to better understand how the filaments of the cosmic web connect with galaxy clusters. This project was supported in part by the NSF REU grant AST-1358980 and by the Nantucket Maria Mitchell Association.

  11. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    PubMed

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  12. The properties of the disk system of globular clusters

    NASA Technical Reports Server (NTRS)

    Armandroff, Taft E.

    1989-01-01

    A large refined data sample is used to study the properties and origin of the disk system of globular clusters. A scale height for the disk cluster system of 800-1500 pc is found which is consistent with scale-height determinations for samples of field stars identified with the Galactic thick disk. A rotational velocity of 193 ± 29 km/s and a line-of-sight velocity dispersion of 59 ± 14 km/s have been found for the metal-rich clusters.

  13. The X-ray luminosity functions of Abell clusters from the Einstein Cluster Survey

    NASA Technical Reports Server (NTRS)

    Burg, R.; Giacconi, R.; Forman, W.; Jones, C.

    1994-01-01

    We have derived the present epoch X-ray luminosity function of northern Abell clusters using luminosities from the Einstein Cluster Survey. The sample is sufficiently large that we can determine the luminosity function for each richness class separately with sufficient precision to study and compare the different luminosity functions. We find that, within each richness class, the range of X-ray luminosity is quite large and spans nearly a factor of 25. Characterizing the luminosity function for each richness class with a Schechter function, we find that the characteristic X-ray luminosity, L*, scales with richness class as L* ∝ N*^γ, where N* is the corrected, mean number of galaxies in a richness class, and the best-fitting exponent is γ = 1.3 ± 0.4. Finally, our analysis suggests that there is a lower limit to the X-ray luminosity of clusters which is determined by the integrated emission of the cluster member galaxies, and this also scales with richness class. The present sample forms a baseline for testing cosmological evolution of Abell-like clusters when an appropriate high-redshift cluster sample becomes available.
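
    For reference, one standard form of the Schechter parametrization used here (written in generic notation, not necessarily with the authors' normalization) is

      \[
        \phi(L)\,dL = \phi_{*} \left(\frac{L}{L_{*}}\right)^{\alpha}
        \exp\!\left(-\frac{L}{L_{*}}\right) \frac{dL}{L_{*}} ,
        \qquad L_{*} \propto N_{*}^{\gamma},\ \ \gamma = 1.3 \pm 0.4 ,
      \]

    where L* is the characteristic X-ray luminosity and N* the corrected mean richness of the class.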

  14. An Archival Search For Young Globular Clusters in Galaxies

    NASA Astrophysics Data System (ADS)

    Whitmore, Brad

    1995-07-01

    One of the most intriguing results from HST has been the discovery of ultraluminous star clusters in interacting and merging galaxies. These clusters have the luminosities, colors, and sizes that would be expected of young globular clusters produced by the interaction. We propose to use the data in the HST Archive to determine how prevalent this phenomenon is, and to determine whether similar clusters are produced in other environments. Three samples will be extracted and studied in a systematic and consistent manner: 1) interacting and merging galaxies, 2) starburst galaxies, and 3) a control sample of ``normal'' galaxies. A preliminary search of the archives shows that there are at least 20 galaxies in each of these samples, and the number will grow by about 50 as new observations become available. The data will be used to determine the luminosity function, color histogram, spatial distribution, and structural properties of the clusters using the same techniques employed in our study of NGC 7252 (the ``Atoms-for-Peace'' galaxy) and NGC 4038/4039 (``The Antennae''). Our ultimate goals are: 1) to understand how globular clusters form, and 2) to use the clusters as evolutionary tracers to unravel the histories of interacting galaxies.

  15. Herschel And Alma Observations Of The Ism In Massive High-Redshift Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Wu, John F.; Aguirre, Paula; Baker, Andrew J.; Devlin, Mark J.; Hilton, Matt; Hughes, John P.; Infante, Leopoldo; Lindner, Robert R.; Sifón, Cristóbal

    2017-06-01

    The Sunyaev-Zel'dovich effect (SZE) can be used to select samples of galaxy clusters that are essentially mass-limited out to arbitrarily high redshifts. I will present results from an investigation of the star formation properties of galaxies in four massive clusters, extending to z ~ 1, which were selected on the basis of their SZE decrements in the Atacama Cosmology Telescope (ACT) survey. All four clusters have been imaged with Herschel/PACS (tracing star formation rate) and two with ALMA (tracing dust and cold gas mass); newly discovered ALMA CO(4-3) and [CI] line detections expand an already large sample of spectroscopically confirmed cluster members. Star formation rate appears to anti-correlate with environmental density, but this trend vanishes after controlling for stellar mass. Elevated star formation and higher CO excitation are seen in "El Gordo," a violent cluster merger, relative to a virialized cluster at a similarly high redshift (z ~ 1). Also exploiting ATCA 2.1 GHz observations to identify radio-loud active galactic nuclei (AGN) in our sample, I will use these data to develop a coherent picture of how environment influences galaxies' ISM properties and evolution in the most massive clusters at early cosmic times.

  16. DCE: A Distributed Energy-Efficient Clustering Protocol for Wireless Sensor Network Based on Double-Phase Cluster-Head Election.

    PubMed

    Han, Ruisong; Yang, Wei; Wang, Yipeng; You, Kaiming

    2017-05-01

    Clustering is an effective technique used to reduce energy consumption and extend the lifetime of a wireless sensor network (WSN). The energy heterogeneity of WSNs should be considered when designing clustering protocols. We propose and evaluate a novel distributed energy-efficient clustering protocol called DCE for heterogeneous wireless sensor networks, based on a Double-phase Cluster-head Election scheme. In DCE, the procedure of cluster-head election is divided into two phases. In the first phase, tentative cluster heads are elected with probabilities determined by the relative levels of their initial and residual energy. In the second phase, a tentative cluster head is replaced by one of its cluster members if that member has more residual energy, forming the final set of cluster heads. Employing two phases for cluster-head election ensures that nodes with more energy have a higher chance of becoming cluster heads. Energy consumption is well distributed in the proposed protocol, and simulation results show that DCE achieves longer stability periods than other typical clustering protocols in heterogeneous scenarios.
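    A minimal sketch of the two-phase election idea described above follows; the election probability, energy bookkeeping, and data structures are illustrative assumptions rather than the protocol's exact specification.

    ```python
    # Sketch of DCE's double-phase cluster-head election as summarized in the abstract.
    # Probabilities and data structures are simplified assumptions.
    import random

    class Node:
        def __init__(self, node_id, initial_energy, residual_energy):
            self.node_id = node_id
            self.initial_energy = initial_energy
            self.residual_energy = residual_energy

    def elect_cluster_heads(nodes, clusters, base_prob=0.1):
        # Phase 1: elect tentative heads with probability scaled by residual/initial energy.
        tentative = {n.node_id for n in nodes
                     if random.random() < base_prob * n.residual_energy / n.initial_energy}

        # Phase 2: in each cluster that produced a tentative head, hand the role to the
        # member with the most residual energy (replacing the tentative head if needed).
        final_heads = []
        for members in clusters.values():        # clusters: {cluster_id: [Node, ...]}
            if any(n.node_id in tentative for n in members):
                final_heads.append(max(members, key=lambda n: n.residual_energy))
        return final_heads
    ```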

  17. Couple Differentiation: Mediator or Moderator of Depressive Symptoms and Relationship Satisfaction?

    PubMed

    Bartle-Haring, Suzanne; Ferriby, Megan; Day, Randal

    2018-03-09

    The purpose of this investigation was to determine whether differentiation at the couple level would act as a moderator or a mediator in the association between marital satisfaction and depressive symptoms over time. In a sample of 412 couples, a latent profile analysis was performed to determine how couple differentiation scores were clustered. An Actor/Partner Interdependence Model was then estimated via a group comparison procedure in structural equation modeling. There was no evidence of a moderating effect of differentiation. A mediating model was then estimated and there was evidence that differentiation mediated the association between depressive symptoms and relationship satisfaction via actor and partner effects. © 2018 American Association for Marriage and Family Therapy.

  18. Proportion of elementary school pupils’ anthropometric characteristics with dimensions of classroom furniture in Isfahan, Iran

    PubMed Central

    Habibi, Ehsanollah; Asaadi, Zahra; Hosseini, Seyed Mohsen

    2011-01-01

    BACKGROUND: This study aimed to examine the appropriateness of school furniture for Iranian pupils' anthropometric features. METHODS: The participants in this cross-sectional study were 493 boys and 489 girls aged 7 to 12 years who were selected through a multistage random cluster sampling procedure. Age, weight, height, and anthropometric dimensions were determined. RESULTS: This study indicates that there is a significant difference between the minimum and maximum acceptable dimensions and those of the available furniture (p < 0.001). CONCLUSIONS: In designing suitable furniture for pupils, the anthropometric differences of age and gender must be taken into account. PMID:21448391

  19. Birth Cohort, Age, and Sex Strongly Modulate Effects of Lipid Risk Alleles Identified in Genome-Wide Association Studies

    PubMed Central

    Kulminski, Alexander M.; Culminskaya, Irina; Arbeev, Konstantin G.; Arbeeva, Liubov; Ukraintseva, Svetlana V.; Stallard, Eric; Wu, Deqing; Yashin, Anatoliy I.

    2015-01-01

    Insights into genetic origin of diseases and related traits could substantially impact strategies for improving human health. The results of genome-wide association studies (GWAS) are often positioned as discoveries of unconditional risk alleles of complex health traits. We re-analyzed the associations of single nucleotide polymorphisms (SNPs) associated with total cholesterol (TC) in a large-scale GWAS meta-analysis. We focused on three generations of genotyped participants of the Framingham Heart Study (FHS). We show that the effects of all ten directly-genotyped SNPs were clustered in different FHS generations and/or birth cohorts in a sex-specific or sex-unspecific manner. The sample size and procedure-therapeutic issues play, at most, a minor role in this clustering. An important result was clustering of significant associations with the strongest effects in the youngest, or 3rd Generation, cohort. These results imply that an assumption of unconditional connections of these SNPs with TC is generally implausible and that a demographic perspective can substantially improve GWAS efficiency. The analyses of genetic effects in age-matched samples suggest a role of environmental and age-related mechanisms in the associations of different SNPs with TC. Analysis of the literature supports systemic roles for genes for these SNPs beyond those related to lipid metabolism. Our analyses reveal strong antagonistic effects of rs2479409 (the PCSK9 gene) that cautions strategies aimed at targeting this gene in the next generation of lipid drugs. Our results suggest that standard GWAS strategies need to be advanced in order to appropriately address the problem of genetic susceptibility to complex traits that is imperative for translation to health care. PMID:26295473

  20. Latent Class Detection and Class Assignment: A Comparison of the MAXEIG Taxometric Procedure and Factor Mixture Modeling Approaches

    ERIC Educational Resources Information Center

    Lubke, Gitta; Tueller, Stephen

    2010-01-01

    Taxometric procedures such as MAXEIG and factor mixture modeling (FMM) are used in latent class clustering, but they have very different sets of strengths and weaknesses. Taxometric procedures, popular in psychiatric and psychopathology applications, do not rely on distributional assumptions. Their sole purpose is to detect the presence of latent…

  1. Weak-lensing mass calibration of the Atacama Cosmology Telescope equatorial Sunyaev-Zeldovich cluster sample with the Canada-France-Hawaii telescope stripe 82 survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Battaglia, N.; Miyatake, H.; Hasselfield, M.

    Mass calibration uncertainty is the largest systematic effect for using clusters of galaxies to constrain cosmological parameters. We present weak lensing mass measurements from the Canada-France-Hawaii Telescope Stripe 82 Survey for galaxy clusters selected through their high signal-to-noise thermal Sunyaev-Zeldovich (tSZ) signal measured with the Atacama Cosmology Telescope (ACT). For a sample of 9 ACT clusters with a tSZ signal-to-noise greater than five, the average weak lensing mass is (4.8 ± 0.8) × 10^14 M_⊙, consistent with the tSZ mass estimate of (4.70 ± 1.0) × 10^14 M_⊙, which assumes a universal pressure profile for the cluster gas. Our results are consistent with previous weak-lensing measurements of tSZ-detected clusters from the Planck satellite. When comparing our results, we estimate the Eddington bias correction for the sample intersection of Planck and weak-lensing clusters, which was previously excluded.

  2. The Morphologies and Alignments of Gas, Mass, and the Central Galaxies of CLASH Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Donahue, Megan; Ettori, Stefano; Rasia, Elena; Sayers, Jack; Zitrin, Adi; Meneghetti, Massimo; Voit, G. Mark; Golwala, Sunil; Czakon, Nicole; Yepes, Gustavo; Baldi, Alessandro; Koekemoer, Anton; Postman, Marc

    2016-03-01

    Morphology is often used to infer the state of relaxation of galaxy clusters. The regularity, symmetry, and degree to which a cluster is centrally concentrated inform quantitative measures of cluster morphology. The Cluster Lensing and Supernova survey with Hubble Space Telescope (CLASH) used weak and strong lensing to measure the distribution of matter within a sample of 25 clusters, 20 of which were deemed to be “relaxed” based on their X-ray morphology and alignment of the X-ray emission with the Brightest Cluster Galaxy. Toward a quantitative characterization of this important sample of clusters, we present uniformly estimated X-ray morphological statistics for all 25 CLASH clusters. We compare X-ray morphologies of CLASH clusters with those identically measured for a large sample of simulated clusters from the MUSIC-2 simulations, selected by mass. We confirm a threshold in X-ray surface brightness concentration of C ≳ 0.4 for cool-core clusters, where C is the ratio of X-ray emission inside 100 h70^-1 kpc to that inside 500 h70^-1 kpc. We report and compare morphologies of these clusters inferred from Sunyaev-Zeldovich Effect (SZE) maps of the hot gas and from projected mass maps based on strong and weak lensing. We find strong agreement in the orientations of the major axes of the lensing, X-ray, and SZE maps of nearly all of the CLASH clusters at radii of 500 kpc (approximately 1/2 R500 for these clusters). We also find a striking alignment of cluster shapes at the 500 kpc scale, as measured with X-ray, SZE, and lensing, with that of the near-infrared stellar light at 10 kpc scales for the 20 “relaxed” clusters. This strong alignment indicates a powerful coupling between the cluster- and galaxy-scale galaxy formation processes.

  3. Multiwavelength study of X-ray luminous clusters in the Hyper Suprime-Cam Subaru Strategic Program S16A field

    NASA Astrophysics Data System (ADS)

    Miyaoka, Keita; Okabe, Nobuhiro; Kitaguchi, Takao; Oguri, Masamune; Fukazawa, Yasushi; Mandelbaum, Rachel; Medezinski, Elinor; Babazaki, Yasunori; Nishizawa, Atsushi J.; Hamana, Takashi; Lin, Yen-Ting; Akamatsu, Hiroki; Chiu, I.-Non; Fujita, Yutaka; Ichinohe, Yuto; Komiyama, Yutaka; Sasaki, Toru; Takizawa, Motokazu; Ueda, Shutaro; Umetsu, Keiichi; Coupon, Jean; Hikage, Chiaki; Hoshino, Akio; Leauthaud, Alexie; Matsushita, Kyoko; Mitsuishi, Ikuyuki; Miyatake, Hironao; Miyazaki, Satoshi; More, Surhud; Nakazawa, Kazuhiro; Ota, Naomi; Sato, Kousuke; Spergel, David; Tamura, Takayuki; Tanaka, Masayuki; Tanaka, Manobu M.; Utsumi, Yousuke

    2018-01-01

    We present a joint X-ray, optical, and weak-lensing analysis for X-ray luminous galaxy clusters selected from the MCXC (Meta-Catalog of X-Ray Detected Clusters of Galaxies) cluster catalog in the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP) survey field with S16A data. As a pilot study for a series of papers, we measure hydrostatic equilibrium (HE) masses using XMM-Newton data for four clusters in the current coverage area out of a sample of 22 MCXC clusters. We additionally analyze a non-MCXC cluster associated with one MCXC cluster. We show that HE masses for the MCXC clusters are correlated with cluster richness from the CAMIRA catalog, while that for the non-MCXC cluster deviates from the scaling relation. The mass normalization of the relationship between cluster richness and HE mass is compatible with that inferred by matching CAMIRA cluster abundance with a theoretical halo mass function. The mean gas mass fraction based on HE masses for the MCXC clusters is 0.125 ± 0.012 at spherical overdensity Δ = 500, which is ~80%-90% of the cosmic mean baryon fraction, Ωb/Ωm, measured by cosmic microwave background experiments. We find that the mean baryon fraction estimated from X-ray and HSC-SSP optical data is comparable to Ωb/Ωm. A weak-lensing shear catalog of background galaxies, combined with photometric redshifts, is currently available only for three clusters in our sample. Hydrostatic equilibrium masses roughly agree with weak-lensing masses, albeit with large uncertainty. This study demonstrates that further multiwavelength study of a large sample of clusters using X-ray, HSC-SSP optical, and weak-lensing data will enable us to understand cluster physics and utilize cluster-based cosmology.

  4. Atomically precise (catalytic) particles synthesized by a novel cluster deposition instrument

    DOE PAGES

    Yin, C.; Tyo, E.; Kuchta, K.; ...

    2014-05-06

    Here, we report a new high vacuum instrument which is dedicated to the preparation of well-defined clusters supported on model and technologically relevant supports for catalytic and materials investigations. The instrument is based on deposition of size-selected metallic cluster ions that are produced by a high-flux magnetron cluster source. Furthermore, we maximize the throughput of the apparatus by collecting and focusing ions utilizing a conical octupole ion guide and a linear ion guide. The size selection is achieved by a quadrupole mass filter. The new design of the sample holder provides for the preparation of multiple samples on supports of various sizes and shapes in one session. After cluster deposition onto the support of interest, samples will be taken out of the chamber for a variety of testing and characterization.

  5. Mutation Clusters from Cancer Exome.

    PubMed

    Kakushadze, Zura; Yu, Willie

    2017-08-15

    We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally-costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development.

  6. Mutation Clusters from Cancer Exome

    PubMed Central

    Kakushadze, Zura; Yu, Willie

    2017-01-01

    We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally-costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development. PMID:28809811

  7. BioCluster: tool for identification and clustering of Enterobacteriaceae based on biochemical data.

    PubMed

    Abdullah, Ahmed; Sabbir Alam, S M; Sultana, Munawar; Hossain, M Anwar

    2015-06-01

    Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and the Improved Hierarchical Clustering (IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in the results of 1-47 biochemical tests within the Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
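    For readers unfamiliar with the underlying step, the sketch below clusters binary biochemical test profiles with standard hierarchical clustering (scipy); it is a generic stand-in for BioCluster's HC mode and does not reproduce the IHC variant or the MATLAB GUI.

    ```python
    # Sketch: standard hierarchical clustering of binary biochemical test profiles.
    # Rows are isolates, columns are biochemical tests; the matrix is a made-up example.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster

    profiles = np.array([
        [1, 0, 1, 1, 0],   # unknown isolate A
        [1, 0, 1, 0, 0],   # unknown isolate B
        [0, 1, 0, 1, 1],   # reference strain of another species
    ])

    dist = pdist(profiles, metric="jaccard")    # dissimilarity between test patterns
    tree = linkage(dist, method="average")      # UPGMA-style agglomeration
    labels = fcluster(tree, t=2, criterion="maxclust")
    print(labels)                               # cluster assignment per isolate
    ```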

  8. a Snapshot Survey of X-Ray Selected Central Cluster Galaxies

    NASA Astrophysics Data System (ADS)

    Edge, Alastair

    1999-07-01

    Central cluster galaxies are the most massive stellar systems known and have been used as standard candles for many decades. Only recently have central cluster galaxies been recognised to exhibit a wide variety of small-scale (<100 pc) features that can only be reliably detected with HST resolution. The most intriguing of these are dust lanes, which have been detected in many central cluster galaxies. Dust is not expected to survive long in the hostile cluster environment unless shielded by the ISM of a disk galaxy or very dense clouds of cold gas. WFPC2 snapshot images of a representative subset of the central cluster galaxies from an X-ray selected cluster sample would provide important constraints on the formation and evolution of dust in cluster cores that cannot be obtained from ground-based observations. In addition, these images will allow the AGN component, the frequency of multiple nuclei, and the amount of massive-star formation in central cluster galaxies to be assessed. The proposed HST observations would also provide high-resolution images of previously unresolved gravitational arcs in the most massive clusters in our sample, resulting in constraints on the shape of the gravitational potential of these systems. This project will complement our extensive multi-frequency work on this sample that includes optical spectroscopy and photometry, VLA and X-ray images for the majority of the 210 targets.

  9. Cosmological Constraints from Galaxy Clustering and the Mass-to-number Ratio of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Tinker, Jeremy L.; Sheldon, Erin S.; Wechsler, Risa H.; Becker, Matthew R.; Rozo, Eduardo; Zu, Ying; Weinberg, David H.; Zehavi, Idit; Blanton, Michael R.; Busha, Michael T.; Koester, Benjamin P.

    2012-01-01

    We place constraints on the average density (Ω_m) and clustering amplitude (σ_8) of matter using a combination of two measurements from the Sloan Digital Sky Survey: the galaxy two-point correlation function, w_p(r_p), and the mass-to-galaxy-number ratio within galaxy clusters, M/N, analogous to cluster M/L ratios. Our w_p(r_p) measurements are obtained from DR7, while the sample of clusters is the maxBCG sample, with cluster masses derived from weak gravitational lensing. We construct nonlinear galaxy bias models using the Halo Occupation Distribution (HOD) to fit both w_p(r_p) and M/N for different cosmological parameters. HOD models that match the same two-point clustering predict different numbers of galaxies in massive halos when Ω_m or σ_8 is varied, thereby breaking the degeneracy between cosmology and bias. We demonstrate that this technique yields constraints that are consistent and competitive with current results from cluster abundance studies, without the use of abundance information. Using w_p(r_p) and M/N alone, we find Ω_m^0.5 σ_8 = 0.465 ± 0.026, with individual constraints of Ω_m = 0.29 ± 0.03 and σ_8 = 0.85 ± 0.06. Combined with current cosmic microwave background data, these constraints are Ω_m = 0.290 ± 0.016 and σ_8 = 0.826 ± 0.020. All errors are 1σ. The systematic uncertainties to which the M/N technique is most sensitive are the amplitude of the bias function of dark matter halos and the possibility of redshift evolution between the SDSS Main sample and the maxBCG cluster sample. Our derived constraints are insensitive to the current level of uncertainties in the halo mass function and in the mass-richness relation of clusters and its scatter, making the M/N technique complementary to cluster abundances as a method for constraining cosmology with future galaxy surveys.

  10. Finding gene clusters for a replicated time course study

    PubMed Central

    2014-01-01

    Background: Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies with complicated designs such as replicated time course studies. Findings: In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions: The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656
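    The sketch below illustrates the general idea in simplified form: fit a per-gene regression on the design covariates (genotype, time, and their interaction) and then cluster genes by their fitted coefficients. This is an assumption-laden simplification, not the authors' model-based implementation.

    ```python
    # Simplified sketch of "clustering on how genes relate to sample covariates":
    # per-gene least-squares fits on the design, then k-means on the coefficients.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    time = np.tile(np.arange(4), 4)                     # 4 time points, 4 arrays
    genotype = np.repeat([0, 1], 8)                     # wild type vs. YOX1 mutant
    X = np.column_stack([np.ones_like(time), time, genotype, time * genotype])

    expression = rng.normal(size=(200, X.shape[0]))     # placeholder: 200 genes x 16 arrays

    coefs, *_ = np.linalg.lstsq(X, expression.T, rcond=None)   # shape (4, n_genes)
    gene_clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(coefs.T)
    ```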

  11. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: application to studies of near-universal interventions in intensive care.

    PubMed

    Forbes, Andrew B; Akram, Muhammad; Pilcher, David; Cooper, Jamie; Bellomo, Rinaldo

    2015-02-01

    Cluster randomised crossover trials have been utilised in recent years in the health and social sciences. Methods for analysis have been proposed; however, for binary outcomes, these have received little assessment of their appropriateness. In addition, methods for determination of sample size are currently limited to balanced cluster sizes both between clusters and between periods within clusters. This article aims to extend this work to unbalanced situations and to evaluate the properties of a variety of methods for analysis of binary data, with a particular focus on the setting of potential trials of near-universal interventions in intensive care to reduce in-hospital mortality. We derive a formula for sample size estimation for unbalanced cluster sizes, and apply it to the intensive care setting to demonstrate the utility of the cluster crossover design. We conduct a numerical simulation of the design in the intensive care setting and for more general configurations, and we assess the performance of three cluster summary estimators and an individual-data estimator based on binomial-identity-link regression. For settings similar to the intensive care scenario involving large cluster sizes and small intra-cluster correlations, the sample size formulae developed and analysis methods investigated are found to be appropriate, with the unweighted cluster summary method performing well relative to the more optimal but more complex inverse-variance weighted method. More generally, we find that the unweighted and cluster-size-weighted summary methods perform well, with the relative efficiency of each largely determined systematically from the study design parameters. Performance of individual-data regression is adequate with small cluster sizes but becomes inefficient for large, unbalanced cluster sizes. When outcome prevalences are 6% or less and the within-cluster-within-period correlation is 0.05 or larger, all methods display sub-nominal confidence interval coverage, with the less prevalent the outcome the worse the coverage. As with all simulation studies, conclusions are limited to the configurations studied. We confined attention to detecting intervention effects on an absolute risk scale using marginal models and did not explore properties of binary random effects models. Cluster crossover designs with binary outcomes can be analysed using simple cluster summary methods, and sample size in unbalanced cluster size settings can be determined using relatively straightforward formulae. However, caution needs to be applied in situations with low prevalence outcomes and moderate to high intra-cluster correlations. © The Author(s) 2014.
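    As an illustration of the unweighted cluster summary approach evaluated above, the sketch below forms one intervention-minus-control difference in event proportions per cluster and tests the mean difference with a one-sample t-test; the data layout and numbers are assumptions for illustration.

    ```python
    # Sketch: unweighted cluster-summary analysis of a two-period cluster crossover
    # trial with a binary outcome. events[c, p] / n[c, p] are counts and sizes for
    # cluster c in period p; treat[c, p] = 1 when the intervention was delivered.
    import numpy as np
    from scipy import stats

    events = np.array([[30, 25], [40, 33], [22, 28], [35, 30]])
    n      = np.array([[500, 480], [600, 610], [450, 440], [520, 505]])
    treat  = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])

    prop = events / n
    diff = np.where(treat[:, 0] == 1, prop[:, 0] - prop[:, 1], prop[:, 1] - prop[:, 0])
    t_stat, p_value = stats.ttest_1samp(diff, 0.0)
    print(diff.mean(), p_value)   # estimated absolute risk difference and its p-value
    ```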

  12. Dependence of the clustering properties of galaxies on stellar velocity dispersion in the Main galaxy sample of SDSS DR10

    NASA Astrophysics Data System (ADS)

    Deng, Xin-Fa; Song, Jun; Chen, Yi-Qing; Jiang, Peng; Ding, Ying-Ping

    2014-08-01

    Using two volume-limited Main galaxy samples of the Sloan Digital Sky Survey Data Release 10 (SDSS DR10), we investigate the dependence of the clustering properties of galaxies on stellar velocity dispersion by cluster analysis. It is found that in the luminous volume-limited Main galaxy sample, except at r = 1.2, richer and larger systems can be more easily formed in the large stellar velocity dispersion subsample, while in the faint volume-limited Main galaxy sample, at r ≥ 0.9, an opposite trend is observed. According to statistical analyses of the multiplicity functions, we conclude that in both volume-limited Main galaxy samples, small stellar velocity dispersion galaxies preferentially form isolated galaxies, close pairs, and small groups, while large stellar velocity dispersion galaxies preferentially inhabit dense groups and clusters. However, we note a difference between the two volume-limited Main galaxy samples: in the faint volume-limited Main galaxy sample, at r ≥ 0.9, the small stellar velocity dispersion subsample has a higher proportion of galaxies in superclusters (n ≥ 200) than the large stellar velocity dispersion subsample.

  13. See Change: the Supernova Sample from the Supernova Cosmology Project High Redshift Cluster Supernova Survey

    NASA Astrophysics Data System (ADS)

    Hayden, Brian; Perlmutter, Saul; Boone, Kyle; Nordin, Jakob; Rubin, David; Lidman, Chris; Deustua, Susana E.; Fruchter, Andrew S.; Aldering, Greg Scott; Brodwin, Mark; Cunha, Carlos E.; Eisenhardt, Peter R.; Gonzalez, Anthony H.; Jee, James; Hildebrandt, Hendrik; Hoekstra, Henk; Santos, Joana; Stanford, S. Adam; Stern, Daniel; Fassbender, Rene; Richard, Johan; Rosati, Piero; Wechsler, Risa H.; Muzzin, Adam; Willis, Jon; Boehringer, Hans; Gladders, Michael; Goobar, Ariel; Amanullah, Rahman; Hook, Isobel; Huterer, Dragan; Huang, Xiaosheng; Kim, Alex G.; Kowalski, Marek; Linder, Eric; Pain, Reynald; Saunders, Clare; Suzuki, Nao; Barbary, Kyle H.; Rykoff, Eli S.; Meyers, Joshua; Spadafora, Anthony L.; Sofiatti, Caroline; Wilson, Gillian; Rozo, Eduardo; Hilton, Matt; Ruiz-Lapuente, Pilar; Luther, Kyle; Yen, Mike; Fagrelius, Parker; Dixon, Samantha; Williams, Steven

    2017-01-01

    The Supernova Cosmology Project has finished executing a large (174 orbits, cycles 22-23) Hubble Space Telescope program, which has measured ~30 type Ia Supernovae above z~1 in the highest-redshift, most massive galaxy clusters known to date. Our SN Ia sample closely matches our pre-survey predictions; this sample will improve the constraint by a factor of 3 on the Dark Energy equation of state above z~1, allowing an unprecedented probe of Dark Energy time variation. When combined with the improved cluster mass calibration from gravitational lensing provided by the deep WFC3-IR observations of the clusters, See Change will triple the Dark Energy Task Force Figure of Merit. With the primary observing campaign completed, we present the preliminary supernova sample and our path forward to the supernova cosmology results. We also compare the number of SNe Ia discovered in each cluster with our pre-survey expectations based on cluster mass and SFR estimates. Our extensive HST and ground-based campaign has already produced unique results; we have confirmed several of the highest redshift cluster members known to date, confirmed the redshift of one of the most massive galaxy clusters at z~1.2 expected across the entire sky, and characterized one of the most extreme starburst environments yet known in a z~1.7 cluster. We have also discovered a lensed SN Ia at z=2.22 magnified by a factor of ~2.7, which is the highest spectroscopic redshift SN Ia currently known.

  14. Comorbid Visual and Psychiatric Disabilities Among the Chinese Elderly: A National Population-Based Survey.

    PubMed

    Guo, Chao; Wang, Zhenjie; Li, Ning; Chen, Gong; Zheng, Xiaoying

    2017-12-01

    To estimate the prevalence of, and association between, co-morbid visual and psychiatric disabilities among elderly (>65 years-of-age) persons in China. Random representative samples were obtained using multistage, stratified, cluster sampling, with probabilities proportional to size. Standard weighting procedures were used to construct sample weights that reflected this multistage, stratified cluster sampling survey scheme. Logistic regression models were used to elucidate associations between visual and psychiatric disabilities. Among the Chinese elderly, >160,000 persons have co-morbid visual and psychiatric disabilities. The weighted prevalence among this cohort is 123.7 per 100,000 persons. A higher prevalence of co-morbid visual and psychiatric disabilities was found in the oldest-old (p<0.001); women (65-79 years-of-age, p=0.001; ≥80 years-of-age, p=0.004); illiterate (65-79 years-of-age, p<0.001; ≥80 years-of-age, p=0.02); and single elders (65-79 years-of-age, p=0.01; ≥80 years-of-age, p=0.001). Presence of a visual disability was significantly associated with a higher risk of having a psychiatric disability among persons aged ≥80 years-of-age [adjusted odds ratio, 1.24; 95% confidence interval (CI), 1.03-1.54]. A significant number of Chinese elderly persons were living with co-morbid visual and psychiatric disabilities. To address the challenge of these co-morbid disorders among Chinese elders, it is incumbent upon the government to implement additional and more comprehensive prevention and rehabilitation strategies for health-care systems, reinforce health promotion among the elderly, and improve accessibility to health-care services.
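    A minimal sketch of the weighted analysis step follows, assuming simple sampling weights and illustrative variable names; it fits a weighted logistic regression with statsmodels and does not reproduce the survey's full design-based (stratum- and cluster-aware) variance estimation.

    ```python
    # Sketch: weighted logistic regression of psychiatric disability on visual disability.
    # Column names, weights, and data are illustrative assumptions.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    df = pd.DataFrame({
        "psychiatric": rng.binomial(1, 0.02, 1000),
        "visual":      rng.binomial(1, 0.05, 1000),
        "age80plus":   rng.binomial(1, 0.30, 1000),
        "weight":      rng.integers(1, 5, 1000),      # integer expansion weights for simplicity
    })

    X = sm.add_constant(df[["visual", "age80plus"]])
    fit = sm.GLM(df["psychiatric"], X, family=sm.families.Binomial(),
                 freq_weights=df["weight"]).fit()
    print(np.exp(fit.params["visual"]))               # weighted odds ratio estimate
    ```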

  15. OGLE Collection of Star Clusters. New Objects in the Outskirts of the Large Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Sitek, M.; Szymański, M. K.; Skowron, D. M.; Udalski, A.; Kostrzewa-Rutkowska, Z.; Skowron, J.; Karczmarek, P.; Cieślar, M.; Wyrzykowski, Ł.; Kozłowski, S.; Pietrukowicz, P.; Soszyński, I.; Mróz, P.; Pawlak, M.; Poleski, R.; Ulaczyk, K.

    2016-09-01

    The Magellanic System (MS), consisting of the Large Magellanic Cloud (LMC), the Small Magellanic Cloud (SMC) and the Magellanic Bridge (MBR), contains a diverse sample of star clusters. Their spatial distribution, ages and chemical abundances may provide important information about the formation history of the whole System. We use deep photometric maps derived from the images collected during the fourth phase of the Optical Gravitational Lensing Experiment (OGLE-IV) to construct the most complete catalog of star clusters in the Large Magellanic Cloud based on homogeneous photometric data. In this paper we present the collection of star clusters found in an area of about 225 square degrees in the outer regions of the LMC. Our sample contains 679 visually identified star cluster candidates, 226 of which were not listed in any of the previously published catalogs. The new clusters are mainly young small open clusters or clusters similar to associations.

  16. The Effect of Cluster Sampling Design in Survey Research on the Standard Error Statistic.

    ERIC Educational Resources Information Center

    Wang, Lin; Fan, Xitao

    Standard statistical methods are used to analyze data that is assumed to be collected using a simple random sampling scheme. These methods, however, tend to underestimate variance when the data is collected with a cluster design, which is often found in educational survey research. The purposes of this paper are to demonstrate how a cluster design…
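    A worked example of the point being made: under cluster sampling, the naive simple-random-sampling standard error should be inflated by the design effect. The numbers below (mean cluster size and intraclass correlation) are assumptions chosen only to illustrate the formula DEFF = 1 + (m - 1)·ICC.

    ```python
    # Sketch: inflation of the standard error of a proportion under cluster sampling.
    import math

    n, m, icc = 1200, 24, 0.05            # total sample size, mean cluster size, intraclass correlation
    p = 0.40                              # an estimated proportion from the survey

    se_srs = math.sqrt(p * (1 - p) / n)   # naive SE assuming simple random sampling
    deff = 1 + (m - 1) * icc              # design effect for (approximately) equal cluster sizes
    se_cluster = se_srs * math.sqrt(deff)

    print(round(se_srs, 4), round(deff, 2), round(se_cluster, 4))
    ```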

  17. A Comparison of Single Sample and Bootstrap Methods to Assess Mediation in Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Stapleton, Laura M.; Kang, Joo Youn

    2006-01-01

    A Monte Carlo study examined the statistical performance of single sample and bootstrap methods that can be used to test and form confidence interval estimates of indirect effects in two cluster randomized experimental designs. The designs were similar in that they featured random assignment of clusters to one of two treatment conditions and…

  18. Characterization of Oxygen Defect Clusters in UO2+x Using Neutron Scattering and PDF Analysis.

    PubMed

    Ma, Yue; Garcia, Philippe; Lechelle, Jacques; Miard, Audrey; Desgranges, Lionel; Baldinozzi, Gianguido; Simeone, David; Fischer, Henry E

    2018-06-18

    In hyper-stoichiometric uranium oxide, both neutron diffraction work and, more recently, theoretical analyses report the existence of clusters such as the 2:2:2 cluster, comprising two anion vacancies and two types of anion interstitials. However, little is known about whether there exists a region of low deviation from stoichiometry in which defects remain isolated, or indeed whether at high deviation from stoichiometry defect clusters prevail that contain more excess oxygen atoms than the di-interstitial cluster. In this study, we report pair distribution function (PDF) analyses of UO2 and UO2+x (x ≈ 0.007 and x ≈ 0.16) samples obtained from high-temperature in situ neutron scattering experiments. PDF refinement for the lower deviation-from-stoichiometry sample suggests the system is too dilute to differentiate between isolated defects and di-interstitial clusters. For the UO2.16 sample, several defect structures are tested, and it is found that the data are best represented assuming the presence of center-occupied cuboctahedra.

  19. A good mass proxy for galaxy clusters with XMM-Newton

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Hai-Hui; Jia, Shu-Mei; Chen, Yong

    2013-12-01

    We use a sample of 39 galaxy clusters at redshift z < 0.1 observed by XMM-Newton to investigate the relations between X-ray observables and total mass. Based on central cooling time and central temperature drop, the clusters in this sample are divided into two groups: 25 cool core clusters and 14 non-cool core clusters, respectively. We study the scaling relations of L_bol-M_500, M_500-T, M_500-M_g, and M_500-Y_X, and also the influence of the cool core on these relations. The results show that the M_500-Y_X relation has a slope close to the standard self-similar value, has the smallest scatter, and does not vary with the cluster sample. Moreover, the M_500-Y_X relation is not affected by the cool core. Thus, the parameter Y_X may be the best mass indicator.

  20. The MUSIC of CLASH: Predictions on the Concentration-Mass Relation

    NASA Astrophysics Data System (ADS)

    Meneghetti, M.; Rasia, E.; Vega, J.; Merten, J.; Postman, M.; Yepes, G.; Sembolini, F.; Donahue, M.; Ettori, S.; Umetsu, K.; Balestra, I.; Bartelmann, M.; Benítez, N.; Biviano, A.; Bouwens, R.; Bradley, L.; Broadhurst, T.; Coe, D.; Czakon, N.; De Petris, M.; Ford, H.; Giocoli, C.; Gottlöber, S.; Grillo, C.; Infante, L.; Jouvel, S.; Kelson, D.; Koekemoer, A.; Lahav, O.; Lemze, D.; Medezinski, E.; Melchior, P.; Mercurio, A.; Molino, A.; Moscardini, L.; Monna, A.; Moustakas, J.; Moustakas, L. A.; Nonino, M.; Rhodes, J.; Rosati, P.; Sayers, J.; Seitz, S.; Zheng, W.; Zitrin, A.

    2014-12-01

    We present an analysis of the MUSIC-2 N-body/hydrodynamical simulations aimed at estimating the expected concentration-mass relation for the CLASH (Cluster Lensing and Supernova Survey with Hubble) cluster sample. We study nearly 1,400 halos simulated at high spatial and mass resolution. We study the shape of both their density and surface-density profiles and fit them with a variety of radial functions, including the Navarro-Frenk-White (NFW), the generalized NFW, and the Einasto density profiles. We derive concentrations and masses from these fits. We produce simulated Chandra observations of the halos, and we use them to identify objects resembling the X-ray morphologies and masses of the clusters in the CLASH X-ray-selected sample. We also derive a concentration-mass relation for strong-lensing clusters. We find that the sample of simulated halos that resembles the X-ray morphology of the CLASH clusters is composed mainly of relaxed halos, but it also contains a significant fraction of unrelaxed systems. For such a heterogeneous sample we measure an average two-dimensional concentration that is ~11% higher than is found for the full sample of simulated halos. After accounting for projection and selection effects, the average NFW concentrations of CLASH clusters are expected to be intermediate between those predicted in three dimensions for relaxed and super-relaxed halos. Matching the simulations to the individual CLASH clusters on the basis of the X-ray morphology, we expect that the NFW concentrations recovered from the lensing analysis of the CLASH clusters are in the range [3-6], with an average value of 3.87 and a standard deviation of 0.61.

  1. The music of clash: predictions on the concentration-mass relation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meneghetti, M.; Rasia, E.; Vega, J.

    We present an analysis of the MUSIC-2 N-body/hydrodynamical simulations aimed at estimating the expected concentration-mass relation for the CLASH (Cluster Lensing and Supernova Survey with Hubble) cluster sample. We study nearly 1,400 halos simulated at high spatial and mass resolution. We study the shape of both their density and surface-density profiles and fit them with a variety of radial functions, including the Navarro-Frenk-White (NFW), the generalized NFW, and the Einasto density profiles. We derive concentrations and masses from these fits. We produce simulated Chandra observations of the halos, and we use them to identify objects resembling the X-ray morphologies and masses of the clusters in the CLASH X-ray-selected sample. We also derive a concentration-mass relation for strong-lensing clusters. We find that the sample of simulated halos that resembles the X-ray morphology of the CLASH clusters is composed mainly of relaxed halos, but it also contains a significant fraction of unrelaxed systems. For such a heterogeneous sample we measure an average two-dimensional concentration that is ∼11% higher than is found for the full sample of simulated halos. After accounting for projection and selection effects, the average NFW concentrations of CLASH clusters are expected to be intermediate between those predicted in three dimensions for relaxed and super-relaxed halos. Matching the simulations to the individual CLASH clusters on the basis of the X-ray morphology, we expect that the NFW concentrations recovered from the lensing analysis of the CLASH clusters are in the range [3-6], with an average value of 3.87 and a standard deviation of 0.61.

  2. Denaturing gradient gel electrophoresis profiles of bacteria from the saliva of twenty four different individuals form clusters that showed no relationship to the yeasts present.

    PubMed

    Weerasekera, Manjula M.; Sissons, Chris H.; Wong, Lisa; Anderson, Sally A.; Holmes, Ann R.; Cannon, Richard D.

    2017-10-01

    The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeast present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean ± SD = 29.3 ± 4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive, with concentrations up to 10^3 cfu/mL. Candida albicans was the predominant species in saliva samples, although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Differences in soil biological activity by terrain types at the sub-field scale in central Iowa US

    DOE PAGES

    Kaleita, Amy L.; Schott, Linda R.; Hargreaves, Sarah K.; ...

    2017-07-07

    Soil microbial communities are structured by biogeochemical processes that occur at many different spatial scales, which makes soil sampling difficult. Because soil microbial communities are important in nutrient cycling and soil fertility, it is important to understand how microbial communities function within the heterogeneous soil landscape. In this study, a self-organizing map was used to determine whether landscape data can be used to characterize the distribution of microbial biomass and activity in order to provide an improved understanding of soil microbial community function. Points within a row crop field in south-central Iowa were clustered via a self-organizing map using six landscape properties into three separate landscape clusters. Twelve sampling locations per cluster were chosen for a total of 36 locations. After the soil samples were collected, the samples were then analysed for various metabolic indicators, such as nitrogen and carbon mineralization, extractable organic carbon, microbial biomass, etc. It was found that sampling locations located in the potholes and toe slope positions had significantly greater microbial biomass nitrogen and carbon, total carbon, total nitrogen and extractable organic carbon than the other two landscape position clusters, while locations located on the upslope did not differ significantly from the other landscape clusters. However, factors such as nitrate, ammonia, and nitrogen and carbon mineralization did not differ significantly across the landscape. Altogether, this research demonstrates the effectiveness of a terrain-based clustering method for guiding soil sampling of microbial communities.
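    The terrain-based clustering and per-cluster site selection can be sketched as follows; k-means on standardized terrain attributes is used here as a simple stand-in for the self-organizing map, and the attribute values are illustrative placeholders.

    ```python
    # Sketch: cluster field grid points into three landscape groups and draw 12
    # sampling locations per group. k-means stands in for the self-organizing map.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(42)
    terrain = rng.normal(size=(500, 6))              # six landscape properties per grid point
    z = StandardScaler().fit_transform(terrain)
    cluster_id = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)

    sample_points = {
        c: rng.choice(np.where(cluster_id == c)[0], size=12, replace=False)
        for c in range(3)
    }
    ```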

  4. Differences in soil biological activity by terrain types at the sub-field scale in central Iowa US

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaleita, Amy L.; Schott, Linda R.; Hargreaves, Sarah K.

    Soil microbial communities are structured by biogeochemical processes that occur at many different spatial scales, which makes soil sampling difficult. Because soil microbial communities are important in nutrient cycling and soil fertility, it is important to understand how microbial communities function within the heterogeneous soil landscape. In this study, a self-organizing map was used to determine whether landscape data can be used to characterize the distribution of microbial biomass and activity in order to provide an improved understanding of soil microbial community function. Points within a row crop field in south-central Iowa were clustered via a self-organizing map using six landscape properties into three separate landscape clusters. Twelve sampling locations per cluster were chosen for a total of 36 locations. After the soil samples were collected, the samples were then analysed for various metabolic indicators, such as nitrogen and carbon mineralization, extractable organic carbon, microbial biomass, etc. It was found that sampling locations located in the potholes and toe slope positions had significantly greater microbial biomass nitrogen and carbon, total carbon, total nitrogen and extractable organic carbon than the other two landscape position clusters, while locations located on the upslope did not differ significantly from the other landscape clusters. However, factors such as nitrate, ammonia, and nitrogen and carbon mineralization did not differ significantly across the landscape. Altogether, this research demonstrates the effectiveness of a terrain-based clustering method for guiding soil sampling of microbial communities.

  5. The Gaia-ESO Survey. Mg-Al anti-correlation in iDR4 globular clusters

    NASA Astrophysics Data System (ADS)

    Pancino, E.; Romano, D.; Tang, B.; Tautvaišienė, G.; Casey, A. R.; Gruyters, P.; Geisler, D.; San Roman, I.; Randich, S.; Alfaro, E. J.; Bragaglia, A.; Flaccomio, E.; Korn, A. J.; Recio-Blanco, A.; Smiljanic, R.; Carraro, G.; Bayo, A.; Costado, M. T.; Damiani, F.; Jofré, P.; Lardo, C.; de Laverny, P.; Monaco, L.; Morbidelli, L.; Sbordone, L.; Sousa, S. G.; Villanova, S.

    2017-05-01

    We use Gaia-ESO (GES) Survey iDR4 data to explore the Mg-Al anti-correlation in globular clusters that were observed as calibrators, as a demonstration of the quality of Gaia-ESO Survey data and analysis. The results compare well with the available literature, within 0.1 dex or less, after a small (compared to the internal spreads) offset of 0.10-0.15 dex between the UVES and GIRAFFE data was taken into account. In particular, for the first time we present data for NGC 5927, which is one of the most metal-rich globular clusters studied in the literature so far, with [Fe/H] = -0.39 ± 0.04 dex; this cluster was included to connect with the open cluster regime in the Gaia-ESO Survey internal calibration. The extent and shape of the Mg-Al anti-correlation provide strong constraints on the multiple population phenomenon in globular clusters. In particular, we studied the dependency of the extent of the Mg-Al anti-correlation on metallicity, present-day mass, and age of the clusters, using GES data in combination with a large set of homogenized literature measurements. We find a dependency on both metallicity and mass, which is evident when fitting for the two parameters simultaneously, but we do not find a significant dependency on age. We confirm that the Mg-Al anti-correlation is not seen in all clusters, but disappears for the less massive or most metal-rich clusters. We also use our data set to see whether a normal anti-correlation would explain the low [Mg/α] observed in some extragalactic globular clusters, but find that none of the clusters in our sample can reproduce it; a more extreme chemical composition, such as that of NGC 2419, would be required. We conclude that GES iDR4 data already meet the requirements set by the main survey goals and can be used to study globular clusters in detail, even if the analysis procedures were not specifically designed for them. Based on data products from observations made with ESO Telescopes at the La Silla Paranal Observatory under programme ID 188.B-3002. Full Table 2 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/601/A112

  6. Modeling of correlated data with informative cluster sizes: An evaluation of joint modeling and within-cluster resampling approaches.

    PubMed

    Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S

    2017-08-01

    Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
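    A minimal sketch of within-cluster resampling for a binary outcome and a cluster-level covariate follows; the column names and model are illustrative assumptions, and the variance estimation step described in the WCR literature is omitted.

    ```python
    # Sketch of within-cluster resampling (WCR): repeatedly draw one observation per
    # cluster, fit an ordinary logistic regression to each draw, and average the estimates.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    def wcr_logit(df, n_resamples=500, seed=0):
        rng = np.random.default_rng(seed)
        estimates = []
        for _ in range(n_resamples):
            draw = df.groupby("cluster").sample(n=1, random_state=int(rng.integers(2**31 - 1)))
            X = sm.add_constant(draw[["exposure"]])
            fit = sm.Logit(draw["outcome"], X).fit(disp=False)
            estimates.append(fit.params["exposure"])
        return float(np.mean(estimates))   # WCR point estimate of the log odds ratio
    ```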

  7. A QUANTITATIVE ANALYSIS OF DISTANT OPEN CLUSTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Janes, Kenneth A.; Hoq, Sadia

    2011-03-15

    The oldest open star clusters are important for tracing the history of the Galactic disk, but many of the more distant clusters are heavily reddened and projected against the rich stellar background of the Galaxy. We have undertaken an investigation of several distant clusters (Berkeley 19, Berkeley 44, King 25, NGC 6802, NGC 6827, Berkeley 52, Berkeley 56, NGC 7142, NGC 7245, and King 9) to develop procedures for separating probable cluster members from the background field. We next created a simple quantitative approach for finding approximate cluster distances, reddenings, and ages. We first conclude that, with the possible exception of King 25, they are probably all physical clusters. We also find that for these distant clusters our typical errors are about ±0.07 in E(B - V), ±0.15 in log(age), and ±0.25 in (m - M)_0. The clusters range in age from 470 Myr to 7 Gyr and range from 7.1 to 16.4 kpc from the Galactic center.

  8. Use of market segmentation to identify untapped consumer needs in vision correction surgery for future growth.

    PubMed

    Loarie, Thomas M; Applegate, David; Kuenne, Christopher B; Choi, Lawrence J; Horowitz, Diane P

    2003-01-01

    Market segmentation analysis identifies discrete segments of the population whose beliefs are consistent with exhibited behaviors such as purchase choice. This study applies market segmentation analysis to low myopes (-1 to -3 D with less than 1 D cylinder) in their consideration and choice of a refractive surgery procedure to discover opportunities within the market. A quantitative survey based on focus group research was sent to a demographically balanced sample of myopes using contact lenses and/or glasses. A variable reduction process followed by a clustering analysis was used to discover discrete belief-based segments. The resulting segments were validated both analytically and through in-market testing. Discontented individuals who wear contact lenses are the primary target for vision correction surgery. However, 81% of the target group is apprehensive about laser in situ keratomileusis (LASIK). They are nervous about the procedure and strongly desire reversibility and exchangeability. There exists a large untapped opportunity for vision correction surgery within the low myope population. Market segmentation analysis helped determine how to best meet this opportunity through repositioning existing procedures or developing new vision correction technology, and could also be applied to identify opportunities in other vision correction populations.

  9. Improving the sampling strategy of the Joint Danube Survey 3 (2013) by means of multivariate statistical techniques applied on selected physico-chemical and biological data.

    PubMed

    Hamchevici, Carmen; Udrea, Ion

    2013-11-01

    The concept of a basin-wide Joint Danube Survey (JDS) was launched by the International Commission for the Protection of the Danube River (ICPDR) as a tool for investigative monitoring under the Water Framework Directive (WFD), with a frequency of 6 years. The first JDS was carried out in 2001, and its success in providing key information for characterisation of the Danube River Basin District as required by the WFD led to the organisation of the second JDS in 2007, which was the world's biggest river research expedition in that year. This paper presents an approach for improving the survey strategy for the next planned survey, JDS3 (2013), by means of several multivariate statistical techniques. In order to design the optimum structure in terms of parameters and sampling sites, principal component analysis (PCA), factor analysis (FA) and cluster analysis were applied to JDS2 data for 13 selected physico-chemical elements and one biological element measured at 78 sampling sites located on the main course of the Danube. Results from PCA/FA showed that most of the dataset variance (above 75%) was explained by five varifactors loaded with 8 out of 14 variables: physical (transparency and total suspended solids), relevant nutrients (N-nitrates and P-orthophosphates), feedback effects of primary production (pH, alkalinity and dissolved oxygen) and algal biomass. Taking into account the representation of the factor scores given by FA versus sampling sites and the major groups generated by the clustering procedure, the spatial network of the next survey can be carefully tailored, leading to a reduction in the number of sampling sites of more than 30%. This target-oriented sampling strategy, based on the selected multivariate statistics, can provide a strong reduction in the dimensionality of the original data, and in the corresponding costs as well, without any loss of information.
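    The dimensionality-reduction step can be sketched as below: standardize the site-by-parameter matrix, keep enough principal components to cover roughly 75% of the variance, and cluster sites on the component scores. PCA plus Ward clustering is used here as a stand-in for the rotated factor analysis and clustering reported in the paper, and the data are placeholders.

    ```python
    # Sketch: reduce a sites-by-parameters matrix, then group sampling sites.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(7)
    data = rng.normal(size=(78, 14))                 # 78 sites x 14 measured parameters (placeholder)
    z = StandardScaler().fit_transform(data)

    pca = PCA(n_components=0.75).fit(z)              # smallest number of PCs explaining >= 75% variance
    scores = pca.transform(z)

    site_groups = fcluster(linkage(scores, method="ward"), t=4, criterion="maxclust")
    ```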

  10. On the Analysis of Case-Control Studies in Cluster-correlated Data Settings.

    PubMed

    Haneuse, Sebastien; Rivera-Rodriguez, Claudia

    2018-01-01

    In resource-limited settings, long-term evaluation of national antiretroviral treatment (ART) programs often relies on aggregated data, the analysis of which may be subject to ecological bias. As researchers and policy makers consider evaluating individual-level outcomes such as treatment adherence or mortality, the well-known case-control design is appealing in that it provides efficiency gains over random sampling. In the context that motivates this article, valid estimation and inference require acknowledging any clustering, although, to our knowledge, no statistical methods have been published for the analysis of case-control data for which the underlying population exhibits clustering. Furthermore, in the specific context of an ongoing collaboration in Malawi, rather than performing case-control sampling across all clinics, case-control sampling within clinics has been suggested as a more practical strategy. Although similar outcome-dependent sampling schemes have been described in the literature, to our knowledge a case-control design specific to correlated data settings is new. In this article, we describe this design, discuss balanced versus unbalanced sampling techniques, and provide a general approach to analyzing case-control studies in cluster-correlated settings based on inverse probability-weighted generalized estimating equations. Inference is based on a robust sandwich estimator with correlation parameters estimated to ensure appropriate accounting of the outcome-dependent sampling scheme. We conduct comprehensive simulations, based in part on real data on a sample of N = 78,155 program registrants in Malawi between 2005 and 2007, to evaluate small-sample operating characteristics and potential trade-offs associated with standard case-control sampling or when case-control sampling is performed within clusters.
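
    The general idea can be sketched with standard tools (this is not the authors' implementation, and the column names `case`, `exposure`, `clinic` and `samp_prob` are hypothetical): weight each sampled record by the inverse of its within-clinic sampling probability and fit a weighted GEE with clinic as the grouping variable; GEE reports a robust (sandwich) covariance by default.

    ```python
    # Hedged sketch: inverse probability-weighted GEE for within-clinic case-control data.
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("case_control.csv")      # hypothetical file: one row per sampled registrant
    df["ipw"] = 1.0 / df["samp_prob"]         # inverse of the within-clinic sampling probability

    endog = df["case"]
    exog = sm.add_constant(df[["exposure"]])
    model = sm.GEE(endog, exog, groups=df["clinic"],
                   family=sm.families.Binomial(),
                   cov_struct=sm.cov_struct.Exchangeable(),
                   weights=df["ipw"])
    result = model.fit()                      # robust sandwich covariance is the GEE default
    print(result.summary())
    ```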

  11. Nanostructured Ag-zeolite Composites as Luminescence-based Humidity Sensors.

    PubMed

    Coutino-Gonzalez, Eduardo; Baekelant, Wouter; Dieu, Bjorn; Roeffaers, Maarten B J; Hofkens, Johan

    2016-11-15

    Small silver clusters confined inside zeolite matrices have recently emerged as a novel type of highly luminescent materials. Their emission has high external quantum efficiencies (EQE) and spans the whole visible spectrum. It has been recently reported that the UV excited luminescence of partially Li-exchanged sodium Linde type A zeolites [LTA(Na)] containing luminescent silver clusters can be controlled by adjusting the water content of the zeolite. These samples showed a dynamic change in their emission color from blue to green and yellow upon an increase of the hydration level of the zeolite, showing the great potential that these materials can have as luminescence-based humidity sensors at the macro and micro scale. Here, we describe the detailed procedure to fabricate a humidity sensor prototype using silver-exchanged zeolite composites. The sensor is produced by suspending the luminescent Ag-zeolites in an aqueous solution of polyethylenimine (PEI) to subsequently deposit a film of the material onto a quartz plate. The coated plate is subjected to several hydration/dehydration cycles to show the functionality of the sensing film.

  12. Nanostructured Ag-zeolite Composites as Luminescence-based Humidity Sensors

    PubMed Central

    Dieu, Bjorn; Roeffaers, Maarten B.J.; Hofkens, Johan

    2016-01-01

    Small silver clusters confined inside zeolite matrices have recently emerged as a novel type of highly luminescent materials. Their emission has high external quantum efficiencies (EQE) and spans the whole visible spectrum. It has been recently reported that the UV excited luminescence of partially Li-exchanged sodium Linde type A zeolites [LTA(Na)] containing luminescent silver clusters can be controlled by adjusting the water content of the zeolite. These samples showed a dynamic change in their emission color from blue to green and yellow upon an increase of the hydration level of the zeolite, showing the great potential that these materials can have as luminescence-based humidity sensors at the macro and micro scale. Here, we describe the detailed procedure to fabricate a humidity sensor prototype using silver-exchanged zeolite composites. The sensor is produced by suspending the luminescent Ag-zeolites in an aqueous solution of polyethylenimine (PEI) to subsequently deposit a film of the material onto a quartz plate. The coated plate is subjected to several hydration/dehydration cycles to show the functionality of the sensing film. PMID:27911397

  13. The theory of variational hybrid quantum-classical algorithms

    NASA Astrophysics Data System (ADS)

    McClean, Jarrod R.; Romero, Jonathan; Babbush, Ryan; Aspuru-Guzik, Alán

    2016-02-01

    Many quantum algorithms have daunting resource requirements when compared to what is available today. To address this discrepancy, a quantum-classical hybrid optimization scheme known as ‘the quantum variational eigensolver’ was developed (Peruzzo et al 2014 Nat. Commun. 5 4213) with the philosophy that even minimal quantum resources could be made useful when used in conjunction with classical routines. In this work we extend the general theory of this algorithm and suggest algorithmic improvements for practical implementations. Specifically, we develop a variational adiabatic ansatz and explore unitary coupled cluster where we establish a connection from second order unitary coupled cluster to universal gate sets through a relaxation of exponential operator splitting. We introduce the concept of quantum variational error suppression that allows some errors to be suppressed naturally in this algorithm on a pre-threshold quantum device. Additionally, we analyze truncation and correlated sampling in Hamiltonian averaging as ways to reduce the cost of this procedure. Finally, we show how the use of modern derivative free optimization techniques can offer dramatic computational savings of up to three orders of magnitude over previously used optimization techniques.

  14. The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process

    NASA Astrophysics Data System (ADS)

    Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie

    2014-05-01

    Case-based reasoning (CBR) is one of the main methods in business forecasting; it predicts well and can give explanations for its results. In business failure prediction (BFP), the number of failed enterprises is small relative to the number of non-failed ones, yet the loss when an enterprise fails is large. It is therefore necessary to develop methods, trained on imbalanced samples, that forecast well for the small proportion of failed enterprises while maintaining high total accuracy. Commonly used methods built on the assumption of balanced samples do not predict the minority class well on imbalanced samples consisting of the minority (failed) enterprises and the majority (non-failed) ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to retrieve information more efficiently from both the minority and the majority classes. In CBCBR, case classes are first generated by hierarchical clustering of the stored experienced cases, and class centres are calculated by aggregating the information of the cases within each clustered class. When predicting the label of a target case, its nearest clustered case class is retrieved by ranking the similarities between the target case and each class centre; the nearest neighbours of the target case within that class are then retrieved, and the labels of these nearest experienced cases are used for prediction. In an empirical experiment with two imbalanced samples from China, CBCBR was compared with classical CBR, a support vector machine, logistic regression and multivariate discriminant analysis. The results show that CBCBR performed significantly better than the other four methods in terms of sensitivity for identifying minority samples while maintaining high total accuracy. The proposed approach makes CBR useful for imbalanced forecasting.
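
    A minimal sketch of the retrieval scheme described above (illustrative only, with synthetic data; it omits the feature selection and evaluation details of the paper): cluster the stored cases hierarchically, keep the class centres, route a target case to its nearest centre, and vote among the nearest neighbours inside that class.

    ```python
    # Sketch of clustering-based case retrieval followed by k-nearest-neighbour voting.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    def fit_case_base(X, n_clusters=5):
        labels = fcluster(linkage(X, method='ward'), t=n_clusters, criterion='maxclust')
        centres = np.array([X[labels == c].mean(axis=0) for c in range(1, n_clusters + 1)])
        return labels, centres

    def predict(x, X, y, labels, centres, k=5):
        c = np.argmin(np.linalg.norm(centres - x, axis=1)) + 1   # nearest class centre
        idx = np.where(labels == c)[0]                           # stored cases in that class
        nearest = idx[np.argsort(np.linalg.norm(X[idx] - x, axis=1))[:k]]
        return int(round(y[nearest].mean()))                     # majority vote of retrieved cases

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 6))                                # stored cases (features)
    y = rng.integers(0, 2, size=200)                             # labels: 1 = failed, 0 = non-failed
    labels, centres = fit_case_base(X)
    print(predict(X[0], X, y, labels, centres))
    ```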

  15. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    NASA Astrophysics Data System (ADS)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and two-dimensional principal component analysis (PCA) of simple sequence repeat (SSR) markers were used to represent the genetic relationships among the Amomum tsao-ko samples. Cluster analysis clearly distinguished the sample groups: two major clusters were formed, the first (Cluster I) consisting of 34 individuals and the second (Cluster II) of 10 individuals, with Cluster I, as the main group, containing multiple sub-clusters. PCA also showed two groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of the cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  16. The kinematics of dense clusters of galaxies. II - The distribution of velocity dispersions

    NASA Technical Reports Server (NTRS)

    Zabludoff, Ann I.; Geller, Margaret J.; Huchra, John P.; Ramella, Massimo

    1993-01-01

    From the survey of 31 Abell R ≥ 1 cluster fields within z = 0.02-0.05, we extract 25 dense clusters with velocity dispersions σ above 300 km/s and with number densities exceeding the mean for the Great Wall of galaxies by one standard deviation. From the CfA Redshift Survey (in preparation), we obtain an approximately volume-limited catalog of 31 groups with velocity dispersions above 100 km/s and with the same number density limit. We combine these well-defined samples to obtain the distribution of cluster velocity dispersions. The group sample enables us to correct for incompleteness in the Abell catalog at low velocity dispersions. The clusters from the Abell cluster fields populate the high-dispersion tail. For systems with velocity dispersions above 700 km/s, approximately the median for R = 1 clusters, the group and cluster abundances are consistent. The combined distribution is consistent with cluster X-ray temperature functions.

  17. The cluster-cluster correlation function. [of galaxies

    NASA Technical Reports Server (NTRS)

    Postman, M.; Geller, M. J.; Huchra, J. P.

    1986-01-01

    The clustering properties of the Abell and Zwicky cluster catalogs are studied using the two-point angular and spatial correlation functions. The catalogs are divided into eight subsamples to determine the dependence of the correlation function on distance, richness, and the method of cluster identification. It is found that the Corona Borealis supercluster contributes significant power to the spatial correlation function of the Abell cluster sample with distance class four or less. The distance-limited catalog of 152 Abell clusters, which is not greatly affected by a single system, has a spatial correlation function consistent with the power law ξ(r) = 300 r^(-1.8). In both the distance class four or less and the distance-limited samples, the signal in the spatial correlation function is a power law detectable out to 60/h Mpc. The amplitude of ξ(r) for clusters of richness class two is about three times that for richness class one clusters. The two-point spatial correlation function is sensitive to the use of estimated redshifts.

  18. Participation of adults with visual and severe or profound intellectual disabilities: Definition and operationalization.

    PubMed

    Hanzen, Gineke; van Nispen, Ruth M A; van der Putten, Annette A J; Waninge, Aly

    2017-02-01

    The available opinions regarding participation do not appear to be applicable to adults with visual and severe or profound intellectual disabilities (VSPID). Because a clear definition and operationalization are lacking, it is difficult for support professionals to give meaning to participation for adults with VSPID. The purpose of the present study was to develop a definition and operationalization of the concept of participation of adults with VSPID. Parents or family members, professionals, and experts participated in an online concept mapping procedure. This procedure includes generating statements, clustering them, and rating their importance. The data were analyzed quantitatively using multidimensional scaling and qualitatively with triangulation. A total of 53 participants generated 319 statements of which 125 were clustered and rated. The final cluster map of the statements contained seven clusters: (1) Experience and discover; (2) Inclusion; (3) Involvement; (4) Leisure and recreation; (5) Communication and being understood; (6) Social relations; and (7) Self-management and autonomy. The average importance rating of the statements varied from 6.49 to 8.95. A definition of participation of this population was developed which included these seven clusters. The combination of the developed definition, the clusters, and the statements in these clusters, derived from the perceptions of parents or family members, professionals, and experts, can be employed to operationalize the construct of participation of adults with VSPID. This operationalization supports professionals in their ability to give meaning to participation in these adults. Future research will focus on using the operationalization as a checklist of participation for adults with VSPID. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. The effect of clustering on perceived quantity in humans (Homo sapiens) and in chicks (Gallus gallus).

    PubMed

    Bertamini, Marco; Guest, Martin; Vallortigara, Giorgio; Rugani, Rosa; Regolin, Lucia

    2018-04-30

    Animals can perceive the numerosity of sets of visual elements. Qualitative and quantitative similarities in different species suggest the existence of a shared system (approximate number system). Biases associated with sensory properties are informative about the underlying mechanisms. In humans, regular spacing increases perceived numerosity (regular-random numerosity illusion). This has led to a model that predicts numerosity based on occupancy (a measure that decreases when elements are close together). We used a procedure in which observers selected one of two stimuli and were given feedback with respect to whether the choice was correct. One configuration had 20 elements and the other 40, randomly placed inside a circular region. Participants had to discover the rule based on feedback. Because density and clustering covaried with numerosity, different dimensions could be used. After reaching a criterion, test trials presented two types of configurations with 30 elements. One type had a larger interelement distance than the other (high or low clustering). If observers had adopted a numerosity strategy, they would choose low clustering (if reinforced with 40) and high clustering (if reinforced with 20). A clustering or density strategy predicts the opposite. Human adults used a numerosity strategy. Chicks were tested using a similar procedure. There were two behavioral measures: first approach response and final circumnavigation (walking behind the screen). The prediction based on numerosity was confirmed by the first approach data. For chicks, one clear pattern from both responses was a preference for the configurations with higher clustering. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  20. Edge Principal Components and Squash Clustering: Using the Special Structure of Phylogenetic Placement Data for Sample Comparison

    PubMed Central

    Matsen IV, Frederick A.; Evans, Steven N.

    2013-01-01

    Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415

  1. Uranium hydrogeochemical and stream sediment reconnaissance of the Albuquerque NTMS Quadrangle, New Mexico, including concentrations of forty-three additional elements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maassen, L.W.; Bolivar, S.L.

    1979-06-01

    The Los Alamos Scientific Laboratory conducted a hydrogeochemical and stream sediment reconnaissance for uranium. Totals of 408 water and 1538 sediment samples were collected from 1802 locations over a 20,100-km² area at an average density of one location per 11 km². Water samples were collected from springs, wells, and streams; sediment samples were collected predominantly from streams, but also from springs. All water samples were analyzed for uranium and 12 other elements. Sediment samples were analyzed for uranium and 42 additional elements. The uranium concentrations in water samples range from below the detection limit of 0.02 ppb to 194.06 ppb. The mean uranium concentration for all water types containing <40 ppb uranium is 1.98 ppb. Six samples contained uranium concentrations >40.00 ppb. Well waters have the highest mean uranium concentration; spring waters have the lowest. Clusters of water samples that contain anomalous uranium concentrations are delineated in nine areas. Sediments collected from the quadrangle have uranium concentrations that range between 0.63 ppm and 28.52 ppm, with a mean for all sediments of 3.53 ppm. Eight areas containing clusters of sediments with anomalous uranium concentrations are delineated. One cluster contains sample locations within the Ambrosia Lake uranium district. Five clusters of sediment samples with anomalous uranium concentrations were collected from streams that drain the Jemez volcanic field. Another cluster defines an area just northeast of Albuquerque where streams drain Precambrian rocks, predominantly granites, of the Sandia Mountains. The last cluster, consisting of spring sediments from Mesa Portales, was collected near the contact of the Tertiary Ojo Alamo sandstone with the underlying Cretaceous sediments. Sediments from these springs exhibit some of the highest uranium values reported and are associated with high uranium/thorium ratios.

  2. Automatic detection of spiculation of pulmonary nodules in computed tomography images

    NASA Astrophysics Data System (ADS)

    Ciompi, F.; Jacobs, C.; Scholten, E. T.; van Riel, S. J.; W. Wille, M. M.; Prokop, M.; van Ginneken, B.

    2015-03-01

    We present a fully automatic method for the assessment of spiculation of pulmonary nodules in low-dose computed tomography (CT) images. Spiculation is considered one of the indicators of nodule malignancy and an important feature to assess in order to decide on a patient-tailored follow-up procedure. For this reason, a lung cancer screening scenario would benefit from a fully automatic system for the assessment of spiculation. The presented framework relies on the fact that spiculated nodules differ from non-spiculated ones mainly in their morphology. In order to discriminate between the two categories, information on morphology is captured by sampling intensity profiles along circular patterns on spherical surfaces centered on the nodule, in a multi-scale fashion. Each intensity profile is interpreted as a periodic signal to which the Fourier transform is applied, yielding a spectrum. A library of spectra is created by clustering the data via unsupervised learning. The centroids of the clusters are then used to label each spectrum in the sampling pattern. A compact descriptor encoding the nodule morphology is obtained as the histogram of labels over all the spherical surfaces and is used to classify spiculated nodules via supervised learning. We tested our approach on a set of nodules from the Danish Lung Cancer Screening Trial (DLCST) dataset. Our results show that the proposed method outperforms other 3-D descriptors of morphology in the automatic assessment of spiculation.
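
    The descriptor construction can be illustrated with a short sketch (not the published pipeline; the profile sampling on spherical surfaces is replaced by random placeholders): intensity profiles become Fourier magnitude spectra, a codebook is learned by unsupervised clustering, and each nodule is summarised as a normalised histogram of codebook labels.

    ```python
    # Illustrative bag-of-spectra descriptor for nodule morphology.
    import numpy as np
    from sklearn.cluster import KMeans

    def spectra(profiles):
        # profiles: (n_profiles, profile_length) intensity samples along circular patterns
        return np.abs(np.fft.rfft(profiles, axis=1))

    rng = np.random.default_rng(0)
    train_profiles = rng.normal(size=(1000, 64))     # placeholder training profiles
    codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(spectra(train_profiles))

    def nodule_descriptor(profiles, codebook):
        labels = codebook.predict(spectra(profiles))
        hist = np.bincount(labels, minlength=codebook.n_clusters)
        return hist / hist.sum()                     # normalised histogram = nodule descriptor

    print(nodule_descriptor(rng.normal(size=(200, 64)), codebook))
    ```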

  3. Feature learning and change feature classification based on deep learning for ternary change detection in SAR images

    NASA Astrophysics Data System (ADS)

    Gong, Maoguo; Yang, Hailun; Zhang, Puzhao

    2017-07-01

    Ternary change detection aims to detect changes and to group them into positive and negative changes. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar (SAR) images. In this study, a sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve the ternary change detection problem without any supervision. First, the sparse autoencoder is used to transform the log-ratio difference image into a suitable feature space for extracting key changes and suppressing outliers and noise. The learned features are then clustered into three classes, which are taken as pseudo labels for training a CNN model as the change feature classifier. Reliable training samples for the CNN are selected from the feature maps learned by the sparse autoencoder according to certain selection rules. Given the training samples and the corresponding pseudo labels, the CNN model is trained by back propagation with stochastic gradient descent. During training, the CNN is driven to learn the concept of change, and a more powerful model is established to distinguish different types of changes. Unlike traditional methods, the proposed framework integrates the merits of the sparse autoencoder and the CNN to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.
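
    A sketch of just the pseudo-labelling step (the sparse autoencoder and the CNN classifier are omitted, and the input images are synthetic): pixels of the log-ratio difference image are clustered into three classes and re-ordered by mean log-ratio so that the labels read as negative change, no change and positive change.

    ```python
    # Illustrative pseudo-label generation for ternary change detection.
    import numpy as np
    from sklearn.cluster import KMeans

    def pseudo_labels(img1, img2, eps=1e-6):
        log_ratio = np.log((img1 + eps) / (img2 + eps))         # difference image
        flat = log_ratio.reshape(-1, 1)
        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(flat)
        # order clusters by mean log-ratio: 0 = negative change, 1 = unchanged, 2 = positive change
        order = np.argsort(km.cluster_centers_.ravel())
        remap = np.empty(3, dtype=int)
        remap[order] = np.arange(3)
        return remap[km.labels_].reshape(log_ratio.shape)

    rng = np.random.default_rng(0)
    labels = pseudo_labels(rng.gamma(2.0, size=(64, 64)), rng.gamma(2.0, size=(64, 64)))
    print(np.bincount(labels.ravel()))
    ```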

  4. Correlates of comorbid depression, anxiety and helplessness with obsessive-compulsive disorder in Chinese adolescents.

    PubMed

    Sun, Jing; Li, Zhanjiang; Buys, Nicholas; Storch, Eric A

    2015-03-15

    Youth with obsessive-compulsive disorder (OCD) are at risk of experiencing comorbid psychiatric conditions, such as depression and anxiety. Studies of Chinese adolescents with OCD are limited. The aim of this study was to investigate the association of depression, anxiety, and helplessness with the occurrence of OCD in Chinese adolescents. The study consisted of two stages. The first stage used a cross-sectional design involving a stratified clustered non-clinical sample of 3174 secondary school students. A clinical interview procedure was then employed to diagnose OCD in students who had a Leyton 'yes' score of 15 or above. The second stage used a case-control design to examine the relationship of OCD to depression, anxiety and helplessness in a matched sample of 288 adolescents with clinically diagnosed OCD and 246 students without OCD. Helplessness, depression and anxiety scores were directly associated with the probability of OCD caseness. Canonical correlation analysis indicated that OCD correlated significantly with depression, anxiety, and helplessness. Cluster analysis further indicated that the severity of OCD is also associated with the severity of depression and anxiety and with the level of helplessness. These findings suggest that depression, anxiety and helplessness are important correlates of OCD in Chinese adolescents. Future studies using longitudinal and prospective designs are required to confirm these relationships as causal. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. Measuring Vocational Preferences: Ranking versus Categorical Rating Procedures.

    ERIC Educational Resources Information Center

    Carifio, James

    1978-01-01

    Describes a study comparing the relative validities of ranking versus categorical rating procedures for obtaining student vocational preference data in exploratory program assignment situations. Students indicated their vocational program preferences from career clusters, and the frequency of wrong assignments made by each method was analyzed. (MF)

  6. Inference from clustering with application to gene-expression microarrays.

    PubMed

    Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M

    2002-01-01

    There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
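
    A minimal sketch of the error-estimation idea (not the toolbox itself): draw points from known class means plus noise, cluster them, and count the points assigned to the wrong cluster under the best matching of cluster labels to generating classes.

    ```python
    # Illustrative clustering-error simulation for a mean-plus-noise model.
    import numpy as np
    from itertools import permutations
    from sklearn.cluster import KMeans

    def clustering_error(means, sigma, n_per_class, seed=0):
        rng = np.random.default_rng(seed)
        means = np.asarray(means, dtype=float)
        X = np.vstack([m + sigma * rng.normal(size=(n_per_class, means.shape[1])) for m in means])
        truth = np.repeat(np.arange(len(means)), n_per_class)
        pred = KMeans(n_clusters=len(means), n_init=10, random_state=seed).fit_predict(X)
        # cluster labels are arbitrary, so score the best relabelling of clusters to classes
        errors = min(np.sum(np.array(perm)[pred] != truth)
                     for perm in permutations(range(len(means))))
        return errors / len(truth)

    print(clustering_error([[0, 0], [3, 3]], sigma=1.0, n_per_class=100))
    ```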

  7. Clustered lot quality assurance sampling: a tool to monitor immunization coverage rapidly during a national yellow fever and polio vaccination campaign in Cameroon, May 2009.

    PubMed

    Pezzoli, L; Tchio, R; Dzossa, A D; Ndjomo, S; Takeu, A; Anya, B; Ticha, J; Ronveaux, O; Lewis, R F

    2012-01-01

    We used the clustered lot quality assurance sampling (clustered-LQAS) technique to identify districts with low immunization coverage and guide mop-up actions during the last 4 days of a combined oral polio vaccine (OPV) and yellow fever (YF) vaccination campaign conducted in Cameroon in May 2009. We monitored 17 pre-selected districts at risk for low coverage. We designed LQAS plans to reject districts with YF vaccination coverage <90% and with OPV coverage <95%. In each lot the sample size was 50 (five clusters of 10) with decision values of 3 for assessing OPV and 7 for YF coverage. We 'rejected' 10 districts for low YF coverage and 14 for low OPV coverage. Hence we recommended a 2-day extension of the campaign. Clustered-LQAS proved to be useful in guiding the campaign vaccination strategy before the completion of the operations.
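
    The lot decision rule is simple enough to state as code; the plan parameters below follow the abstract, while the field counts are hypothetical.

    ```python
    # Sketch of the clustered-LQAS decision rule: 5 clusters of 10 people per district,
    # reject the district if the number of unvaccinated people exceeds the decision value.
    def lqas_decision(unvaccinated_per_cluster, decision_value):
        failures = sum(unvaccinated_per_cluster)      # total unvaccinated in the lot of 50
        return "reject (mop-up needed)" if failures > decision_value else "accept"

    district = [1, 0, 3, 2, 2]                        # hypothetical counts in 5 clusters of 10
    print("OPV:", lqas_decision(district, decision_value=3))   # OPV target coverage >= 95%
    print("YF: ", lqas_decision(district, decision_value=7))   # YF target coverage >= 90%
    ```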

  8. The evolution in the stellar mass of brightest cluster galaxies over the past 10 billion years

    NASA Astrophysics Data System (ADS)

    Bellstedt, Sabine; Lidman, Chris; Muzzin, Adam; Franx, Marijn; Guatelli, Susanna; Hill, Allison R.; Hoekstra, Henk; Kurinsky, Noah; Labbe, Ivo; Marchesini, Danilo; Marsan, Z. Cemile; Safavi-Naeini, Mitra; Sifón, Cristóbal; Stefanon, Mauro; van de Sande, Jesse; van Dokkum, Pieter; Weigel, Catherine

    2016-08-01

    Using a sample of 98 galaxy clusters recently imaged in the near-infrared with the European Southern Observatory (ESO) New Technology Telescope, WIYN telescope and William Herschel Telescope, supplemented with 33 clusters from the ESO archive, we measure how the stellar mass of the most massive galaxies in the universe, namely brightest cluster galaxies (BCGs), increases with time. Most of the BCGs in this new sample lie in the redshift range 0.2 < z < 0.6, which has been noted in recent works to mark an epoch over which the growth in the stellar mass of BCGs stalls. From this sample of 132 clusters, we create a subsample of 102 systems that includes only those clusters that have estimates of the cluster mass. We combine the BCGs in this subsample with BCGs from the literature, and find that the growth in stellar mass of BCGs from 10 billion years ago to the present epoch is broadly consistent with recent semi-analytic and semi-empirical models. As in other recent studies, tentative evidence indicates that the stellar mass growth rate of BCGs may be slowing in the past 3.5 billion years. Further work in collecting larger samples, and in better comparing observations with theory using mock images, is required if a more detailed comparison between the models and the data is to be made.

  9. VizieR Online Data Catalog: Star clusters distances and extinctions. II. (Buckner+, 2014)

    NASA Astrophysics Data System (ADS)

    Buckner, A. S. M.; Froebrich, D.

    2015-04-01

    Until now, it has been impossible to observationally measure how star cluster scaleheight evolves beyond 1Gyr as only small samples have been available. Here, we establish a novel method to determine the scaleheight of a cluster sample using modelled distributions and Kolmogorov-Smirnov tests. This allows us to determine the scaleheight with a 25% accuracy for samples of 38 clusters or more. We apply our method to investigate the temporal evolution of cluster scaleheight, using homogeneously selected sub-samples of Kharchenko et al. (MWSC, 2012, Cat. J/A+A/543/A156, 2013, J/A+A/558/A53 ), Dias et al. (DAML02, 2002A&A...389..871D, Cat. B/ocl), WEBDA, and Froebrich et al. (FSR, 2007MNRAS.374..399F, Cat. J/MNRAS/374/399). We identify a linear relationship between scaleheight and log(age/yr) of clusters, considerably different from field stars. The scaleheight increases from about 40pc at 1Myr to 75pc at 1Gyr, most likely due to internal evolution and external scattering events. After 1Gyr, there is a marked change of the behaviour, with the scaleheight linearly increasing with log(age/yr) to about 550pc at 3.5Gyr. The most likely interpretation is that the surviving clusters are only observable because they have been scattered away from the mid-plane in their past. A detailed understanding of this observational evidence can only be achieved with numerical simulations of the evolution of cluster samples in the Galactic disc. Furthermore, we find a weak trend of an age-independent increase in scaleheight with Galactocentric distance. There are no significant temporal or spatial variations of the cluster distribution zero-point. We determine the Sun's vertical displacement from the Galactic plane as Z⊙=18.5+/-1.2pc. (1 data file).
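
    One way to picture the fitting idea (illustrative only, not the authors' pipeline, and with synthetic cluster heights): compare the observed distribution of |z| with exponential models of different scaleheights and keep the scaleheight with the smallest Kolmogorov-Smirnov statistic.

    ```python
    # Sketch: fit a scaleheight by minimising the KS statistic against exponential models.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    abs_z = rng.exponential(scale=75.0, size=38)    # placeholder |z| values (pc) for 38 clusters

    scaleheights = np.arange(20.0, 200.0, 1.0)
    ks = [stats.kstest(abs_z, stats.expon(scale=h).cdf).statistic for h in scaleheights]
    print("best-fitting scaleheight ~", scaleheights[int(np.argmin(ks))], "pc")
    ```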

  10. Massive and refined: A sample of large galaxy clusters simulated at high resolution. I: Thermal gas and properties of shock waves

    NASA Astrophysics Data System (ADS)

    Vazza, F.; Brunetti, G.; Gheller, C.; Brunino, R.

    2010-11-01

    We present a sample of 20 massive galaxy clusters with total virial masses in the range 6 × 10^14 M⊙ ≤ Mvir ≤ 2 × 10^15 M⊙, re-simulated with a customized version of the ENZO 1.5 code employing adaptive mesh refinement. This technique allowed us to obtain unprecedented spatial resolution (≈25 kpc/h) out to a distance of ~3 virial radii from the cluster centers, and makes it possible to focus with the same level of detail on the physical properties of the innermost and outermost cluster regions, providing new clues on the role of shock waves and turbulent motions in the ICM across a wide range of scales. In this paper, a first exploratory study of this data set is presented. We report on the thermal properties of the galaxy clusters at z = 0. Integrated and morphological properties of the gas density, gas temperature, gas entropy and baryon fraction distributions are discussed and compared with existing results from both the observational and the numerical literature. Our cluster sample shows overall good consistency with the results obtained using other numerical techniques (e.g. smoothed particle hydrodynamics), yet it provides a more accurate representation of the accretion patterns far outside the cluster cores. We also reconstruct the properties of shock waves within the sample by means of a velocity-based approach, and we study the Mach number and energy distributions for the various dynamical states of the clusters, giving estimates for the injection of cosmic-ray particles at shocks. The present sample is rather unique in the panorama of cosmological simulations of massive galaxy clusters, due to its dynamical range, statistics of objects and number of time outputs. For this reason, we deploy a public repository of the available data, accessible via a web portal at http://data.cineca.it.

  11. Technique for fast and efficient hierarchical clustering

    DOEpatents

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.
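
    An illustrative sketch loosely following these steps, not the patented implementation: PCA stands in for the compression step, single-linkage merging plays the role of the nearest-neighbour pairing, and a dendrogram renders the hierarchy.

    ```python
    # Sketch: compress, cluster hierarchically, and render the hierarchy.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA
    from scipy.cluster.hierarchy import linkage, dendrogram

    rng = np.random.default_rng(0)
    samples = rng.normal(size=(40, 500))            # 40 samples, 500 variables each

    compressed = PCA(n_components=10).fit_transform(samples)   # reduce the number of variables
    tree = linkage(compressed, method='single')     # single linkage merges nearest-neighbour pairs first

    dendrogram(tree)                                # display similarities/differences between samples
    plt.xlabel("sample index"); plt.ylabel("distance")
    plt.show()
    ```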

  12. (GTG)5 MSP-PCR fingerprinting as a technique for discrimination of wine associated yeasts?

    PubMed

    Ramírez-Castrillón, Mauricio; Mendes, Sandra Denise Camargo; Inostroza-Ponta, Mario; Valente, Patricia

    2014-01-01

    In microbiology, identification of all isolates by sequencing is still unfeasible in small research laboratories. Therefore, many yeast diversity studies follow a screening procedure consisting of clustering the yeast isolates using MSP-PCR fingerprinting, followed by identification of one or a few selected representatives of each cluster by sequencing. Although this procedure has been widely applied in the literature, it has not been properly validated. We evaluated a standardized protocol using MSP-PCR fingerprinting with the primers (GTG)5 and M13 for the discrimination of wine associated yeasts in South Brazil. Two datasets were used: yeasts isolated from bottled wines and vineyard environments. We compared the discriminatory power of both primers in a subset of 16 strains, choosing the primer (GTG)5 for further evaluation. Afterwards, we applied this technique to 245 strains, and compared the results with the identification obtained by partial sequencing of the LSU rRNA gene, considered as the gold standard. An array matrix was constructed for each dataset and used as input for clustering with two methods (hierarchical dendrograms and QAPGrid layout). For both yeast datasets, unrelated species were clustered in the same group. The sensitivity score of (GTG)5 MSP-PCR fingerprinting was high, but specificity was low. In conclusion, the yeast diversity inferred in several previous studies may have been underestimated, and some isolates were probably misidentified due to compliance with this screening procedure.
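
    One way to quantify such agreement (a sketch, not the authors' script, with hypothetical strain labels): treat "placed in the same fingerprint cluster" as a test for "same species by LSU sequencing" over all strain pairs, and compute pairwise sensitivity and specificity.

    ```python
    # Sketch: pairwise sensitivity/specificity of fingerprint clusters against sequence IDs.
    from itertools import combinations

    def pairwise_sensitivity_specificity(cluster_labels, species_labels):
        tp = fp = tn = fn = 0
        for i, j in combinations(range(len(cluster_labels)), 2):
            same_cluster = cluster_labels[i] == cluster_labels[j]
            same_species = species_labels[i] == species_labels[j]
            if same_cluster and same_species:
                tp += 1
            elif same_cluster and not same_species:
                fp += 1
            elif not same_cluster and same_species:
                fn += 1
            else:
                tn += 1
        return tp / (tp + fn), tn / (tn + fp)

    # Hypothetical labels for six strains: fingerprint clusters vs. LSU identifications.
    clusters = ["A", "A", "A", "B", "B", "C"]
    species = ["S. cerevisiae", "S. cerevisiae", "H. uvarum",
               "H. uvarum", "P. kudriavzevii", "P. kudriavzevii"]
    print(pairwise_sensitivity_specificity(clusters, species))
    ```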

  13. (GTG)5 MSP-PCR Fingerprinting as a Technique for Discrimination of Wine Associated Yeasts?

    PubMed Central

    Inostroza-Ponta, Mario; Valente, Patricia

    2014-01-01

    In microbiology, identification of all isolates by sequencing is still unfeasible in small research laboratories. Therefore, many yeast diversity studies follow a screening procedure consisting of clustering the yeast isolates using MSP-PCR fingerprinting, followed by identification of one or a few selected representatives of each cluster by sequencing. Although this procedure has been widely applied in the literature, it has not been properly validated. We evaluated a standardized protocol using MSP-PCR fingerprinting with the primers (GTG)5 and M13 for the discrimination of wine associated yeasts in South Brazil. Two datasets were used: yeasts isolated from bottled wines and vineyard environments. We compared the discriminatory power of both primers in a subset of 16 strains, choosing the primer (GTG)5 for further evaluation. Afterwards, we applied this technique to 245 strains, and compared the results with the identification obtained by partial sequencing of the LSU rRNA gene, considered as the gold standard. An array matrix was constructed for each dataset and used as input for clustering with two methods (hierarchical dendrograms and QAPGrid layout). For both yeast datasets, unrelated species were clustered in the same group. The sensitivity score of (GTG)5 MSP-PCR fingerprinting was high, but specificity was low. As a conclusion, the yeast diversity inferred in several previous studies may have been underestimated and some isolates were probably misidentified due to the compliance to this screening procedure. PMID:25171185

  14. Cluster Analysis of the Yale Global Tic Severity Scale (YGTSS): Symptom Dimensions and Clinical Correlates in an Outpatient Youth Sample

    ERIC Educational Resources Information Center

    Kircanski, Katharina; Woods, Douglas W.; Chang, Susanna W.; Ricketts, Emily J.; Piacentini, John C.

    2010-01-01

    Tic disorders are heterogeneous, with symptoms varying widely both within and across patients. Exploration of symptom clusters may aid in the identification of symptom dimensions of empirical and treatment import. This article presents the results of two studies investigating tic symptom clusters using a sample of 99 youth (M age = 10.7, 81% male,…

  15. Relations between the Woodcock-Johnson III Clinical Clusters and Measures of Executive Functions from the Delis-Kaplan Executive Function System

    ERIC Educational Resources Information Center

    Floyd, Randy G.; McCormack, Allison C.; Ingram, Elizabeth L.; Davis, Amy E.; Bergeron, Renee; Hamilton, Gloria

    2006-01-01

    This study examined the convergent relations between scores from four clinical clusters from the Woodcock-Johnson III Tests of Cognitive Abilities (WJ III) and measures of executive functions using a sample of school-aged children and a sample of adults. The WJ III clinical clusters included the Working Memory, Cognitive Fluency, Broad Attention,…

  16. Liver Gene Expression Profiles of Rats Treated with Clofibric Acid

    PubMed Central

    Michel, Cécile; Desdouets, Chantal; Sacre-Salem, Béatrice; Gautier, Jean-Charles; Roberts, Ruth; Boitier, Eric

    2003-01-01

    Clofibric acid (CLO) is a peroxisome proliferator (PP) that acts through the peroxisome proliferator activated receptor α, leading to hepatocarcinogenesis in rodents. CLO-induced hepatocarcinogenesis is a multi-step process, first transforming normal liver cells into foci. The combination of laser capture microdissection (LCM) and genomics has the potential to provide expression profiles from such small cell clusters, giving an opportunity to understand the process of cancer development in response to PPs. To our knowledge, this is the first evaluation of the impact of the successive steps of LCM procedure on gene expression profiling by comparing profiles from LCM samples to those obtained with non-microdissected liver samples collected after a 1 month CLO treatment in the rat. We showed that hematoxylin and eosin (H&E) staining and laser microdissection itself do not impact on RNA quality. However, the overall process of the LCM procedure affects the RNA quality, resulting in a bias in the gene profiles. Nonetheless, this bias did not prevent accurate determination of a CLO-specific molecular signature. Thus, gene-profiling analysis of microdissected foci, identified by H&E staining may provide insight into the mechanisms underlying non-genotoxic hepatocarcinogenesis in the rat by allowing identification of specific genes that are regulated by CLO in early pre-neoplastic foci. PMID:14633594

  17. Liver gene expression profiles of rats treated with clofibric acid: comparison of whole liver and laser capture microdissected liver.

    PubMed

    Michel, Cécile; Desdouets, Chantal; Sacre-Salem, Béatrice; Gautier, Jean-Charles; Roberts, Ruth; Boitier, Eric

    2003-12-01

    Clofibric acid (CLO) is a peroxisome proliferator (PP) that acts through the peroxisome proliferator activated receptor alpha, leading to hepatocarcinogenesis in rodents. CLO-induced hepatocarcinogenesis is a multi-step process, first transforming normal liver cells into foci. The combination of laser capture microdissection (LCM) and genomics has the potential to provide expression profiles from such small cell clusters, giving an opportunity to understand the process of cancer development in response to PPs. To our knowledge, this is the first evaluation of the impact of the successive steps of LCM procedure on gene expression profiling by comparing profiles from LCM samples to those obtained with non-microdissected liver samples collected after a 1 month CLO treatment in the rat. We showed that hematoxylin and eosin (H&E) staining and laser microdissection itself do not impact on RNA quality. However, the overall process of the LCM procedure affects the RNA quality, resulting in a bias in the gene profiles. Nonetheless, this bias did not prevent accurate determination of a CLO-specific molecular signature. Thus, gene-profiling analysis of microdissected foci, identified by H&E staining may provide insight into the mechanisms underlying non-genotoxic hepatocarcinogenesis in the rat by allowing identification of specific genes that are regulated by CLO in early pre-neoplastic foci.

  18. A closer look at cross-validation for assessing the accuracy of gene regulatory networks and models.

    PubMed

    Tabe-Bordbar, Shayan; Emad, Amin; Zhao, Sihai Dave; Sinha, Saurabh

    2018-04-26

    Cross-validation (CV) is a technique to assess the generalizability of a model to unseen data. The technique relies on assumptions that may not be satisfied when studying genomics datasets. For example, random CV (RCV) assumes that a randomly selected set of samples, the test set, represents unseen data well. This assumption does not hold when samples are obtained from different experimental conditions and the goal is to learn regulatory relationships among genes that generalize beyond the observed conditions. In this study, we investigated how the CV procedure affects the assessment of supervised learning methods used to learn gene regulatory networks (or used in other applications). We compared the performance of a regression-based method for gene expression prediction estimated using RCV with that estimated using a clustering-based CV (CCV) procedure. Our analysis illustrates that RCV can produce over-optimistic estimates of the model's generalizability compared to CCV. Next, we defined the 'distinctness' of the test set from the training set and showed that this measure is predictive of the performance of the regression method. Finally, we introduced a simulated annealing method to construct partitions with gradually increasing distinctness and showed that the performance of different gene expression prediction methods can be better evaluated using this method.
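
    A minimal sketch of the contrast between random and clustering-based CV (illustrative data and model, not the authors' code): hold out whole k-means clusters of the feature space with GroupKFold, so that test conditions are more distinct from training conditions than under random splits.

    ```python
    # Sketch: random CV vs. clustering-based CV for a simple expression-prediction model.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold, GroupKFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 20))              # e.g. regulator expression across conditions
    y = X[:, 0] * 2.0 + rng.normal(size=300)    # target gene expression (toy model)

    rcv = cross_val_score(Ridge(), X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0))

    groups = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
    ccv = cross_val_score(Ridge(), X, y, cv=GroupKFold(n_splits=5), groups=groups)

    print("random CV R^2:", rcv.mean(), " clustering-based CV R^2:", ccv.mean())
    ```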

  19. Task Analysis for Health Occupations. Cluster: Medical Assisting. Occupation: Medical Assistant. Education for Employment Task Lists.

    ERIC Educational Resources Information Center

    Lathrop, Janice

    Task analyses are provided for two duty areas for the occupation of medical assistant in the medical assisting cluster. Five tasks for the duty area "providing therapeutic measures" are as follows: assist with dressing change, apply clean dressing, apply elastic bandage, assist physician in therapeutic procedure, and apply topical…

  20. A Preliminary Comparison of the Effectiveness of Cluster Analysis Weighting Procedures for Within-Group Covariance Structure.

    ERIC Educational Resources Information Center

    Donoghue, John R.

    A Monte Carlo study compared the usefulness of six variable weighting methods for cluster analysis. Data were 100 bivariate observations from 2 subgroups, generated according to a finite normal mixture model. Subgroup size, within-group correlation, within-group variance, and distance between subgroup centroids were manipulated. Of the clustering…

  1. HLM in Cluster-Randomised Trials--Measuring Efficacy across Diverse Populations of Learners

    ERIC Educational Resources Information Center

    Hegedus, Stephen; Tapper, John; Dalton, Sara; Sloane, Finbarr

    2013-01-01

    We describe the application of Hierarchical Linear Modelling (HLM) in a cluster-randomised study to examine learning algebraic concepts and procedures in an innovative, technology-rich environment in the US. HLM is applied to measure the impact of such treatment on learning and on contextual variables. We provide a detailed description of such…
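
    A hedged sketch of the kind of two-level model HLM fits in a cluster-randomised trial (illustrative only; the file and column names are hypothetical): a random intercept for classroom absorbs the clustering, and the treatment effect enters at the cluster level.

    ```python
    # Sketch: a random-intercept mixed model for cluster-randomised outcome data.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("algebra_scores.csv")   # hypothetical columns: posttest, pretest, treatment, classroom
    model = smf.mixedlm("posttest ~ pretest + treatment", data=df, groups=df["classroom"])
    result = model.fit()
    print(result.summary())                  # the 'treatment' coefficient is the cluster-level effect
    ```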

  2. Early Results from Swift AGN and Cluster Survey

    NASA Astrophysics Data System (ADS)

    Dai, Xinyu; Griffin, Rhiannon; Nugent, Jenna; Kochanek, Christopher S.; Bregman, Joel N.

    2016-04-01

    The Swift AGN and Cluster Survey (SACS) uses 125 deg^2 of Swift X-ray Telescope serendipitous fields with variable depths surrounding gamma-ray bursts to provide a medium depth (4 × 10^-15 erg cm^-2 s^-1) and area survey filling the gap between deep, narrow Chandra/XMM-Newton surveys and wide, shallow ROSAT surveys. Here, we present the first two papers in a series of publications for SACS. In the first paper, we introduce our method and catalog of 22,563 point sources and 442 extended sources. SACS provides excellent constraints on the AGN and cluster number counts at the bright end with negligible uncertainties due to cosmic variance, and these constraints are consistent with previous measurements. The depth and areal coverage of SACS is well suited for galaxy cluster surveys outside the local universe, reaching z > 1 for massive clusters. In the second paper, we use SDSS DR8 data to study the 203 extended SACS sources that are located within the SDSS footprint. We search for galaxy over-densities in 3-D space using SDSS galaxies and their photometric redshifts near the Swift galaxy cluster candidates. We find 103 Swift clusters with a > 3σ over-density. The remaining targets are potentially located at higher redshifts and require deeper optical follow-up observations for confirmations as galaxy clusters. We present a series of cluster properties including the redshift, BCG magnitude, BCG-to-X-ray center offset, optical richness, X-ray luminosity and red sequences. We compare the observed redshift distribution of the sample with a theoretical model, and find that our sample is complete for z ≤ 0.3 and 80% complete for z ≤ 0.4, consistent with the survey depth of SDSS. These analysis results suggest that our Swift cluster selection algorithm presented in our first paper has yielded a statistically well-defined cluster sample for further studying cluster evolution and cosmology. In the end, we will discuss our ongoing optical identification of z>0.5 cluster sample, using MDM, KPNO, CTIO, and Magellan data, and discuss SACS as a pilot for eROSITA deep surveys.

  3. Do X-ray dark or underluminous galaxy clusters exist?

    NASA Astrophysics Data System (ADS)

    Andreon, S.; Moretti, A.

    2011-12-01

    We study the X-ray properties of a color-selected sample of clusters at 0.1 < z < 0.3 to quantify the real abundance of X-ray dark or underluminous clusters and, at the same time, the spurious-detection contamination level of color-selected cluster catalogs. Starting from a local sample of color-selected clusters, we restrict our attention to those with X-ray observations deep enough to probe their X-ray luminosity down to very faint values without introducing any X-ray bias. This gives an X-ray-unbiased sample of 33 clusters with which to measure the LX-richness relation. Swift 1.4 Ms X-ray observations show that at least 89% of the color-detected clusters are real objects with a potential well deep enough to heat and retain an intracluster medium. The percentage rises to 94% when one includes the single spectroscopically confirmed color-selected cluster whose X-ray emission is not secured. Looked at from the opposite perspective, the percentage of X-ray dark clusters among color-selected clusters is very low: at most about 11 per cent (at 90% confidence). Supplementing our data with those from the literature, we conclude that X-ray- and color-selected cluster surveys sample the same population, and consequently that in this regard clusters selected with either method can safely be used for cosmological purposes. This is an essential and promising piece of information for upcoming surveys in both the optical/IR (DES, EUCLID) and the X-ray (eROSITA). Richness correlates with X-ray luminosity with a large scatter, 0.51 ± 0.08 (0.44 ± 0.07) dex in lg LX at a given richness, when LX is measured in a 500 (1070) kpc aperture. We release data and software to estimate the X-ray flux, or its upper limit, of a source with over-Poisson background fluctuations (found in this work to be ~20% on cluster angular scales) and to fit X-ray luminosity versus richness in the presence of intrinsic scatter. These Bayesian applications rigorously account for boundaries (e.g., the X-ray luminosity and the richness cannot be negative).

  4. Cluster-sample surveys and lot quality assurance sampling to evaluate yellow fever immunisation coverage following a national campaign, Bolivia, 2007.

    PubMed

    Pezzoli, Lorenzo; Pineda, Silvia; Halkyer, Percy; Crespo, Gladys; Andrews, Nick; Ronveaux, Olivier

    2009-03-01

    To estimate the yellow fever (YF) vaccine coverage for the endemic and non-endemic areas of Bolivia and to determine whether selected districts had acceptable levels of coverage (>70%). We conducted two surveys of 600 individuals (25 x 12 clusters) to estimate coverage in the endemic and non-endemic areas. We assessed 11 districts using lot quality assurance sampling (LQAS). The lot (district) sample was 35 individuals with six as decision value (alpha error 6% if true coverage 70%; beta error 6% if true coverage 90%). To increase feasibility, we divided the lots into five clusters of seven individuals; to investigate the effect of clustering, we calculated alpha and beta by conducting simulations where each cluster's true coverage was sampled from a normal distribution with a mean of 70% or 90% and standard deviations of 5% or 10%. Estimated coverage was 84.3% (95% CI: 78.9-89.7) in endemic areas, 86.8% (82.5-91.0) in non-endemic and 86.0% (82.8-89.1) nationally. LQAS showed that four lots had unacceptable coverage levels. In six lots, results were inconsistent with the estimated administrative coverage. The simulations suggested that the effect of clustering the lots is unlikely to have significantly increased the risk of making incorrect accept/reject decisions. Estimated YF coverage was high. Discrepancies between administrative coverage and LQAS results may be due to incorrect population data. Even allowing for clustering in LQAS, the statistical errors would remain low. Catch-up campaigns are recommended in districts with unacceptable coverage.
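
    The clustering simulation can be sketched as follows (plan parameters follow the abstract; the implementation is illustrative): draw each cluster's true coverage from a normal distribution, sample five clusters of seven people, and estimate how often the rule "reject the lot if more than 6 of 35 are unvaccinated" misclassifies lots at 70% and 90% coverage.

    ```python
    # Sketch: effect of clustering the lot sample on LQAS accept/reject error rates.
    import numpy as np

    def rejection_rate(mean_cov, sd_cov, n_sim=100_000, clusters=5, per_cluster=7, d=6, seed=0):
        rng = np.random.default_rng(seed)
        cov = np.clip(rng.normal(mean_cov, sd_cov, size=(n_sim, clusters)), 0.0, 1.0)
        unvaccinated = rng.binomial(per_cluster, 1.0 - cov).sum(axis=1)
        return (unvaccinated > d).mean()

    # Misclassifying a ~70% lot means accepting it; misclassifying a ~90% lot means rejecting it.
    print("P(accept | true coverage ~70%):", 1 - rejection_rate(0.70, 0.05))
    print("P(reject | true coverage ~90%):", rejection_rate(0.90, 0.05))
    ```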

  5. Dark Energy Survey Year 1 Results: redshift distributions of the weak-lensing source galaxies

    NASA Astrophysics Data System (ADS)

    Hoyle, B.; Gruen, D.; Bernstein, G. M.; Rau, M. M.; De Vicente, J.; Hartley, W. G.; Gaztanaga, E.; DeRose, J.; Troxel, M. A.; Davis, C.; Alarcon, A.; MacCrann, N.; Prat, J.; Sánchez, C.; Sheldon, E.; Wechsler, R. H.; Asorey, J.; Becker, M. R.; Bonnett, C.; Carnero Rosell, A.; Carollo, D.; Carrasco Kind, M.; Castander, F. J.; Cawthon, R.; Chang, C.; Childress, M.; Davis, T. M.; Drlica-Wagner, A.; Gatti, M.; Glazebrook, K.; Gschwend, J.; Hinton, S. R.; Hoormann, J. K.; Kim, A. G.; King, A.; Kuehn, K.; Lewis, G.; Lidman, C.; Lin, H.; Macaulay, E.; Maia, M. A. G.; Martini, P.; Mudd, D.; Möller, A.; Nichol, R. C.; Ogando, R. L. C.; Rollins, R. P.; Roodman, A.; Ross, A. J.; Rozo, E.; Rykoff, E. S.; Samuroff, S.; Sevilla-Noarbe, I.; Sharp, R.; Sommer, N. E.; Tucker, B. E.; Uddin, S. A.; Varga, T. N.; Vielzeuf, P.; Yuan, F.; Zhang, B.; Abbott, T. M. C.; Abdalla, F. B.; Allam, S.; Annis, J.; Bechtol, K.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Busha, M. T.; Capozzi, D.; Carretero, J.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; DePoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Eifler, T. F.; Estrada, J.; Evrard, A. E.; Fernandez, E.; Flaugher, B.; Fosalba, P.; Frieman, J.; García-Bellido, J.; Gerdes, D. W.; Giannantonio, T.; Goldstein, D. A.; Gruendl, R. A.; Gutierrez, G.; Honscheid, K.; James, D. J.; Jarvis, M.; Jeltema, T.; Johnson, M. W. G.; Johnson, M. D.; Kirk, D.; Krause, E.; Kuhlmann, S.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lima, M.; March, M.; Marshall, J. L.; Melchior, P.; Menanteau, F.; Miquel, R.; Nord, B.; O'Neill, C. R.; Plazas, A. A.; Romer, A. K.; Sako, M.; Sanchez, E.; Santiago, B.; Scarpine, V.; Schindler, R.; Schubnell, M.; Smith, M.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Tucker, D. L.; Vikram, V.; Walker, A. R.; Weller, J.; Wester, W.; Wolf, R. C.; Yanny, B.; Zuntz, J.

    2018-07-01

    We describe the derivation and validation of redshift distribution estimates and their uncertainties for the populations of galaxies used as weak-lensing sources in the Dark Energy Survey (DES) Year 1 cosmological analyses. The Bayesian Photometric Redshift (BPZ) code is used to assign galaxies to four redshift bins between z ≈ 0.2 and ≈1.3, and to produce initial estimates of the lensing-weighted redshift distributions n^i_PZ(z)∝ dn^i/dz for members of bin i. Accurate determination of cosmological parameters depends critically on knowledge of ni, but is insensitive to bin assignments or redshift errors for individual galaxies. The cosmological analyses allow for shifts n^i(z)=n^i_PZ(z-Δ z^i) to correct the mean redshift of ni(z) for biases in n^i_PZ. The Δzi are constrained by comparison of independently estimated 30-band photometric redshifts of galaxies in the Cosmic Evolution Survey (COSMOS) field to BPZ estimates made from the DES griz fluxes, for a sample matched in fluxes, pre-seeing size, and lensing weight to the DES weak-lensing sources. In companion papers, the Δzi of the three lowest redshift bins are further constrained by the angular clustering of the source galaxies around red galaxies with secure photometric redshifts at 0.15 < z < 0.9. This paper details the BPZ and COSMOS procedures, and demonstrates that the cosmological inference is insensitive to details of the ni(z) beyond the choice of Δzi. The clustering and COSMOS validation methods produce consistent estimates of Δzi in the bins where both can be applied, with combined uncertainties of σ_{Δ z^i}=0.015, 0.013, 0.011, and 0.022 in the four bins. Repeating the photo-z procedure instead using the Directional Neighbourhood Fitting algorithm, or using the ni(z) estimated from the matched sample in COSMOS, yields no discernible difference in cosmological inferences.

  6. Photoionization cross section by Stieltjes imaging applied to coupled cluster Lanczos pseudo-spectra

    NASA Astrophysics Data System (ADS)

    Cukras, Janusz; Coriani, Sonia; Decleva, Piero; Christiansen, Ove; Norman, Patrick

    2013-09-01

    A recently implemented asymmetric Lanczos algorithm for computing (complex) linear response functions within the coupled cluster singles (CCS), coupled cluster singles and iterative approximate doubles (CC2), and coupled cluster singles and doubles (CCSD) is coupled to a Stieltjes imaging technique in order to describe the photoionization cross section of atoms and molecules, in the spirit of a similar procedure recently proposed by Averbukh and co-workers within the Algebraic Diagrammatic Construction approach. Pilot results are reported for the atoms He, Ne, and Ar and for the molecules H2, H2O, NH3, HF, CO, and CO2.

  7. Photoionization cross section by Stieltjes imaging applied to coupled cluster Lanczos pseudo-spectra.

    PubMed

    Cukras, Janusz; Coriani, Sonia; Decleva, Piero; Christiansen, Ove; Norman, Patrick

    2013-09-07

    A recently implemented asymmetric Lanczos algorithm for computing (complex) linear response functions within the coupled cluster singles (CCS), coupled cluster singles and iterative approximate doubles (CC2), and coupled cluster singles and doubles (CCSD) is coupled to a Stieltjes imaging technique in order to describe the photoionization cross section of atoms and molecules, in the spirit of a similar procedure recently proposed by Averbukh and co-workers within the Algebraic Diagrammatic Construction approach. Pilot results are reported for the atoms He, Ne, and Ar and for the molecules H2, H2O, NH3, HF, CO, and CO2.

  8. Preoptimised VB: a fast method for the ground and excited states of ionic clusters. I. Localised preoptimisation for (ArCO)+, (ArN2)+ and N4+

    NASA Astrophysics Data System (ADS)

    Langenberg, J. H.; Bucur, I. B.; Archirel, P.

    1997-09-01

    We show that in the simple case of van der Waals ionic clusters, the optimisation of orbitals within VB can be easily simulated with the help of pseudopotentials. The procedure yields the ground and the first excited states of the cluster simultaneously. This makes the calculation of potential energy surfaces for tri- and tetraatomic clusters possible, with very acceptable computation times. We give potential curves for (ArCO)+, (ArN2)+ and N4+. An application to the simulation of the SCF method is shown for Na+H2O.

  9. Structure of clusters and building blocks in amylopectin from African rice accessions.

    PubMed

    Gayin, Joseph; Abdel-Aal, El-Sayed M; Marcone, Massimo; Manful, John; Bertoft, Eric

    2016-09-05

    Enzymatic hydrolysis in combination with gel-permeation and anion-exchange chromatography techniques was employed to characterise the composition of clusters and building blocks of amylopectin from two African rice (Oryza glaberrima) accessions, IRGC 103759 and TOG 12440. The samples were compared with one Asian rice (Oryza sativa) sample (cv WITA 4) and one O. sativa×O. glaberrima cross (NERICA 4). The average DP of clusters from the African rice accessions (ARAs) was marginally larger (DP=83) than in WITA 4 (DP=81). However, regarding the average number of chains, clusters from the ARAs represented both the smallest and largest clusters. Overall, the results suggested that the structure of clusters in TOG 12440 was dense with short chains and a high degree of branching, whereas the situation was the opposite in NERICA 4. IRGC 103759 and WITA 4 possessed clusters with intermediate characteristics. The commonest type of building block in all samples was group 2 (single-branched dextrins), representing 40.3-49.4% of the blocks, while groups 3-6 were found in successively lower numbers. The average number of building blocks in the clusters was significantly larger in NERICA 4 (5.8) and WITA 4 (5.7) than in IRGC 103759 and TOG 12440 (5.1 and 5.3, respectively). Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

    PubMed Central

    Hemming, Karla; Taljaard, Monica

    2016-01-01

    Objectives: To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting: We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results: For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion: Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808
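
    A minimal sketch of the kind of calculation involved, assuming the familiar parallel-CRT design effect 1 + (m - 1)·ICC; the stepped wedge design effects presented in the paper are more involved and are not reproduced here. The scenario numbers are invented, and cluster_size_needed simply inverts the design-effect inflation to give the per-cluster size m required for a fixed number of clusters k per arm.

      import math
      from scipy import stats

      def n_individual(p0, p1, alpha=0.05, power=0.80):
          """Per-arm sample size for comparing two proportions (normal approximation)."""
          za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
          return (za + zb) ** 2 * (p0 * (1 - p0) + p1 * (1 - p1)) / (p1 - p0) ** 2

      def cluster_size_needed(n_ind, k, icc):
          """Solve k*m >= n_ind * (1 + (m - 1)*icc) for the per-cluster size m."""
          denom = k - n_ind * icc
          if denom <= 0:
              raise ValueError("too few clusters for this ICC")
          return math.ceil(n_ind * (1 - icc) / denom)

      n_ind = n_individual(0.10, 0.20)
      for icc in (0.01, 0.05):
          print(icc, cluster_size_needed(n_ind, k=20, icc=icc))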

  11. The evolution of active galactic nuclei in clusters of galaxies from the Dark Energy Survey

    DOE PAGES

    Bufanda, E.; Hollowood, D.; Jeltema, T. E.; ...

    2016-12-13

    The correlation between active galactic nuclei (AGN) and environment provides important clues to AGN fueling and the relationship of black hole growth to galaxy evolution. Here, we analyze the fraction of galaxies in clusters hosting AGN as a function of redshift and cluster richness for X-ray detected AGN associated with clusters of galaxies in Dark Energy Survey (DES) Science Verification data. The present sample includes 33 AGN with L_X > 10^43 erg s^-1 in non-central, host galaxies with luminosity greater than 0.5 L* from a total sample of 432 clusters in the redshift range of 0.1 < z < 0.95. Our result is in good agreement with previous work and parallels the increase in star formation in cluster galaxies over the same redshift range. However, the AGN fraction in clusters is observed to have no significant correlation with cluster mass. Future analyses with DES Year 1 through Year 3 data will be able to clarify whether AGN activity is correlated with cluster mass and will tightly constrain the relationship between cluster AGN populations and redshift.

  12. Galaxy properties in clusters. II. Backsplash galaxies

    NASA Astrophysics Data System (ADS)

    Muriel, H.; Coenda, V.

    2014-04-01

    Aims: We explore the properties of galaxies on the outskirts of clusters and their dependence on recent dynamical history in order to understand the real impact that the cluster core has on the evolution of galaxies. Methods: We analyse the properties of more than 1000 galaxies brighter than M_0.1r = -19.6 on the outskirts of 90 clusters (1 < r/rvir < 2) in the redshift range 0.05 < z < 0.10. Using the line-of-sight velocity of galaxies relative to the cluster's mean, we selected low- and high-velocity subsamples. Theoretical predictions indicate that a significant fraction of the first subsample should be backsplash galaxies, that is, objects that have already orbited near the cluster centre. A significant proportion of the sample of high relative velocity (HV) galaxies seems to be composed of infalling objects. Results: Our results suggest that, at fixed stellar mass, late-type galaxies in the low-velocity (LV) sample are systematically older, redder, and have formed fewer stars during the last 3 Gyrs than galaxies in the HV sample. This result is consistent with models that assume that the central regions of clusters are effective in quenching the star formation by means of processes such as ram pressure stripping or strangulation. At fixed stellar mass, LV galaxies show some evidence of having higher surface brightness and smaller size than HV galaxies. These results are consistent with the scenario where galaxies that have orbited the central regions of clusters are more likely to suffer tidal effects, producing loss of mass as well as a re-distribution of matter towards more compact configurations. Finally, we found a higher fraction of early-type (ET) galaxies in the LV sample, supporting the idea that the central region of clusters of galaxies may contribute to the transformation of morphological types towards earlier types.

  13. Further Automate Planned Cluster Maintenance to Minimize System Downtime during Maintenance Windows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Springmeyer, R.

    This report documents the integration and testing of the automated update process for compute clusters in LC to minimize the impact on user productivity. Description: A set of scripts will be written and deployed to further standardize cluster maintenance activities and minimize downtime during planned maintenance windows. Completion Criteria: When the scripts have been deployed and used during planned maintenance windows and a timing comparison is completed between the existing process and the new, more automated process, this milestone is complete. This milestone was completed on Aug 23, 2016 on the new CTS1 cluster called Jade when a request to upgrade the version of TOSS 3 was initiated while SWL jobs and normal user jobs were running. Jobs that were running when the update to the system began continued to run to completion. New jobs on the cluster started on the new release of TOSS 3. No system administrator action was required. Current update procedures in TOSS 2 begin by killing all user jobs. Then all diskfull nodes are updated, which can take a few hours. Only after the updates are applied are all nodes rebooted and finally put back into service. A system administrator is required for all steps. In terms of human time spent during a cluster OS update, the TOSS 3 automated procedure on Jade took 0 FTE hours. Doing the same update without the Toss Update Tool would have required 4 FTE hours.

  14. The acceptability among young Hindus and Muslims of actively ending the lives of newborns with genetic defects.

    PubMed

    Kamble, Shanmukh; Ahmed, Ramadan; Sorum, Paul Clay; Mullet, Etienne

    2014-03-01

    To explore the views in non-Western cultures about ending the lives of damaged newborns. 254 university students from India and 150 from Kuwait rated the acceptability of ending the lives of newborns with genetic defects in 54 vignettes consisting of all combinations of four factors: gestational age (term or 7 months); severity of genetic defect (trisomy 21 alone, trisomy 21 with serious morphological abnormalities or trisomy 13 with impending death); the parents' attitude about prolonging care (unknown, in favour or opposed); and the procedure used (withholding treatment, withdrawing it or injecting a lethal substance). Four clusters were identified by cluster analysis and subjected to analysis of variance. Cluster I, labelled 'Never Acceptable', included 4% of the Indians and 59% of the Kuwaitis. Cluster II, 'No Firm Opinion', had little variation in rating from one scenario to the next; it included 38% of the Indians and 18% of the Kuwaitis. In Cluster III, 'Parents' Attitude+Severity+Procedure', all three factors affected the ratings; it was composed of 18% of the Indians and 16% of the Kuwaitis. Cluster IV was called 'Severity+Parents' Attitude' because these had the strongest impact; it was composed of 40% of the Indians and 7% of the Kuwaitis. In accordance with the teachings of Islam versus Hinduism, Kuwaiti students were more likely to oppose ending a newborn's life under all conditions, Indian students more likely to favour it and to judge its acceptability in light of the different circumstances.

  15. VizieR Online Data Catalog: WOCS. LXVI. Radial velocity survey in M35 (Leiner+, 2015)

    NASA Astrophysics Data System (ADS)

    Leiner, E. M.; Mathieu, R. D.; Gosnell, N. M.; Geller, A. M.

    2015-09-01

    In this second paper (see also Geller et al. 2010, cat. J/AJ/139/1383) in a series studying the dynamical state of the young (150Myr) open cluster M35 we present an updated version of our complete radial velocity database for the cluster. Our sample is selected to cover the range of the M35 main sequence from 0.8 to 1.6 M☉ out to 30' from the cluster center. In the 17 years that we have observed M35, we have gathered ~8000 moderate-precision (σ_i = 0.5 km/s) spectra of ~1300 stars. We find 418 of these to be confirmed radial velocity cluster members or likely members. Within our sample of 418 cluster members or likely members, we detect 64 velocity-variable stars. We present orbital solutions for 52 (see Tables 5 and 7) of these 64 systems, in addition to 28 (see Tables 6 and 8) completed orbital solutions for non-member binaries in our field of view. The binaries are drawn from a sample initially derived from the photometry of T. von Hippel taken at KPNO on the Burrell Schmidt telescope. Observations were taken on 1993 November 18-19, and include B and V photometry down to a magnitude of V=17 lying within a 70'*70' field of view. Subsequently, we updated this photometry for 74% of our sources with more precise BV photometry from C. P. Deliyannis (2006, private communication; Sarrazine et al., 2000AAS...197.4107S). This new photometry was taken on the WIYN 0.9m telescope with the S2KB imager and covers a 40'*40' field of view. See Geller et al. 2010 (cat. J/AJ/139/1383) for more information on these two sets of photometry. Beginning in 1997 September, we have obtained spectra for the stars in our sample at the WIYN 3.5m telescope at KPNO using the Hydra Multi-Object Spectrograph (MOS). For a detailed description of our observing and data reduction procedure see Geller et al. 2008 (cat. J/AJ/135/2264). In short, we typically use Hydra's blue-sensitive fibers and an echelle grating providing a resolution of R~20000. These spectra are centered on 512.5nm, and span a ~25nm wavelength range, covering several prominent absorption lines including the MgB triplet. We present here all radial velocity measurements of the 1355 stars in our sample to date (see Table 4), totaling ~8000 radial velocities (see Table 3). M35 was observed in the X-ray by the XMM-Newton orbiting observatory for 8.6ks (02:37:10-05:00:50 UT) on 2008 September 20. The telescope boresight location was 6h8m54s, +24°20'00'' (J2000). The XMM field of view with r=15' does not extend as far from the cluster center as the WOCS radial velocity survey (r=30', α=06h09m07.5s, δ=+24°20'28''). We cross-correlate the position of each X-ray source with the WOCS catalog to find potential optical counterparts to the X-ray sources (see Table 9). (7 data files).
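
    The cross-correlation of X-ray positions with an optical catalogue is a routine nearest-neighbour match on the sky; a hedged sketch with astropy follows. The coordinates and the 5 arcsec match radius are placeholders for illustration, not values from the catalogue itself.

      from astropy.coordinates import SkyCoord
      import astropy.units as u

      # Hypothetical positions (degrees); in practice these would be read from the
      # XMM source list and the WOCS catalogue.
      xray = SkyCoord(ra=[92.2250, 92.3100] * u.deg, dec=[24.3330, 24.4050] * u.deg)
      wocs = SkyCoord(ra=[92.2255, 92.2900, 92.3550] * u.deg, dec=[24.3325, 24.4010, 24.3000] * u.deg)

      idx, sep2d, _ = xray.match_to_catalog_sky(wocs)
      for i, (j, sep) in enumerate(zip(idx, sep2d)):
          if sep < 5 * u.arcsec:   # illustrative match radius, not the survey's criterion
              print(f"X-ray source {i} -> WOCS entry {j} ({sep.to(u.arcsec).value:.1f} arcsec)")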

  16. A comprehensive HST BVI catalogue of star clusters in five Hickson compact groups of galaxies

    NASA Astrophysics Data System (ADS)

    Fedotov, K.; Gallagher, S. C.; Durrell, P. R.; Bastian, N.; Konstantopoulos, I. S.; Charlton, J.; Johnson, K. E.; Chandar, R.

    2015-05-01

    We present a photometric catalogue of star cluster candidates in Hickson compact groups (HCGs) 7, 31, 42, 59, and 92, based on observations with the Advanced Camera for Surveys and the Wide Field Camera 3 on the Hubble Space Telescope. The catalogue contains precise cluster positions (right ascension and declination), magnitudes, and colours in the BVI filters. The number of detected sources ranges from 2200 to 5600 per group, from which we construct the high-confidence sample by applying a number of criteria designed to reduce foreground and background contaminants. Furthermore, the high-confidence cluster candidates for each of the 16 galaxies in our sample are split into two subpopulations: one that may contain young star clusters and one that is dominated by older globular clusters. The ratio of young star cluster to globular cluster candidates varies from group to group, from equal numbers to the extreme of HCG 31, which has a ratio of 8 to 1 due to a recent starburst induced by interactions in the group. We find that the number of blue clusters with M_V < -9 correlates well with the current star formation rate in an individual galaxy, while the number of globular cluster candidates with M_V < -7.8 correlates well (though with large scatter) with the stellar mass. Analyses of the high-confidence sample presented in this paper show that star clusters can be successfully used to infer the gross star formation history of the host groups and therefore determine their placement in a proposed evolutionary sequence for compact galaxy groups.

  17. Pressure of the hot gas in simulations of galaxy clusters

    NASA Astrophysics Data System (ADS)

    Planelles, S.; Fabjan, D.; Borgani, S.; Murante, G.; Rasia, E.; Biffi, V.; Truong, N.; Ragone-Figueroa, C.; Granato, G. L.; Dolag, K.; Pierpaoli, E.; Beck, A. M.; Steinborn, Lisa K.; Gaspari, M.

    2017-06-01

    We analyse the radial pressure profiles, the intracluster medium (ICM) clumping factor and the Sunyaev-Zel'dovich (SZ) scaling relations of a sample of simulated galaxy clusters and groups identified in a set of hydrodynamical simulations based on an updated version of the treepm-SPH GADGET-3 code. Three different sets of simulations are performed: the first assumes non-radiative physics, the others include, among other processes, active galactic nucleus (AGN) and/or stellar feedback. Our results are analysed as a function of redshift, ICM physics, cluster mass and cluster cool-coreness or dynamical state. In general, the mean pressure profiles obtained for our sample of groups and clusters show a good agreement with X-ray and SZ observations. Simulated cool-core (CC) and non-cool-core (NCC) clusters also show a good match with real data. We obtain in all cases a small (if any) redshift evolution of the pressure profiles of massive clusters, at least back to z = 1. We find that the clumpiness of gas density and pressure increases with the distance from the cluster centre and with the dynamical activity. The inclusion of AGN feedback in our simulations generates values for the gas clumping (√C_ρ ~ 1.2 at R_200) in good agreement with recent observational estimates. The simulated Y_SZ-M scaling relations are in good accordance with several observed samples, especially for massive clusters. As for the scatter of these relations, we obtain a clear dependence on the cluster dynamical state, whereas this distinction is not so evident when looking at the subsamples of CC and NCC clusters.
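
    For reference, the gas clumping factor quoted above is commonly defined per radial shell as C_ρ = <ρ²>/<ρ>². A generic estimator for per-cell simulation data might look like the following sketch; the array names are placeholders, not variables from the GADGET-3 analysis pipeline.

      import numpy as np

      def clumping_profile(r, rho, r_edges):
          """C_rho = <rho^2> / <rho>^2 in radial shells; r and rho are per-cell arrays."""
          out = []
          for lo, hi in zip(r_edges[:-1], r_edges[1:]):
              sel = (r >= lo) & (r < hi)
              out.append(np.mean(rho[sel] ** 2) / np.mean(rho[sel]) ** 2 if sel.any() else np.nan)
          return np.array(out)

      # sqrt(C_rho) is the quantity usually compared with observations (e.g. ~1.2 at R_200)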

  18. Clusters of Monoisotopic Elements for Calibration in (TOF) Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Kolářová, Lenka; Prokeš, Lubomír; Kučera, Lukáš; Hampl, Aleš; Peña-Méndez, Eladia; Vaňhara, Petr; Havel, Josef

    2017-03-01

    Precise calibration in TOF MS requires suitable and reliable standards, which are not always available for high masses. We evaluated inorganic clusters of the monoisotopic elements gold and phosphorus (Au_n^+/Au_n^- and P_n^+/P_n^-) as an alternative to peptides or proteins for the external and internal calibration of mass spectra in various experimental and instrumental scenarios. Monoisotopic gold or phosphorus clusters can be easily generated in situ from suitable precursors by laser desorption/ionization (LDI) or matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Their use offers numerous advantages, including simplicity of preparation, biological inertness, and exact mass determination even at lower mass resolution. We used citrate-stabilized gold nanoparticles to generate gold calibration clusters, and red phosphorus powder to generate phosphorus clusters. Both elements can be added to samples to perform internal calibration up to mass-to-charge (m/z) 10-15,000 without significantly interfering with the analyte. We demonstrated the use of the gold and phosphorus clusters in the MS analysis of complex biological samples, including microbial standards and total extracts of mouse embryonic fibroblasts. We believe that clusters of monoisotopic elements could be used as generally applicable calibrants for complex biological samples.
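
    Because Au and P are monoisotopic, the reference m/z values of the Au_n^± and P_n^± series reduce to simple arithmetic. A small sketch follows, using monoisotopic masses from standard tables; real calibration work would still need to handle the instrument's charge-sign and adduct conventions.

      ELECTRON = 0.000548579909                           # electron mass, u
      MONOISOTOPIC = {"Au": 196.966569, "P": 30.973762}   # atomic masses, u

      def cluster_mz(element, n, charge=+1):
          """m/z of a bare elemental cluster ion X_n with the given charge."""
          m = n * MONOISOTOPIC[element] - charge * ELECTRON
          return m / abs(charge)

      gold_series = [(n, round(cluster_mz("Au", n), 4)) for n in range(1, 11)]
      print(gold_series[:3])   # [(1, 196.966...), (2, 393.932...), (3, 590.899...)]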

  19. Enumerative and binomial sequential sampling plans for the multicolored Asian lady beetle (Coleoptera: Coccinellidae) in wine grapes.

    PubMed

    Galvan, T L; Burkness, E C; Hutchison, W D

    2007-06-01

    To develop a practical integrated pest management (IPM) system for the multicolored Asian lady beetle, Harmonia axyridis (Pallas) (Coleoptera: Coccinellidae), in wine grapes, we assessed the spatial distribution of H. axyridis and developed eight sampling plans to estimate adult density or infestation level in grape clusters. We used 49 data sets collected from commercial vineyards in 2004 and 2005, in Minnesota and Wisconsin. Enumerative plans were developed using two precision levels (0.10 and 0.25); the six binomial plans reflected six unique action thresholds (3, 7, 12, 18, 22, and 31% of cluster samples infested with at least one H. axyridis). The spatial distribution of H. axyridis in wine grapes was aggregated, independent of cultivar and year, but it was more randomly distributed as mean density declined. The average sample number (ASN) for each sampling plan was determined using resampling software. For research purposes, an enumerative plan with a precision level of 0.10 (SE/X) resulted in a mean ASN of 546 clusters. For IPM applications, the enumerative plan with a precision level of 0.25 resulted in a mean ASN of 180 clusters. In contrast, the binomial plans resulted in much lower ASNs and provided high probabilities of arriving at correct "treat or no-treat" decisions, making these plans more efficient for IPM applications. For a tally threshold of one adult per cluster, the operating characteristic curves for the six action thresholds provided binomial sequential sampling plans with mean ASNs of only 19-26 clusters, and probabilities of making correct decisions between 83 and 96%. The benefits of the binomial sampling plans are discussed within the context of improving IPM programs for wine grapes.
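
    Binomial sequential plans of this kind are commonly built on Wald's sequential probability ratio test; the sketch below is that generic construction, not the exact stop lines or action thresholds from the study. Here p0 and p1 bracket an action threshold (proportion of infested clusters), and each observation is 1 if a sampled grape cluster holds at least one adult.

      import math

      def sprt_binomial(p0, p1, alpha=0.10, beta=0.10):
          """Return a decision function implementing Wald's SPRT for a binomial proportion."""
          upper = math.log((1 - beta) / alpha)
          lower = math.log(beta / (1 - alpha))
          w1, w0 = math.log(p1 / p0), math.log((1 - p1) / (1 - p0))

          def decide(observations):
              llr, n = 0.0, 0
              for n, infested in enumerate(observations, start=1):
                  llr += w1 if infested else w0
                  if llr >= upper:
                      return n, "treat"
                  if llr <= lower:
                      return n, "do not treat"
              return n, "keep sampling"

          return decide

      decide = sprt_binomial(p0=0.07, p1=0.18)
      print(decide([0, 1, 0, 0, 1, 1, 1, 1, 1]))   # stops once a boundary is crossed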

  20. Small Sample Performance of Bias-corrected Sandwich Estimators for Cluster-Randomized Trials with Binary Outcomes

    PubMed Central

    Li, Peng; Redden, David T.

    2014-01-01

    The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of the GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738
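
    A quick way to see the small-sample issue is simulation. The sketch below checks the type I error of a simple unpaired t-test on cluster-level proportions under a beta-binomial model (which induces approximately the requested ICC); it is a cluster-summary analysis shown for orientation only, not the GEE Wald test with the KC or FG corrections studied in the paper.

      import numpy as np
      from scipy import stats

      def type1_error(n_clusters=10, m=50, p=0.20, icc=0.05, n_sims=2000, seed=0):
          """Empirical size of a two-sample t-test on cluster proportions under the null."""
          rng = np.random.default_rng(seed)
          a, b = p * (1 - icc) / icc, (1 - p) * (1 - icc) / icc   # beta parameters giving ICC ~ icc
          rejections = 0
          for _ in range(n_sims):
              p_cluster = rng.beta(a, b, size=2 * n_clusters)     # cluster-level true proportions
              y = rng.binomial(m, p_cluster) / m                  # observed cluster proportions
              _, pval = stats.ttest_ind(y[:n_clusters], y[n_clusters:])
              rejections += pval < 0.05
          return rejections / n_sims

      print(type1_error())   # should sit near the nominal 0.05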

  1. Efficient sampling of complex network with modified random walk strategies

    NASA Astrophysics Data System (ADS)

    Xie, Yunya; Chang, Shuhua; Zhang, Zhipeng; Zhang, Mi; Yang, Lei

    2018-02-01

    We present two novel random walk strategies, choosing seed node (CSN) random walk and no-retracing (NR) random walk. Different from the classical random walk sampling, the CSN and NR strategies focus on the influences of the seed node choice and path overlap, respectively. Three random walk samplings are applied to the Erdős-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS), and weighted USAir networks. Then, the major properties of the sampled subnets, such as sampling efficiency, degree distributions, average degree and average clustering coefficient, are studied. Similar conclusions can be reached with these three random walk strategies. Firstly, networks with small scales and simple structures are conducive to the sampling. Secondly, the average degree and the average clustering coefficient of the sampled subnet tend to the corresponding values of the original networks within limited steps. Thirdly, all the degree distributions of the subnets are slightly biased to the high-degree side. However, the NR strategy performs better for the average clustering coefficient of the subnet. In the real weighted USAir network, some obvious characteristics, such as the larger clustering coefficient and the fluctuation of the degree distribution, are reproduced well by these random walk strategies.
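
    A sketch of the no-retracing idea as I read it (prefer edges not yet traversed), applied to an Erdős-Rényi graph with networkx; the exact NR and CSN rules in the paper may differ in detail.

      import random
      import networkx as nx

      def no_retracing_walk(G, start, steps, rng=random.Random(1)):
          """Random walk that avoids re-using already-traversed edges when possible."""
          used, nodes, cur = set(), {start}, start
          for _ in range(steps):
              nbrs = list(G.neighbors(cur))
              if not nbrs:
                  break
              fresh = [v for v in nbrs if frozenset((cur, v)) not in used]
              nxt = rng.choice(fresh or nbrs)
              used.add(frozenset((cur, nxt)))
              nodes.add(nxt)
              cur = nxt
          return nodes

      G = nx.erdos_renyi_graph(2000, 0.005, seed=1)
      sampled = G.subgraph(no_retracing_walk(G, start=0, steps=500))
      print(nx.average_clustering(G), nx.average_clustering(sampled))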

  2. Intra-class correlation estimates for assessment of vitamin A intake in children.

    PubMed

    Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D

    2005-03-01

    In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of the intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variance inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variance were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
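
    For orientation, the classical one-way ANOVA estimator of the ICC (with the usual adjustment for unequal cluster sizes) can be written in a few lines. This is a generic estimator, not necessarily the exact multi-level model fitted in the study; the simulated "villages" are synthetic.

      import numpy as np

      def icc_anova(groups):
          """One-way ANOVA ICC from a list of 1-D arrays, one array per cluster."""
          k = len(groups)
          n = np.array([len(g) for g in groups], dtype=float)
          means = np.array([np.mean(g) for g in groups])
          grand = np.concatenate(groups).mean()
          n0 = (n.sum() - (n ** 2).sum() / n.sum()) / (k - 1)   # adjusted average cluster size
          msb = np.sum(n * (means - grand) ** 2) / (k - 1)
          msw = sum(((g - m) ** 2).sum() for g, m in zip(groups, means)) / (n.sum() - k)
          return (msb - msw) / (msb + (n0 - 1) * msw)

      rng = np.random.default_rng(0)
      villages = [rng.normal(loc=rng.normal(0, 0.3), scale=1.0, size=16) for _ in range(200)]
      print(round(icc_anova(villages), 3))   # true ICC = 0.09 / 1.09 ~ 0.08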

  3. Posttraumatic Stress Disorder Symptom Clusters and the Interpersonal Theory of Suicide in a Large Military Sample.

    PubMed

    Pennings, Stephanie M; Finn, Joseph; Houtsma, Claire; Green, Bradley A; Anestis, Michael D

    2017-10-01

    Prior studies examining posttraumatic stress disorder (PTSD) symptom clusters and the components of the interpersonal theory of suicide (ITS) have yielded mixed results, likely stemming in part from the use of divergent samples and measurement techniques. This study aimed to expand on these findings by utilizing a large military sample, gold standard ITS measures, and multiple PTSD factor structures. Utilizing a sample of 935 military personnel, hierarchical multiple regression analyses were used to test the association between PTSD symptom clusters and the ITS variables. Additionally, we tested for indirect effects of PTSD symptom clusters on suicidal ideation through thwarted belongingness, conditional on levels of perceived burdensomeness. Results indicated that numbing symptoms are positively associated with both perceived burdensomeness and thwarted belongingness and hyperarousal symptoms (dysphoric arousal in the 5-factor model) are positively associated with thwarted belongingness. Results also indicated that hyperarousal symptoms (anxious arousal in the 5-factor model) were positively associated with fearlessness about death. The positive association between PTSD symptom clusters and suicidal ideation was inconsistent and modest, with mixed support for the ITS model. Overall, these results provide further clarity regarding the association between specific PTSD symptom clusters and suicide risk factors. © 2016 The American Association of Suicidology.

  4. HIFLUGCS: X-ray luminosity-dynamical mass relation and its implications for mass calibrations with the SPIDERS and 4MOST surveys

    NASA Astrophysics Data System (ADS)

    Zhang, Yu-Ying; Reiprich, Thomas H.; Schneider, Peter; Clerc, Nicolas; Merloni, Andrea; Schwope, Axel; Borm, Katharina; Andernach, Heinz; Caretta, César A.; Wu, Xiang-Ping

    2017-03-01

    We present the relation of X-ray luminosity versus dynamical mass for 63 nearby clusters of galaxies in a flux-limited sample, the HIghest X-ray FLUx Galaxy Cluster Sample (HIFLUGCS, consisting of 64 clusters). The luminosity measurements are obtained based on 1.3 Ms of clean XMM-Newton data and ROSAT pointed observations. The masses are estimated using optical spectroscopic redshifts of 13647 cluster galaxies in total. We classify clusters into disturbed and undisturbed based on a combination of the X-ray luminosity concentration and the offset between the brightest cluster galaxy and X-ray flux-weighted center. Given sufficient numbers (i.e., ≥45) of member galaxies when the dynamical masses are computed, the luminosity versus mass relations agree between the disturbed and undisturbed clusters. The cool-core clusters still dominate the scatter in the luminosity versus mass relation even when a core-corrected X-ray luminosity is used, which indicates that the scatter of this scaling relation mainly reflects the structure formation history of the clusters. As shown by the clusters with only a few spectroscopically confirmed members, the dynamical masses can be underestimated and thus lead to a biased scaling relation. To investigate the potential of spectroscopic surveys to follow up high-redshift galaxy clusters or groups observed in X-ray surveys for the identifications and mass calibrations, we carried out Monte Carlo resampling of the cluster galaxy redshifts and calibrated the uncertainties of the redshift and dynamical mass estimates when only reduced numbers of galaxy redshifts per cluster are available. The resampling considers the SPIDERS and 4MOST configurations, designed for the follow-up of the eROSITA clusters, and was carried out for each cluster in the sample at the actual cluster redshift as well as at the assigned input cluster redshifts of 0.2, 0.4, 0.6, and 0.8. To follow up very distant clusters or groups, we also carried out the mass calibration based on the resampling with only ten redshifts per cluster, and redshift calibration based on the resampling with only five and ten redshifts per cluster, respectively. Our results demonstrate the power of combining upcoming X-ray and optical spectroscopic surveys for mass calibration of clusters. The scatter in the dynamical mass estimates for the clusters with at least ten members is within 50%.
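
    The effect of having only a handful of spectroscopic members can be illustrated with a toy resampling experiment: draw subsets of member redshifts, recompute the line-of-sight velocity dispersion, and propagate it through a simple M ∝ σ³ scaling used here only as a stand-in for the paper's dynamical mass estimator. All numbers below are invented.

      import numpy as np

      C_KMS = 299792.458
      rng = np.random.default_rng(42)

      def sigma_v(z_members, z_cluster):
          """Line-of-sight velocity dispersion in km/s."""
          v = C_KMS * (z_members - z_cluster) / (1.0 + z_cluster)
          return np.std(v, ddof=1)

      z_cl, sigma_true, n_members = 0.08, 900.0, 200
      z_all = z_cl + rng.normal(0.0, sigma_true / C_KMS * (1 + z_cl), n_members)

      for n_spec in (5, 10, 45, 100):
          sig = np.array([sigma_v(rng.choice(z_all, n_spec, replace=False), z_cl)
                          for _ in range(2000)])
          mass = sig ** 3
          print(n_spec, round(np.std(mass) / np.mean(mass), 2))   # fractional mass scatter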

  5. Uniform deposition of size-selected clusters using Lissajous scanning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beniya, Atsushi; Watanabe, Yoshihide, E-mail: e0827@mosk.tytlabs.co.jp; Hirata, Hirohito

    2016-05-15

    Size-selected clusters can be deposited on a surface using size-selected cluster ion beams. However, because of the cross-sectional intensity distribution of the ion beam, it is difficult to define the coverage of the deposited clusters. The aggregation probability of the clusters depends on coverage, so the cluster size on the surface depends on position even though size-selected clusters are deposited. It is crucial, therefore, to deposit clusters uniformly on the surface. In this study, size-selected clusters were deposited uniformly on surfaces by scanning the cluster ions in the form of a Lissajous pattern. Two sets of deflector electrodes set in orthogonal directions were placed in front of the sample surface. Triangular waves were applied to the electrodes with an irrational frequency ratio to ensure that the ion trajectory filled the sample surface. The advantages of this method are the simplicity and low cost of the setup compared with the raster scanning method. The authors further investigated CO adsorption on size-selected Pt_n (n = 7, 15, 20) clusters uniformly deposited on the Al2O3/NiAl(110) surface and demonstrated the importance of uniform deposition.

  6. The XXL Survey. II. The bright cluster sample: catalogue and luminosity function

    NASA Astrophysics Data System (ADS)

    Pacaud, F.; Clerc, N.; Giles, P. A.; Adami, C.; Sadibekova, T.; Pierre, M.; Maughan, B. J.; Lieu, M.; Le Fèvre, J. P.; Alis, S.; Altieri, B.; Ardila, F.; Baldry, I.; Benoist, C.; Birkinshaw, M.; Chiappetti, L.; Démoclès, J.; Eckert, D.; Evrard, A. E.; Faccioli, L.; Gastaldello, F.; Guennou, L.; Horellou, C.; Iovino, A.; Koulouridis, E.; Le Brun, V.; Lidman, C.; Liske, J.; Maurogordato, S.; Menanteau, F.; Owers, M.; Poggianti, B.; Pomarède, D.; Pompei, E.; Ponman, T. J.; Rapetti, D.; Reiprich, T. H.; Smith, G. P.; Tuffs, R.; Valageas, P.; Valtchanov, I.; Willis, J. P.; Ziparo, F.

    2016-06-01

    Context. The XXL Survey is the largest survey carried out by the XMM-Newton satellite and covers a total area of 50 square degrees distributed over two fields. It primarily aims at investigating the large-scale structures of the Universe using the distribution of galaxy clusters and active galactic nuclei as tracers of the matter distribution. The survey will ultimately uncover several hundreds of galaxy clusters out to a redshift of ~2 at a sensitivity of ~10^-14 erg s^-1 cm^-2 in the [0.5-2] keV band. Aims: This article presents the XXL bright cluster sample, a subsample of 100 galaxy clusters selected from the full XXL catalogue by setting a lower limit of 3 × 10^-14 erg s^-1 cm^-2 on the source flux within a 1' aperture. Methods: The selection function was estimated using a mixture of Monte Carlo simulations and analytical recipes that closely reproduce the source selection process. An extensive spectroscopic follow-up provided redshifts for 97 of the 100 clusters. We derived accurate X-ray parameters for all the sources. Scaling relations were self-consistently derived from the same sample in other publications of the series. On this basis, we study the number density, luminosity function, and spatial distribution of the sample. Results: The bright cluster sample consists of systems with masses between M_500 = 7 × 10^13 and 3 × 10^14 M⊙, mostly located between z = 0.1 and 0.5. The observed sky density of clusters is slightly below the predictions from the WMAP9 model, and significantly below the prediction from the Planck 2015 cosmology. In general, within the current uncertainties of the cluster mass calibration, models with higher values of σ_8 and/or Ω_M appear more difficult to accommodate. We provide tight constraints on the cluster differential luminosity function and find no hint of evolution out to z ~ 1. We also find strong evidence for the presence of large-scale structures in the XXL bright cluster sample and identify five new superclusters. Based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by ESA Member States and NASA. Based on observations made with ESO Telescopes at the La Silla and Paranal Observatories under programme ID 089.A-0666 and LP191.A-0268. The Master Catalogue is available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/592/A2

  7. Weak-Lensing Mass Calibration of the Atacama Cosmology Telescope Equatorial Sunyaev-Zeldovich Cluster Sample with the Canada-France-Hawaii Telescope Stripe 82 Survey

    NASA Technical Reports Server (NTRS)

    Battaglia, N.; Leauthaud, A.; Miyatake, H.; Hasselfield, M.; Gralla, M. B.; Allison, R.; Bond, J. R.; Calabrese, E.; Crichton, D.; Devlin, M. J.; ...

    2016-01-01

    Mass calibration uncertainty is the largest systematic effect for using clusters of galaxies to constrain cosmological parameters. We present weak-lensing mass measurements from the Canada-France-Hawaii Telescope Stripe 82 Survey for galaxy clusters selected through their high signal-to-noise thermal Sunyaev-Zeldovich (tSZ) signal measured with the Atacama Cosmology Telescope (ACT). For a sample of 9 ACT clusters with a tSZ signal-to-noise greater than five, the average weak-lensing mass is (4.8 ± 0.8) × 10^14 M⊙, consistent with the tSZ mass estimate of (4.7 ± 1.0) × 10^14 M⊙, which assumes a universal pressure profile for the cluster gas. Our results are consistent with previous weak-lensing measurements of tSZ-detected clusters from the Planck satellite. When comparing our results, we estimate the Eddington bias correction for the sample intersection of Planck and weak-lensing clusters, which was previously excluded.

  8. Orbits of Selected Globular Clusters in the Galactic Bulge

    NASA Astrophysics Data System (ADS)

    Pérez-Villegas, A.; Rossi, L.; Ortolani, S.; Casotto, S.; Barbuy, B.; Bica, E.

    2018-05-01

    We present an orbit analysis for a sample of eight inner bulge globular clusters, together with one reference halo object. We used proper motion values derived from long time base CCD data. Orbits are integrated in both an axisymmetric model and a model including the Galactic bar potential. The inclusion of the bar proved to be essential for the description of the dynamical behaviour of the clusters. We use a Monte Carlo scheme to construct the initial conditions for each cluster, taking into account the uncertainties in the kinematical data and distances. The sample clusters typically show a maximum height above the Galactic plane below 1.5 kpc, and develop rather eccentric orbits. Seven of the bulge sample clusters share the orbital properties of the bar/bulge, having perigalactic and apogalactic distances, and maximum vertical excursions from the Galactic plane, inside the bar region. NGC 6540 instead shows a completely different orbital behaviour, having the dynamical signature of the thick disc. Both prograde and prograde-retrograde orbits with respect to the direction of the Galactic rotation were revealed, which might characterise chaotic behaviour.

  9. The dysregulated cluster in personality profiling research: Longitudinal stability and associations with bulimic behaviors and correlates

    PubMed Central

    Slane, Jennifer D.; Klump, Kelly L.; Donnellan, M. Brent; McGue, Matthew; Iacono, William G.

    2013-01-01

    Among cluster analytic studies of the personality profiles associated with bulimia nervosa, a group of individuals characterized by emotional lability and behavioral dysregulation (i.e., a dysregulated cluster) has emerged most consistently. However, previous studies have all been cross-sectional and mostly used clinical samples. This study aimed to replicate associations between the dysregulated personality cluster and bulimic symptoms and related characteristics using a longitudinal, population-based sample. Participants were females assessed at ages 17 and 25 from the Minnesota Twin Family Study, clustered based on their personality traits. The Dysregulated cluster was successfully identified at both time points and was more stable across time than either the Resilient or Sensation Seeking clusters. Rates of bulimic symptoms and related behaviors (e.g., alcohol use problems) were also highest in the dysregulated group. Findings suggest that the dysregulated cluster is a relatively stable and robust profile that is associated with bulimic symptoms. PMID:23398096

  10. Optimal design of a plot cluster for monitoring

    Treesearch

    Charles T. Scott

    1993-01-01

    Traveling costs incurred during extensive forest surveys make cluster sampling cost-effective. Clusters are specified by the type of plots, plot size, number of plots, and the distance between plots within the cluster. A method to determine the optimal cluster design when different plot types are used for different forest resource attributes is described. The method...

  11. Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska

    NASA Technical Reports Server (NTRS)

    Lent, P. C. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.

  12. Brief Report: Clustered Forward Chaining with Embedded Mastery Probes to Teach Recipe Following

    ERIC Educational Resources Information Center

    Chazin, Kate T.; Bartelmay, Danielle N.; Lambert, Joseph M.; Houchins-Juárez, Nealetta J.

    2017-01-01

    This study evaluated the effectiveness of a clustered forward chaining (CFC) procedure to teach a 23-year-old male with autism to follow written recipes. CFC incorporates elements of forward chaining (FC) and total task chaining (TTC) by teaching a small number of steps (i.e., units) using TTC, introducing new units sequentially (akin to FC), and…

  13. Development of the ion source for cluster implantation

    NASA Astrophysics Data System (ADS)

    Kulevoy, T. V.; Seleznev, D. N.; Kozlov, A. V.; Kuibeda, R. P.; Kropachev, G. N.; Alexeyenko, O. V.; Dugin, S. N.; Oks, E. M.; Gushenets, V. I.; Hershcovitch, A.; Jonson, B.; Poole, H. J.

    2014-02-01

    Development of a Bernas ion source to meet the needs of 100s-of-electron-volt ion implanters for shallow-junction production is in progress at the Institute for Theoretical and Experimental Physics. The ion source provides a high-intensity ion beam of boron clusters under a self-cleaning operation mode. The latest progress in ion source operation is presented. The mechanism of the self-cleaning procedure is described.

  14. SU-G-TeP3-14: Three-Dimensional Cluster Model in Inhomogeneous Dose Distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, J; Penagaricano, J; Narayanasamy, G

    2016-06-15

    Purpose: We aim to investigate 3D cluster formation in inhomogeneous dose distributions to search for new models predicting radiation tissue damage, potentially leading to a new optimization paradigm for radiotherapy planning. Methods: The aggregation of voxels in the organ at risk (OAR) with dose higher than a preset threshold was chosen as the cluster, whose connectivity dictates the cluster structure. Upon selection of the dose threshold, the fractional density, defined as the fraction of voxels in the organ eligible to be part of the cluster, was determined from the dose volume histogram (DVH). A Monte Carlo method was implemented to establish a case pertinent to the corresponding DVH. Ones and zeros were randomly assigned to each OAR voxel with the sampling probability equal to the fractional density. Ten thousand samples were randomly generated to ensure a sufficient number of cluster sets. A recursive cluster-searching algorithm was developed to analyze the clusters with various connectivity choices, such as 1-, 2-, and 3-connectivity. The mean size of the largest cluster (MSLC) from the Monte Carlo samples was taken to be a function of the fractional density. Various OARs from clinical plans were included in the study. Results: The intensive Monte Carlo study demonstrates the anticipated inverse relationship between the MSLC and the cluster connectivity, and the cluster size does not change linearly with fractional density regardless of the connectivity type. A transition of the MSLC from an initially slow increase to exponential growth was observed as the fractional density increased. The cluster sizes were found to vary within a large range and are relatively independent of the OARs. Conclusion: The Monte Carlo study revealed that the cluster size could serve as a suitable index of tissue damage (percolation cluster) and that the clinical outcome of the same DVH might potentially differ.
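
    The core of such an analysis, drawing binary voxel maps at a given fractional density, labelling connected clusters under a chosen connectivity, and averaging the size of the largest one, fits in a short scipy sketch. This is a generic reconstruction of the described procedure rather than the authors' code; the grid size and densities are arbitrary.

      import numpy as np
      from scipy import ndimage

      def mean_size_largest_cluster(density, shape=(40, 40, 40), connectivity=1,
                                    n_samples=200, seed=0):
          """MSLC over Monte Carlo voxel maps; connectivity 1/2/3 -> 6/18/26 neighbours."""
          rng = np.random.default_rng(seed)
          structure = ndimage.generate_binary_structure(3, connectivity)
          sizes = []
          for _ in range(n_samples):
              above_threshold = rng.random(shape) < density        # eligible voxels
              labels, n_clusters = ndimage.label(above_threshold, structure=structure)
              sizes.append(np.bincount(labels.ravel())[1:].max() if n_clusters else 0)
          return float(np.mean(sizes))

      for rho in (0.10, 0.20, 0.30, 0.40):
          print(rho, mean_size_largest_cluster(rho))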

  15. VizieR Online Data Catalog: NORAS II. I. First results (Bohringer+, 2017)

    NASA Astrophysics Data System (ADS)

    Bohringer, H.; Chon, G.; Retzlaff, J.; Trumper, J.; Meisenheimer, K.; Schartel, N.

    2017-08-01

    The NOrthern ROSAT All-Sky (NORAS) galaxy cluster survey project is based on the ROSAT All-Sky Survey (RASS; Trumper 1993Sci...260.1769T), which is the only full-sky survey conducted with an imaging X-ray telescope. We have already used RASS for the construction of the cluster catalogs of the NORAS I project. While NORAS I was as a first step focused on the identification of galaxy clusters among the RASS X-ray sources showing a significant extent, the complementary REFLEX I sample in the southern sky was strictly constructed as a flux-limited cluster sample. A major extension of the REFLEX I sample, which roughly doubles the number of clusters, REFLEX II (Bohringer et al. 2013, Cat. J/A+A/555/A30), was recently completed. It is by far the largest high-quality sample of X-ray-selected galaxy clusters. The NORAS II survey now reaches a flux limit of 1.8×10^-12 erg/s/cm^2 in the 0.1-2.4 keV band. Redshifts have been obtained for all of the 860 clusters in the NORAS II catalog, except for 25 clusters for which observing campaigns are scheduled. Thus, with 3% missing redshifts, we can already obtain a very good view of the properties of the NORAS II cluster sample and obtain some first results. The NORAS II survey covers the sky region north of the equator outside the band of the Milky Way (|b_II| >= 20°). We also excise a region around the nearby Virgo cluster of galaxies that extends over several degrees on the sky, where the detection of background clusters is hampered by bright X-ray emission. This region is bounded in right ascension by R.A.=185°-191.25° and in declination by decl.=6°-15° (an area of ~53 deg^2). With this excision, the survey area covers 4.18 steradian (13519 deg^2, a fraction of 32.7% of the sky). NORAS II is based on the RASS product RASS III (Voges et al. 1999, Cat. IX/10), which was also used for REFLEX II. The NORAS II survey was constructed in a way identical to REFLEX II with a nominal flux limit of 1.8×10^-12 erg/s/cm^2. (3 data files).

  16. ATCA observations of the MACS-Planck Radio Halo Cluster Project. II. Radio observations of an intermediate redshift cluster sample

    NASA Astrophysics Data System (ADS)

    Martinez Aviles, G.; Johnston-Hollitt, M.; Ferrari, C.; Venturi, T.; Democles, J.; Dallacasa, D.; Cassano, R.; Brunetti, G.; Giacintucci, S.; Pratt, G. W.; Arnaud, M.; Aghanim, N.; Brown, S.; Douspis, M.; Hurier, J.; Intema, H. T.; Langer, M.; Macario, G.; Pointecouteau, E.

    2018-04-01

    Aims: A fraction of galaxy clusters host diffuse radio sources whose origins are investigated through multi-wavelength studies of cluster samples. We investigate the presence of diffuse radio emission in a sample of seven galaxy clusters in the largely unexplored intermediate redshift range (0.3 < z < 0.44). Methods: In search of diffuse emission, deep radio imaging of the clusters is presented from wide-band (1.1-3.1 GHz), full-resolution (5 arcsec) observations with the Australia Telescope Compact Array (ATCA). The visibilities were also imaged at lower resolution after point source modelling and subtraction, and after a taper was applied, to achieve better sensitivity to low surface brightness diffuse radio emission. In the case of non-detection of diffuse sources, we set upper limits for the radio power of injected diffuse radio sources in the field of our observations. Furthermore, we discuss the dynamical state of the observed clusters based on an X-ray morphological analysis with XMM-Newton. Results: We detect a giant radio halo in PSZ2 G284.97-23.69 (z = 0.39) and a possible diffuse source in the nearly relaxed cluster PSZ2 G262.73-40.92 (z = 0.421). Our sample contains three highly disturbed massive clusters without clear traces of diffuse emission at the observed frequencies. We were able to inject modelled radio haloes with low values of total flux density to set upper detection limits; however, with our high-frequency observations we cannot exclude the presence of radio haloes in these systems because of the sensitivity of our observations in combination with the high redshifts of the observed clusters. The reduced images are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A94

  17. TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

    PubMed

    Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

    2017-12-01

    Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
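
    Step (i) of the pipeline, concatenating the time profiles of each gene across conditions and clustering the resulting vectors, can be sketched as follows. Random data and k-means stand in for the real expression matrix and for whatever clustering TimesVector uses internally; the post-processing and rescue steps are not shown.

      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(0)
      expr = rng.normal(size=(500, 3, 8))          # genes x conditions x time points (synthetic)

      # One concatenated time-condition vector per gene, z-scored so clusters
      # capture the expression pattern rather than the absolute level.
      X = expr.reshape(expr.shape[0], -1)
      X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

      labels = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(X)
      print(np.bincount(labels))                   # cluster sizes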

  18. Single exposure three-dimensional imaging of dusty plasma clusters.

    PubMed

    Hartmann, Peter; Donkó, István; Donkó, Zoltán

    2013-02-01

    We have worked out the details of a single camera, single exposure method to perform three-dimensional imaging of a finite particle cluster. The procedure is based on the plenoptic imaging principle and utilizes a commercial Lytro light field still camera. We demonstrate the capabilities of our technique on a single layer particle cluster in a dusty plasma, where the camera is aligned and inclined at a small angle to the particle layer. The reconstruction of the third coordinate (depth) is found to be accurate and even shadowing particles can be identified.

  19. Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering.

    PubMed

    Deveci, Mehmet; Küçüktunç, Onur; Eren, Kemal; Bozdağ, Doruk; Kaya, Kamer; Çatalyürek, Ümit V

    2016-01-01

    Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.

  20. Classification of different types of beer according to their colour characteristics

    NASA Astrophysics Data System (ADS)

    Nikolova, Kr T.; Gabrova, R.; Boyadzhiev, D.; Pisanova, E. S.; Ruseva, J.; Yanakiev, D.

    2017-01-01

    Twenty-two samples of different beers have been investigated in two colour systems - XYZ and CIELab - and have been characterised according to their colour parameters. The goals of the current study were to conduct correlation and discriminant analyses and to find the inner relation between the studied indices. K-means clustering has been used to compare and group the tested types of beer based on their similarity. Applying K-means cluster analysis requires that the number of clusters be determined in advance; the variant with K = 4 was adopted. The first cluster unified all bright beers, the second one contained samples with fruits, the third one contained samples with addition of lemon, and the fourth unified the samples of dark beers. The discriminant analysis can assist in establishing the type of a beer. The proposed model correctly describes the types of beer on the Bulgarian market, and it can be used for determining the affiliation of a beer not included in the model. One sample has been chosen from each cluster and its digital image has been obtained; it confirms the colour parameters in the XYZ and CIELab colour systems. These results can be used to develop an express estimation of beer type by colour.
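
    The K = 4 grouping described above amounts to running k-means on the colour coordinates of the 22 samples. A sketch with made-up CIELab values follows; the real study would use the measured parameters in both colour systems.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(1)
      # Placeholder L*, a*, b* values for 22 beers; replace with measured data.
      lab = rng.uniform(low=[10.0, -5.0, 5.0], high=[80.0, 40.0, 70.0], size=(22, 3))

      X = StandardScaler().fit_transform(lab)
      labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
      for k in range(4):
          print(k, np.where(labels == k)[0])       # sample indices in each cluster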
