Sample records for cluster sample design

  1. Choosing a Cluster Sampling Design for Lot Quality Assurance Sampling Surveys

    PubMed Central

    Hund, Lauren; Bedrick, Edward J.; Pagano, Marcello

    2015-01-01

    Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis. PMID:26125967

  2. Choosing a Cluster Sampling Design for Lot Quality Assurance Sampling Surveys.

    PubMed

    Hund, Lauren; Bedrick, Edward J; Pagano, Marcello

    2015-01-01

    Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis.
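
    The binomial decision rules mentioned in these abstracts can be computed directly. The sketch below is an illustrative implementation, not the authors' software; the example thresholds (n = 19, d = 13, coverage hypotheses of 80% and 50%) are assumed for illustration only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def lqas_risks(n, d, p_upper, p_lower):
    """Misclassification risks for an LQAS rule that classifies an area
    as 'acceptable' when at least d of n sampled units are successes.
    alpha: P(classified poor | true coverage = p_upper)
    beta:  P(classified acceptable | true coverage = p_lower)"""
    alpha = binom_cdf(d - 1, n, p_upper)
    beta = 1 - binom_cdf(d - 1, n, p_lower)
    return alpha, beta

# Example rule (values assumed for illustration, not from the paper)
alpha, beta = lqas_risks(n=19, d=13, p_upper=0.8, p_lower=0.5)
```

    For this rule both risks come out on the order of 10% or less, which is the kind of operating characteristic that makes small LQAS samples attractive when clustering can be ignored.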

  3. Understanding the cluster randomised crossover design: a graphical illustration of the components of variation and a sample size tutorial.

    PubMed

    Arnup, Sarah J; McKenzie, Joanne E; Hemming, Karla; Pilcher, David; Forbes, Andrew B

    2017-08-15

    In a cluster randomised crossover (CRXO) design, a sequence of interventions is assigned to a group, or 'cluster', of individuals. Each cluster receives each intervention in a separate period of time, forming 'cluster-periods'. Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design. Formulae are available for the two-period, two-intervention, cross-sectional CRXO design; however, implementation of these formulae is known to be suboptimal. The aims of this tutorial are to illustrate the intuition behind the design and to provide guidance on performing sample size calculations. Graphical illustrations are used to describe the effect of the cluster randomisation and crossover aspects of the design on the correlation between individual responses in a CRXO trial. Sample size calculations for binary and continuous outcomes are illustrated using parameters estimated from the Australia and New Zealand Intensive Care Society - Adult Patient Database (ANZICS-APD) for patient mortality and length of stay (LOS). The similarity between individual responses in a CRXO trial can be understood in terms of three components of variation: variation in cluster mean response; variation in the cluster-period mean response; and variation between individual responses within a cluster-period; or equivalently in terms of the correlation between individual responses in the same cluster-period (within-cluster within-period correlation, WPC), and between individual responses in the same cluster, but in different periods (within-cluster between-period correlation, BPC). The BPC lies between zero and the WPC. When the WPC and BPC are equal the precision gained by the crossover aspect of the CRXO design equals the precision lost by cluster randomisation. When the BPC is zero there is no advantage in a CRXO over a parallel-group cluster randomised trial. Sample size calculations illustrate that small changes in the specification of the WPC or BPC can increase the required number of clusters. By illustrating how the parameters required for sample size calculations arise from the CRXO design and by providing guidance on both how to choose values for the parameters and perform the sample size calculations, the implementation of the sample size formulae for CRXO trials may improve.
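
    The role of the WPC and BPC described above can be made concrete with a design effect. The function below uses a commonly quoted form for the two-period cross-sectional CRXO design, DE = 1 + (m - 1)·WPC - m·BPC with m individuals per cluster-period; treat this exact expression as an assumption to be checked against the tutorial's formulae rather than as the paper's own derivation.

```python
def crxo_design_effect(m, wpc, bpc):
    """Design effect for a two-period cross-sectional cluster randomised
    crossover trial with m individuals per cluster-period.
    wpc: within-cluster within-period correlation
    bpc: within-cluster between-period correlation (0 <= bpc <= wpc)
    Assumed form: 1 + (m - 1) * wpc - m * bpc."""
    if not (0 <= bpc <= wpc < 1):
        raise ValueError("require 0 <= BPC <= WPC < 1")
    return 1 + (m - 1) * wpc - m * bpc
```

    With bpc = 0 this reduces to the familiar parallel cluster-trial inflation 1 + (m - 1)·wpc, consistent with the abstract's remark that a zero BPC removes any advantage of the CRXO over a parallel-group cluster randomised trial.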

  4. Sampling designs for HIV molecular epidemiology with application to Honduras.

    PubMed

    Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I

    2005-11-01

    Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.
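
    The two-stage selection described above (random clusters within strata, then random individuals within clusters) can be sketched in a few lines. This toy implementation is illustrative only; the data structure and parameter names are assumed, not taken from the paper.

```python
import random

def stratified_cluster_sample(strata, n_clusters, n_per_cluster, seed=0):
    """Two-stage stratified cluster sample.
    strata: dict mapping stratum name -> {cluster_id: [individual_ids]}.
    Randomly selects clusters within each stratum, then individuals
    within each selected cluster."""
    rng = random.Random(seed)
    sample = {}
    for name, clusters in strata.items():
        chosen = rng.sample(sorted(clusters), min(n_clusters, len(clusters)))
        sample[name] = {
            c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
            for c in chosen
        }
    return sample
```

    In practice the number of clusters per stratum and individuals per cluster would be driven by the variance and cost considerations the abstract mentions, rather than fixed constants.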

  5. Extending cluster Lot Quality Assurance Sampling designs for surveillance programs

    PubMed Central

    Hund, Lauren; Pagano, Marcello

    2014-01-01

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance based on the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible non-parametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. PMID:24633656

  6. Extending cluster lot quality assurance sampling designs for surveillance programs.

    PubMed

    Hund, Lauren; Pagano, Marcello

    2014-07-20

    Lot quality assurance sampling (LQAS) has a long history of applications in industrial quality control. LQAS is frequently used for rapid surveillance in global health settings, with areas classified as poor or acceptable performance on the basis of the binary classification of an indicator. Historically, LQAS surveys have relied on simple random samples from the population; however, implementing two-stage cluster designs for surveillance sampling is often more cost-effective than simple random sampling. By applying survey sampling results to the binary classification procedure, we develop a simple and flexible nonparametric procedure to incorporate clustering effects into the LQAS sample design to appropriately inflate the sample size, accommodating finite numbers of clusters in the population when relevant. We use this framework to then discuss principled selection of survey design parameters in longitudinal surveillance programs. We apply this framework to design surveys to detect rises in malnutrition prevalence in nutrition surveillance programs in Kenya and South Sudan, accounting for clustering within villages. By combining historical information with data from previous surveys, we design surveys to detect spikes in the childhood malnutrition rate. Copyright © 2014 John Wiley & Sons, Ltd.

  7. Methods for sample size determination in cluster randomized trials

    PubMed Central

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-01-01

    Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515

  8. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments

    PubMed Central

    2013-01-01

    Background Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. Results To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. Conclusions We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs. PMID:24160725

  9. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments.

    PubMed

    Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello

    2013-10-26

    Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.
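
    The beta-binomial tail probabilities this abstract refers to can be sketched as follows. This is a simplified single-stage version that treats all n observations as exchangeable with a common intra-cluster correlation; the icc-based parameterization is an assumption, and the paper's full two-stage C-LQAS model is more detailed.

```python
from math import comb, lgamma, exp

def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_cdf(k, n, p, icc):
    """P(X <= k) under a beta-binomial with mean proportion p and
    intra-cluster correlation icc (icc -> 0 recovers the binomial)."""
    s = (1 - icc) / icc          # a + b of the mixing beta distribution
    a, b = p * s, (1 - p) * s
    total = 0.0
    for i in range(k + 1):
        total += comb(n, i) * exp(log_beta(i + a, n - i + b) - log_beta(a, b))
    return total
```

    Under a rule "acceptable if at least d successes", the risk of misclassifying a truly high-coverage area is betabinom_cdf(d - 1, n, p_upper, icc); because clustering fattens the tails relative to the binomial, this risk typically grows with icc, which is what drives the larger C-LQAS sample sizes.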

  10. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can dramatically impact the classification error associated with LQAS analysis.
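
    A simulation in the spirit of this validation study can estimate how often a clustered sample crosses a decision threshold. The sketch below draws cluster-level prevalences from a beta distribution matched to a target prevalence and intra-cluster correlation; the design parameters and decision rule (67 clusters of 3, d = 11) are illustrative assumptions, not the paper's exact protocol.

```python
import random

def classification_rate(n_clusters=67, m=3, p=0.10, icc=0.05,
                        d=11, reps=2000, seed=1):
    """Monte Carlo estimate of P(total count >= d) when sampling
    n_clusters clusters of size m, with cluster-specific prevalences
    drawn from a beta distribution with mean p and intra-cluster
    correlation icc."""
    rng = random.Random(seed)
    s = (1 - icc) / icc
    a, b = p * s, (1 - p) * s
    exceed = 0
    for _ in range(reps):
        count = 0
        for _ in range(n_clusters):
            pj = rng.betavariate(a, b)   # cluster-specific prevalence
            count += sum(rng.random() < pj for _ in range(m))
        if count >= d:
            exceed += 1
    return exceed / reps
```

    Running this at prevalences below and above the decision threshold traces out the operating characteristic curve; comparing it with the independent-cluster (icc near 0) case shows how intercluster correlation degrades the classification error, as the abstract reports.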

  11. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

    PubMed Central

    Hemming, Karla; Taljaard, Monica

    2016-01-01

    Objectives To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808

  12. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can dramatically impact the classification error associated with LQAS analysis. PMID:20011037

  13. Two-stage sequential sampling: A neighborhood-free adaptive sampling procedure

    USGS Publications Warehouse

    Salehi, M.; Smith, D.R.

    2005-01-01

    Designing an efficient sampling scheme for a rare and clustered population is a challenging area of research. Adaptive cluster sampling, which has been shown to be viable for such a population, is based on sampling a neighborhood of units around a unit that meets a specified condition. However, the edge units produced by sampling neighborhoods have proven to limit the efficiency and applicability of adaptive cluster sampling. We propose a sampling design that is adaptive in the sense that the final sample depends on observed values, but it avoids the use of neighborhoods and the sampling of edge units. Unbiased estimators of population total and its variance are derived using Murthy's estimator. The modified two-stage sampling design is easy to implement and can be applied to a wider range of populations than adaptive cluster sampling. We evaluate the proposed sampling design by simulating sampling of two real biological populations and an artificial population for which the variable of interest took the value either 0 or 1 (e.g., indicating presence and absence of a rare event). We show that the proposed sampling design is more efficient than conventional sampling in nearly all cases. The approach used to derive estimators (Murthy's estimator) opens the door for unbiased estimators to be found for similar sequential sampling designs. © 2005 American Statistical Association and the International Biometric Society.

  14. A field test of three LQAS designs to assess the prevalence of acute malnutrition.

    PubMed

    Deitchler, Megan; Valadez, Joseph J; Egge, Kari; Fernandez, Soledad; Hennigan, Mary

    2007-08-01

    The conventional method for assessing the prevalence of Global Acute Malnutrition (GAM) in emergency settings is the 30 x 30 cluster-survey. This study describes alternative approaches: three Lot Quality Assurance Sampling (LQAS) designs to assess GAM. The LQAS designs were field-tested and their results compared with those from a 30 x 30 cluster-survey. Computer simulations confirmed that small clusters instead of a simple random sample could be used for LQAS assessments of GAM. Three LQAS designs were developed (33 x 6, 67 x 3, Sequential design) to assess GAM thresholds of 10, 15 and 20%. The designs were field-tested simultaneously with a 30 x 30 cluster-survey in Siraro, Ethiopia during June 2003. Using a nested study design, anthropometric, morbidity and vaccination data were collected on all children 6-59 months in sampled households. Hypothesis tests about GAM thresholds were conducted for each LQAS design. Point estimates were obtained for the 30 x 30 cluster-survey and the 33 x 6 and 67 x 3 LQAS designs. Hypothesis tests showed GAM as < 10% for the 33 x 6 design and GAM as ≥ 10% for the 67 x 3 and Sequential designs. Point estimates for the 33 x 6 and 67 x 3 designs were similar to those of the 30 x 30 cluster-survey for GAM (6.7%, CI = 3.2-10.2%; 8.2%, CI = 4.3-12.1%; 7.4%, CI = 4.8-9.9%) and all other indicators. The CIs for the LQAS designs were only slightly wider than the CIs for the 30 x 30 cluster-survey; yet the LQAS designs required substantially less time to administer. The LQAS designs provide statistically appropriate alternatives to the more time-consuming 30 x 30 cluster-survey. However, additional field-testing is needed using independent samples rather than a nested study design.

  15. Precision, time, and cost: a comparison of three sampling designs in an emergency setting.

    PubMed

    Deitchler, Megan; Deconinck, Hedwig; Bergeron, Gilles

    2008-05-02

    The conventional method to collect data on the health, nutrition, and food security status of a population affected by an emergency is a 30 x 30 cluster survey. This sampling method can be time and resource intensive and, accordingly, may not be the most appropriate one when data are needed rapidly for decision making. In this study, we compare the precision, time and cost of the 30 x 30 cluster survey with two alternative sampling designs: a 33 x 6 cluster design (33 clusters, 6 observations per cluster) and a 67 x 3 cluster design (67 clusters, 3 observations per cluster). Data for each sampling design were collected concurrently in West Darfur, Sudan in September-October 2005 in an emergency setting. Results of the study show the 30 x 30 design to provide more precise results (i.e. narrower 95% confidence intervals) than the 33 x 6 and 67 x 3 design for most child-level indicators. Exceptions are indicators of immunization and vitamin A capsule supplementation coverage which show a high intra-cluster correlation. Although the 33 x 6 and 67 x 3 designs provide wider confidence intervals than the 30 x 30 design for child anthropometric indicators, the 33 x 6 and 67 x 3 designs provide the opportunity to conduct an LQAS hypothesis test to detect whether or not a critical threshold of global acute malnutrition prevalence has been exceeded, whereas the 30 x 30 design does not. For the household-level indicators tested in this study, the 67 x 3 design provides the most precise results. However, our results show that neither the 33 x 6 nor the 67 x 3 design is appropriate for assessing indicators of mortality. In this field application, data collection for the 33 x 6 and 67 x 3 designs required substantially less time and cost than that required for the 30 x 30 design. The findings of this study suggest the 33 x 6 and 67 x 3 designs can provide useful time- and resource-saving alternatives to the 30 x 30 method of data collection in emergency settings.

  16. Precision, time, and cost: a comparison of three sampling designs in an emergency setting

    PubMed Central

    Deitchler, Megan; Deconinck, Hedwig; Bergeron, Gilles

    2008-01-01

    The conventional method to collect data on the health, nutrition, and food security status of a population affected by an emergency is a 30 × 30 cluster survey. This sampling method can be time and resource intensive and, accordingly, may not be the most appropriate one when data are needed rapidly for decision making. In this study, we compare the precision, time and cost of the 30 × 30 cluster survey with two alternative sampling designs: a 33 × 6 cluster design (33 clusters, 6 observations per cluster) and a 67 × 3 cluster design (67 clusters, 3 observations per cluster). Data for each sampling design were collected concurrently in West Darfur, Sudan in September-October 2005 in an emergency setting. Results of the study show the 30 × 30 design to provide more precise results (i.e. narrower 95% confidence intervals) than the 33 × 6 and 67 × 3 design for most child-level indicators. Exceptions are indicators of immunization and vitamin A capsule supplementation coverage which show a high intra-cluster correlation. Although the 33 × 6 and 67 × 3 designs provide wider confidence intervals than the 30 × 30 design for child anthropometric indicators, the 33 × 6 and 67 × 3 designs provide the opportunity to conduct an LQAS hypothesis test to detect whether or not a critical threshold of global acute malnutrition prevalence has been exceeded, whereas the 30 × 30 design does not. For the household-level indicators tested in this study, the 67 × 3 design provides the most precise results. However, our results show that neither the 33 × 6 nor the 67 × 3 design is appropriate for assessing indicators of mortality. In this field application, data collection for the 33 × 6 and 67 × 3 designs required substantially less time and cost than that required for the 30 × 30 design. The findings of this study suggest the 33 × 6 and 67 × 3 designs can provide useful time- and resource-saving alternatives to the 30 × 30 method of data collection in emergency settings. PMID:18454866

  17. Estimating regression coefficients from clustered samples: Sampling errors and optimum sample allocation

    NASA Technical Reports Server (NTRS)

    Kalton, G.

    1983-01-01

    A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design, which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratios of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, the optimum allocation of the sample across the stages of the sample design is also determined for the estimation of a regression coefficient.
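
    Two quantities from this line of work can be sketched in closed form: a Kish/Scott-Holt style approximation to the design effect of a regression coefficient under cluster sampling, and the classical optimum cluster subsample size for a simple linear cost function. Both are standard textbook approximations given here as assumptions, not the report's exact formulae.

```python
def regression_deff(m, rho_x, rho_e):
    """Approximate design effect for a regression slope under two-stage
    cluster sampling with m units per cluster (Scott-Holt style
    approximation; assumed form): 1 + (m - 1) * rho_x * rho_e, where
    rho_x and rho_e are the intracluster correlations of the regressor
    and of the residuals."""
    return 1 + (m - 1) * rho_x * rho_e

def optimal_subsample_size(cost_per_cluster, cost_per_unit, rho):
    """Kish-style optimum number of units per cluster for a linear cost
    function C = c1 * clusters + c2 * units (assumed form):
    m_opt = sqrt((c1 / c2) * (1 - rho) / rho)."""
    return ((cost_per_cluster / cost_per_unit) * (1 - rho) / rho) ** 0.5
```

    The slope design effect is usually much closer to 1 than the design effect for a mean, since it involves the product of two intracluster correlations; the optimum subsample size grows as clusters become relatively more expensive to reach and shrinks as the intracluster correlation rises.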

  18. A priori evaluation of two-stage cluster sampling for accuracy assessment of large-area land-cover maps

    USGS Publications Warehouse

    Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.

    2004-01-01

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priori evaluation was used as a decision-making tool when implementing the NLCD design.

  19. The optimal design of stepped wedge trials with equal allocation to sequences and a comparison to other trial designs.

    PubMed

    Thompson, Jennifer A; Fielding, Katherine; Hargreaves, James; Copas, Andrew

    2017-12-01

    Background/Aims We sought to optimise the design of stepped wedge trials with an equal allocation of clusters to sequences and explored sample size comparisons with alternative trial designs. Methods We developed a new expression for the design effect for a stepped wedge trial, assuming that observations are equally correlated within clusters and an equal number of observations in each period between sequences switching to the intervention. We minimised the design effect with respect to (1) the fraction of observations before the first and after the final sequence switches (the periods with all clusters in the control or intervention condition, respectively) and (2) the number of sequences. We compared the design effect of this optimised stepped wedge trial to the design effects of a parallel cluster-randomised trial, a cluster-randomised trial with baseline observations, and a hybrid trial design (a mixture of cluster-randomised trial and stepped wedge trial) with the same total cluster size for all designs. Results We found that a stepped wedge trial with an equal allocation to sequences is optimised by obtaining all observations after the first sequence switches and before the final sequence switches to the intervention; this means that the first sequence remains in the control condition and the last sequence remains in the intervention condition for the duration of the trial. With this design, the optimal number of sequences is [Formula: see text], where [Formula: see text] is the cluster-mean correlation, [Formula: see text] is the intracluster correlation coefficient, and m is the total cluster size. The optimal number of sequences is small when the intracluster correlation coefficient and cluster size are small and large when the intracluster correlation coefficient or cluster size is large. A cluster-randomised trial remains more efficient than the optimised stepped wedge trial when the intracluster correlation coefficient or cluster size is small. 
A cluster-randomised trial with baseline observations always requires a larger sample size than the optimised stepped wedge trial. The hybrid design can always give an equally or more efficient design, but will be at most 5% more efficient. We provide a strategy for selecting a design if the optimal number of sequences is unfeasible. For a non-optimal number of sequences, the sample size may be reduced by allowing a proportion of observations before the first or after the final sequence has switched. Conclusion The standard stepped wedge trial is inefficient. To reduce sample sizes when a hybrid design is unfeasible, stepped wedge trial designs should have no observations before the first sequence switches or after the final sequence switches.
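    The cluster-mean correlation invoked in the abstract above has a standard closed form in the stepped wedge literature. A minimal sketch, assuming the usual definition R = m·ρ / (1 + (m − 1)·ρ), with ρ the intracluster correlation coefficient and m the total cluster size (the paper's own optimal-sequence formula is not reproduced here):

```python
def cluster_mean_correlation(icc: float, m: int) -> float:
    """Cluster-mean correlation R = m*icc / (1 + (m - 1)*icc)."""
    return m * icc / (1.0 + (m - 1) * icc)

# Small ICC and small clusters give R near 0 (few sequences optimal);
# large ICC or large clusters push R towards 1 (many sequences optimal).
for icc, m in [(0.01, 10), (0.05, 50), (0.10, 200)]:
    print(icc, m, round(cluster_mean_correlation(icc, m), 3))
```

    This matches the abstract's qualitative conclusion: the optimal number of sequences grows with the intracluster correlation coefficient and with cluster size.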

  20. The Effect of Cluster Sampling Design in Survey Research on the Standard Error Statistic.

    ERIC Educational Resources Information Center

    Wang, Lin; Fan, Xitao

    Standard statistical methods assume that data are collected under a simple random sampling scheme. These methods, however, tend to underestimate variance, and hence standard errors, when the data are collected with a cluster design, as is common in educational survey research. The purposes of this paper are to demonstrate how a cluster design…
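    The underestimation described above is usually quantified with the standard design effect, deff = 1 + (m − 1)·ICC. A minimal sketch with illustrative numbers (not from the paper):

```python
import math

def design_effect(m: float, icc: float) -> float:
    """Variance inflation from cluster sampling: deff = 1 + (m - 1) * icc."""
    return 1.0 + (m - 1.0) * icc

def effective_n(n_total: float, m: float, icc: float) -> float:
    """Sample size an SRS would need to match the cluster sample's precision."""
    return n_total / design_effect(m, icc)

# 50 schools of 20 pupils with ICC = 0.10: SRS formulas see n = 1000, but the
# data carry only about 1000 / 2.9 units of information, so a naive standard
# error is too small by a factor of sqrt(deff).
print(design_effect(20, 0.10), round(effective_n(1000, 20, 0.10)))
print(round(math.sqrt(design_effect(20, 0.10)), 2))
```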

  1. A Comparison of Single Sample and Bootstrap Methods to Assess Mediation in Cluster Randomized Trials

    ERIC Educational Resources Information Center

    Pituch, Keenan A.; Stapleton, Laura M.; Kang, Joo Youn

    2006-01-01

    A Monte Carlo study examined the statistical performance of single sample and bootstrap methods that can be used to test and form confidence interval estimates of indirect effects in two cluster randomized experimental designs. The designs were similar in that they featured random assignment of clusters to one of two treatment conditions and…

  2. Re-estimating sample size in cluster randomised trials with active recruitment within clusters.

    PubMed

    van Schie, S; Moerbeek, M

    2014-08-30

    Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster level and individual level variance should be known before the study starts, but this is often not the case. We suggest using an internal pilot study design to address this problem of unknown variances. A pilot can be useful to re-estimate the variances and re-calculate the sample size during the trial. Using simulated data, it is shown that an initially low or high power can be adjusted using an internal pilot with the type I error rate remaining within an acceptable range. The intracluster correlation coefficient can be re-estimated with more precision, which has a positive effect on the sample size. We conclude that an internal pilot study design may be used if active recruitment is feasible within a limited number of clusters. Copyright © 2014 John Wiley & Sons, Ltd.
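    The re-estimation step can be sketched under standard assumptions (normal outcome, two arms, design effect 1 + (m − 1)·ICC): with the number of clusters per arm fixed, the internal pilot's updated ICC determines how many participants must be actively recruited in each cluster. The fixed count of 15 clusters per arm and the effect size are hypothetical, not the paper's:

```python
from math import ceil
from statistics import NormalDist

def n_srs_per_arm(delta: float, sigma: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Per-arm n for a two-sample comparison of means under simple random sampling."""
    z = NormalDist().inv_cdf
    return 2 * (sigma * (z(1 - alpha / 2) + z(power)) / delta) ** 2

def cluster_size_needed(n_srs: float, icc: float, k: int) -> int:
    """Recruitment target per cluster, solving k*m >= n_srs * (1 + (m - 1)*icc)
    for m with the number of clusters per arm, k, held fixed."""
    denom = k - n_srs * icc
    if denom <= 0:
        raise ValueError("too few clusters: target power is unreachable for any m")
    return ceil(n_srs * (1 - icc) / denom)

n0 = n_srs_per_arm(delta=0.3, sigma=1.0)   # about 174 per arm before clustering
# The internal pilot re-estimates the ICC; re-solve the per-cluster target
# for the 15 available clusters per arm (all numbers hypothetical).
for icc in (0.02, 0.05):
    print(icc, cluster_size_needed(n0, icc, k=15))
```

    The `denom <= 0` branch reflects a real feature of this design: with too few clusters, no amount of within-cluster recruitment reaches the target power.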

  3. Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries

    USGS Publications Warehouse

    McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.

    2013-01-01

    Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
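    A toy Monte Carlo in the spirit of this study, comparing simple random and one-start systematic estimates of total daily angling effort over a synthetic day with a midday peak (all values hypothetical, not the study's data):

```python
import random
from statistics import mean

random.seed(1)

# 48 half-hour instantaneous angler counts with a midday peak in effort.
pop = [max(0.0, 30 * (1 - abs(t - 24) / 24) + random.gauss(0, 3)) for t in range(48)]
true_total = sum(pop)

def srs_total(n=8):
    """Expand a simple random sample of counts to a daily total."""
    idx = random.sample(range(48), n)
    return 48 / n * sum(pop[i] for i in idx)

def sys_total(n=8):
    """Expand a one-start systematic sample (every k-th count)."""
    k = 48 // n
    start = random.randrange(k)
    return k * sum(pop[start::k])

def mse(estimates):
    return mean((e - true_total) ** 2 for e in estimates)

reps = 2000
srs_mse = mse([srs_total() for _ in range(reps)])
sys_mse = mse([sys_total() for _ in range(reps)])
print(round(srs_mse), round(sys_mse))  # systematic is markedly more precise here
```

    Because every systematic sample spreads across the whole diurnal trend, its MSE here is far smaller, consistent with the study's finding that systematic random sampling was the most precise design.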

  4. Optimal design of a plot cluster for monitoring

    Treesearch

    Charles T. Scott

    1993-01-01

    Traveling costs incurred during extensive forest surveys make cluster sampling cost-effective. Clusters are specified by the type of plots, plot size, number of plots, and the distance between plots within the cluster. A method to determine the optimal cluster design when different plot types are used for different forest resource attributes is described. The method...

  5. Intra-class correlation estimates for assessment of vitamin A intake in children.

    PubMed

    Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D

    2005-03-01

    In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of the intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variance inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
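    The single-level version of the estimator underlying such ICC estimates is the moment estimator from a one-way random-effects ANOVA. A minimal sketch with toy data and equal cluster sizes (the study itself used a three-level model):

```python
from statistics import mean

def anova_icc(clusters):
    """Moment estimator of the ICC from a one-way random-effects ANOVA with
    equal cluster sizes: icc = (MSB - MSW) / (MSB + (m - 1) * MSW)."""
    k = len(clusters)                       # number of clusters
    m = len(clusters[0])                    # observations per cluster
    grand = mean(x for c in clusters for x in c)
    msb = m * sum((mean(c) - grand) ** 2 for c in clusters) / (k - 1)
    msw = sum((x - mean(c)) ** 2 for c in clusters for x in c) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Toy example: three villages, four children each (hypothetical intake scores).
villages = [[4, 5, 5, 6], [8, 9, 9, 10], [2, 3, 3, 4]]
print(round(anova_icc(villages), 3))
```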

  6. Relative efficiency and sample size for cluster randomized trials with variable cluster sizes.

    PubMed

    You, Zhiying; Williams, O Dale; Aban, Inmaculada; Kabagambe, Edmond Kato; Tiwari, Hemant K; Cutter, Gary

    2011-02-01

    The statistical power of cluster randomized trials depends on two sample size components, the number of clusters per group and the numbers of individuals within clusters (cluster size). Variable cluster sizes are common and this variation alone may have significant impact on study power. Previous approaches have taken this into account by either adjusting total sample size using a designated design effect or adjusting the number of clusters according to an assessment of the relative efficiency of unequal versus equal cluster sizes. This article defines a relative efficiency of unequal versus equal cluster sizes using noncentrality parameters, investigates properties of this measure, and proposes an approach for adjusting the required sample size accordingly. We focus on comparing two groups with normally distributed outcomes using t-test, and use the noncentrality parameter to define the relative efficiency of unequal versus equal cluster sizes and show that statistical power depends only on this parameter for a given number of clusters. We calculate the sample size required for an unequal cluster sizes trial to have the same power as one with equal cluster sizes. Relative efficiency based on the noncentrality parameter is straightforward to calculate and easy to interpret. It connects the required mean cluster size directly to the required sample size with equal cluster sizes. Consequently, our approach first determines the sample size requirements with equal cluster sizes for a pre-specified study power and then calculates the required mean cluster size while keeping the number of clusters unchanged. Our approach allows adjustment in mean cluster size alone or simultaneous adjustment in mean cluster size and number of clusters, and is a flexible alternative to and a useful complement to existing methods. Comparison indicated that we have defined a relative efficiency that is greater than the relative efficiency in the literature under some conditions. 
Under other conditions, our measure may instead be smaller than the measure in the literature, so the two measures can disagree in either direction. The relative efficiency of unequal versus equal cluster sizes defined using the noncentrality parameter suggests a sample size approach that is a flexible alternative and a useful complement to existing methods.
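    One widely used information-based formulation of this relative efficiency, not necessarily identical to the paper's noncentrality-parameter definition, can be sketched as the ratio of total effective information under unequal versus equal cluster sizes:

```python
def info(m: float, icc: float) -> float:
    """Effective information contributed by one cluster of size m."""
    return m / (1.0 + (m - 1) * icc)

def relative_efficiency(sizes, icc: float) -> float:
    """Information ratio of unequal vs equal cluster sizes with the same
    number of clusters and the same mean cluster size (<= 1 by concavity)."""
    m_bar = sum(sizes) / len(sizes)
    return sum(info(m, icc) for m in sizes) / (len(sizes) * info(m_bar, icc))

sizes = [5, 10, 15, 20, 50]
for icc in (0.01, 0.05, 0.20):
    print(icc, round(relative_efficiency(sizes, icc), 3))
```

    Dividing the equal-cluster-size sample size by this ratio is one simple way to compensate for size variation while keeping the number of clusters fixed, in the spirit of the paper's approach.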

  7. Efficient design of cluster randomized trials with treatment-dependent costs and treatment-dependent unknown variances.

    PubMed

    van Breukelen, Gerard J P; Candel, Math J J M

    2018-06-10

    Cluster randomized trials evaluate the effect of a treatment on persons nested within clusters, where treatment is randomly assigned to clusters. Current equations for the optimal sample size at the cluster and person level assume that the outcome variances and/or the study costs are known and homogeneous between treatment arms. This paper presents efficient yet robust designs for cluster randomized trials with treatment-dependent costs and treatment-dependent unknown variances, and compares these with 2 practical designs. First, the maximin design (MMD) is derived, which maximizes the minimum efficiency (minimizes the maximum sampling variance) of the treatment effect estimator over a range of treatment-to-control variance ratios. The MMD is then compared with the optimal design for homogeneous variances and costs (balanced design), and with that for homogeneous variances and treatment-dependent costs (cost-considered design). The results show that the balanced design is the MMD if the treatment-to-control cost ratio is the same at both design levels (cluster, person) and within the range for the treatment-to-control variance ratio. It remains highly efficient and better than the cost-considered design if the cost ratio is within the range for the squared variance ratio. Outside that range, the cost-considered design is better and highly efficient, but it is not the MMD. An example shows sample size calculation for the MMD, and the computer code (SPSS and R) is provided as supplementary material. The MMD is recommended for trial planning if the study costs are treatment-dependent and homogeneity of variances cannot be assumed. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  8. Estimation after classification using lot quality assurance sampling: corrections for curtailed sampling with application to evaluating polio vaccination campaigns.

    PubMed

    Olives, Casey; Valadez, Joseph J; Pagano, Marcello

    2014-03-01

    To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduce the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but that these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
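    The designs cited above (sample size 60 with decision rules 9, 21 and 33) are classic binomial LQAS rules. Their operating characteristics can be sketched under the independence assumption; here, as one common convention, a lot is classified acceptable when the number of unvaccinated children does not exceed the decision rule d (clustering, as the abstract notes, inflates standard errors but does not bias these estimators):

```python
from math import comb

def accept_prob(n: int, d: int, p_fail: float) -> float:
    """P(classify as acceptable) = P(Bin(n, p_fail) <= d), i.e. at most d
    unvaccinated children among n sampled."""
    return sum(comb(n, k) * p_fail**k * (1 - p_fail) ** (n - k) for k in range(d + 1))

# One of the field-tested polio designs: n = 60, decision rule d = 9.
for coverage in (0.95, 0.90, 0.80, 0.70):
    print(coverage, round(accept_prob(60, 9, 1 - coverage), 3))
```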

  9. Finding gene clusters for a replicated time course study

    PubMed Central

    2014-01-01

    Background Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. Findings In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656

  10. Sample size calculations for the design of cluster randomized trials: A summary of methodology.

    PubMed

    Gao, Fei; Earnest, Arul; Matchar, David B; Campbell, Michael J; Machin, David

    2015-05-01

    Cluster randomized trial designs are growing in popularity in, for example, cardiovascular medicine research and other clinical areas and parallel statistical developments concerned with the design and analysis of these trials have been stimulated. Nevertheless, reviews suggest that design issues associated with cluster randomized trials are often poorly appreciated and there remain inadequacies in, for example, describing how the trial size is determined and the associated results are presented. In this paper, our aim is to provide pragmatic guidance for researchers on the methods of calculating sample sizes. We focus attention on designs with the primary purpose of comparing two interventions with respect to continuous, binary, ordered categorical, incidence rate and time-to-event outcome variables. Issues of aggregate and non-aggregate cluster trials, adjustment for variation in cluster size and the effect size are detailed. The problem of establishing the anticipated magnitude of between- and within-cluster variation to enable planning values of the intra-cluster correlation coefficient and the coefficient of variation are also described. Illustrative examples of calculations of trial sizes for each endpoint type are included. Copyright © 2015 Elsevier Inc. All rights reserved.
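    A pragmatic calculation of the kind the paper summarises, for a continuous outcome: compute the individually randomised sample size from the normal approximation, inflate by the design effect, and convert to clusters. Parameter values below are illustrative, not the paper's:

```python
from math import ceil
from statistics import NormalDist

def clusters_per_arm(delta: float, sigma: float, m: int, icc: float,
                     alpha: float = 0.05, power: float = 0.8) -> int:
    """Clusters per arm for a two-arm parallel cluster randomized trial:
    individually randomised n per arm, inflated by deff = 1 + (m-1)*icc,
    then split into clusters of size m."""
    z = NormalDist().inv_cdf
    n_ind = 2 * (sigma * (z(1 - alpha / 2) + z(power)) / delta) ** 2
    deff = 1 + (m - 1) * icc
    return ceil(n_ind * deff / m)

# e.g. standardised effect 0.25, 30 patients per practice, ICC 0.02
print(clusters_per_arm(delta=0.25, sigma=1.0, m=30, icc=0.02))
```

    Refinements detailed in the paper, such as adjustment for variable cluster size or non-continuous endpoints, modify the inflation factor but keep this overall structure.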

  11. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

    PubMed

    Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra

    2016-11-20

    The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
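    For the repeated cross-section case with cluster autocorrelation equal to 1, these calculations reduce to the well-known Hussey and Hughes (2007) variance formula, sketched below. The design matrix, variance components and cluster-period size are illustrative assumptions:

```python
def hh_variance(X, sigma2_e: float, tau2: float, n: int) -> float:
    """Treatment-effect variance for a repeated cross-section stepped wedge
    trial under the Hussey & Hughes (2007) model. X[i][t] is the 0/1
    intervention indicator for cluster i in period t; n is the number of
    observations per cluster-period; sigma2_e the individual-level variance;
    tau2 the between-cluster variance."""
    I, T = len(X), len(X[0])
    s2 = sigma2_e / n                      # variance of a cluster-period mean
    U = sum(sum(row) for row in X)
    W = sum(sum(X[i][t] for i in range(I)) ** 2 for t in range(T))
    V = sum(sum(row) ** 2 for row in X)
    num = I * s2 * (s2 + T * tau2)
    den = (I * U - W) * s2 + (U**2 + I * T * U - T * W - I * V) * tau2
    return num / den

# Classic stepped wedge: 4 sequences of 1 cluster, 5 periods, staircase rollout.
X = [[0] * (i + 1) + [1] * (4 - i) for i in range(4)]
print(hh_variance(X, sigma2_e=1.0, tau2=0.05, n=20))
```

    Power then follows from the usual normal approximation with this variance; replicating the design over more clusters scales the variance down proportionally.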

  12. Systematic review finds major deficiencies in sample size methodology and reporting for stepped-wedge cluster randomised trials

    PubMed Central

    Martin, James; Taljaard, Monica; Girling, Alan; Hemming, Karla

    2016-01-01

    Background Stepped-wedge cluster randomised trials (SW-CRT) are increasingly being used in health policy and services research, but unless they are conducted and reported to the highest methodological standards, they are unlikely to be useful to decision-makers. Sample size calculations for these designs require allowance for clustering, time effects and repeated measures. Methods We carried out a methodological review of SW-CRTs up to October 2014. We assessed adherence to reporting each of the 9 sample size calculation items recommended in the 2012 extension of the CONSORT statement to cluster trials. Results We identified 32 completed trials and 28 independent protocols published between 1987 and 2014. Of these, 45 (75%) reported a sample size calculation, with a median of 5.0 (IQR 2.5–6.0) of the 9 CONSORT items reported. Of those that reported a sample size calculation, the majority, 33 (73%), allowed for clustering, but just 15 (33%) allowed for time effects. There was a small increase in the proportions reporting a sample size calculation (from 64% before to 84% after publication of the CONSORT extension, p=0.07). The type of design (cohort or cross-sectional) was not reported clearly in the majority of studies, but cohort designs seemed to be most prevalent. Sample size calculations in cohort designs were particularly poor with only 3 out of 24 (13%) of these studies allowing for repeated measures. Discussion The quality of reporting of sample size items in stepped-wedge trials is suboptimal. There is an urgent need for dissemination of the appropriate guidelines for reporting and methodological development to match the proliferation of the use of this design in practice. Time effects and repeated measures should be considered in all SW-CRT power calculations, and there should be clarity in reporting trials as cohort or cross-sectional designs. PMID:26846897

  13. Sample size calculations for cluster randomised crossover trials in Australian and New Zealand intensive care research.

    PubMed

    Arnup, Sarah J; McKenzie, Joanne E; Pilcher, David; Bellomo, Rinaldo; Forbes, Andrew B

    2018-06-01

    The cluster randomised crossover (CRXO) design provides an opportunity to conduct randomised controlled trials to evaluate low risk interventions in the intensive care setting. Our aim is to provide a tutorial on how to perform a sample size calculation for a CRXO trial, focusing on the meaning of the elements required for the calculations, with application to intensive care trials. We use all-cause in-hospital mortality from the Australian and New Zealand Intensive Care Society Adult Patient Database clinical registry to illustrate the sample size calculations. We show sample size calculations for a two-intervention, two 12-month period, cross-sectional CRXO trial. We provide the formulae, and examples of their use, to determine the number of intensive care units required to detect a risk ratio (RR) with a designated level of power between two interventions for trials in which the elements required for sample size calculations remain constant across all ICUs (unstratified design); and in which there are distinct groups (strata) of ICUs that differ importantly in the elements required for sample size calculations (stratified design). The CRXO design markedly reduces the sample size requirement compared with the parallel-group, cluster randomised design for the example cases. The stratified design further reduces the sample size requirement compared with the unstratified design. The CRXO design enables the evaluation of routinely used interventions that can bring about small, but important, improvements in patient care in the intensive care setting.

  14. Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials.

    PubMed

    Kasza, J; Hemming, K; Hooper, R; Matthews, Jns; Forbes, A B

    2017-01-01

    Stepped wedge and cluster randomised crossover trials are examples of cluster randomised designs conducted over multiple time periods that are being used with increasing frequency in health research. Recent systematic reviews of both of these designs indicate that the within-cluster correlation is typically taken account of in the analysis of data using a random intercept mixed model, implying a constant correlation between any two individuals in the same cluster no matter how far apart in time they are measured: within-period and between-period intra-cluster correlations are assumed to be identical. Recently proposed extensions allow the within- and between-period intra-cluster correlations to differ, although these methods require that all between-period intra-cluster correlations are identical, which may not be appropriate in all situations. Motivated by a proposed intensive care cluster randomised trial, we propose an alternative correlation structure for repeated cross-sectional multiple-period cluster randomised trials in which the between-period intra-cluster correlation is allowed to decay depending on the distance between measurements. We present results for the variance of treatment effect estimators for varying amounts of decay, investigating the consequences of the variation in decay on sample size planning for stepped wedge, cluster crossover and multiple-period parallel-arm cluster randomised trials. We also investigate the impact of assuming constant between-period intra-cluster correlations instead of decaying between-period intra-cluster correlations. Our results indicate that in certain design configurations, including the one corresponding to the proposed trial, a correlation decay can have an important impact on variances of treatment effect estimators, and hence on sample size and power. An R Shiny app allows readers to interactively explore the impact of correlation decay.

  15. Precision of systematic and random sampling in clustered populations: habitat patches and aggregating organisms.

    PubMed

    McGarvey, Richard; Burch, Paul; Matthews, Janet M

    2016-01-01

    Natural populations of plants and animals spatially cluster because (1) suitable habitat is patchy, and (2) within suitable habitat, individuals aggregate further into clusters of higher density. We compare the precision of random and systematic field sampling survey designs under these two processes of species clustering. Second, we evaluate the performance of 13 estimators for the variance of the sample mean from a systematic survey. Replicated simulated surveys, as counts from 100 transects, allocated either randomly or systematically within the study region, were used to estimate population density in six spatial point populations including habitat patches and Matérn circular clustered aggregations of organisms, together and in combination. The standard one-start aligned systematic survey design, a uniform 10 x 10 grid of transects, was much more precise. Variances of the 10 000 replicated systematic survey mean densities were one-third to one-fifth of those from randomly allocated transects, implying transect sample sizes giving equivalent precision by random survey would need to be three to five times larger. Organisms being restricted to patches of habitat was alone sufficient to yield this precision advantage for the systematic design. But this improved precision for systematic sampling in clustered populations is underestimated by standard variance estimators used to compute confidence intervals. True variance for the survey sample mean was computed from the variance of 10 000 simulated survey mean estimates. Testing 10 published and three newly proposed variance estimators, the two variance estimators (v) that corrected for inter-transect correlation (ν₈ and ν(W)) were the most accurate and also the most precise in clustered populations. These greatly outperformed the two "post-stratification" variance estimators (ν₂ and ν₃) that are now more commonly applied in systematic surveys. 
Similar variance estimator performance rankings were found with a second differently generated set of spatial point populations, ν₈ and ν(W) again being the best performers in the longer-range autocorrelated populations. However, no systematic variance estimators tested were free from bias. On balance, systematic designs bring narrower confidence intervals in clustered populations, while random designs permit unbiased estimates of (often wider) confidence intervals. The search continues for better estimators of sampling variance for the systematic survey mean.
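    The precision advantage of systematic sampling in patchy populations can be reproduced exactly on a toy one-dimensional population (hypothetical counts): the true variance of the systematic sample mean is computed over all k equally likely starts and compared with the SRS variance that the standard estimator targets:

```python
from statistics import mean, pvariance, variance

# Deterministic patchy population: counts are zero outside two habitat patches
# (hypothetical numbers). N = 100 transect positions, sample n = 10.
N, n = 100, 10
pop = [0] * N
for j in list(range(20, 36)) + list(range(60, 81)):
    pop[j] = 5 + (j % 3)          # some within-patch variation

true_mean = mean(pop)
k = N // n                        # 10 equally likely systematic starts

# Exact variance of the one-start systematic sample mean over all starts.
sys_means = [mean(pop[start::k]) for start in range(k)]
true_sys_var = pvariance(sys_means)

# Variance of an SRS mean of the same size (finite-population formula with
# S^2 on an N-1 denominator), the quantity standard estimators target.
srs_var = (1 - n / N) * variance(pop) / n

print(round(true_sys_var, 3), round(srs_var, 3))
```

    Because every systematic sample spreads across both patches, its true variance is well below the SRS variance, so an SRS-formula variance estimate applied to a systematic sample is conservative, as the abstract reports.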

  16. On the Analysis of Case-Control Studies in Cluster-correlated Data Settings.

    PubMed

    Haneuse, Sebastien; Rivera-Rodriguez, Claudia

    2018-01-01

    In resource-limited settings, long-term evaluation of national antiretroviral treatment (ART) programs often relies on aggregated data, the analysis of which may be subject to ecological bias. As researchers and policy makers consider evaluating individual-level outcomes such as treatment adherence or mortality, the well-known case-control design is appealing in that it provides efficiency gains over random sampling. In the context that motivates this article, valid estimation and inference requires acknowledging any clustering, although, to our knowledge, no statistical methods have been published for the analysis of case-control data for which the underlying population exhibits clustering. Furthermore, in the specific context of an ongoing collaboration in Malawi, rather than performing case-control sampling across all clinics, case-control sampling within clinics has been suggested as a more practical strategy. To our knowledge, although similar outcome-dependent sampling schemes have been described in the literature, a case-control design specific to correlated data settings is new. In this article, we describe this design, discuss balanced versus unbalanced sampling techniques, and provide a general approach to analyzing case-control studies in cluster-correlated settings based on inverse probability-weighted generalized estimating equations. Inference is based on a robust sandwich estimator with correlation parameters estimated to ensure appropriate accounting of the outcome-dependent sampling scheme. We conduct comprehensive simulations, based in part on real data on a sample of N = 78,155 program registrants in Malawi between 2005 and 2007, to evaluate small-sample operating characteristics and potential trade-offs associated with standard case-control sampling or when case-control sampling is performed within clusters.

  17. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: application to studies of near-universal interventions in intensive care.

    PubMed

    Forbes, Andrew B; Akram, Muhammad; Pilcher, David; Cooper, Jamie; Bellomo, Rinaldo

    2015-02-01

    Cluster randomised crossover trials have been utilised in recent years in the health and social sciences. Methods for analysis have been proposed; however, for binary outcomes, these have received little assessment of their appropriateness. In addition, methods for determination of sample size are currently limited to balanced cluster sizes both between clusters and between periods within clusters. This article aims to extend this work to unbalanced situations and to evaluate the properties of a variety of methods for analysis of binary data, with a particular focus on the setting of potential trials of near-universal interventions in intensive care to reduce in-hospital mortality. We derive a formula for sample size estimation for unbalanced cluster sizes, and apply it to the intensive care setting to demonstrate the utility of the cluster crossover design. We conduct a numerical simulation of the design in the intensive care setting and for more general configurations, and we assess the performance of three cluster summary estimators and an individual-data estimator based on binomial-identity-link regression. For settings similar to the intensive care scenario involving large cluster sizes and small intra-cluster correlations, the sample size formulae developed and analysis methods investigated are found to be appropriate, with the unweighted cluster summary method performing well relative to the more optimal but more complex inverse-variance weighted method. More generally, we find that the unweighted and cluster-size-weighted summary methods perform well, with the relative efficiency of each largely determined systematically from the study design parameters. Performance of individual-data regression is adequate with small cluster sizes but becomes inefficient for large, unbalanced cluster sizes. 
When outcome prevalences are 6% or less and the within-cluster-within-period correlation is 0.05 or larger, all methods display sub-nominal confidence interval coverage, with the less prevalent the outcome the worse the coverage. As with all simulation studies, conclusions are limited to the configurations studied. We confined attention to detecting intervention effects on an absolute risk scale using marginal models and did not explore properties of binary random effects models. Cluster crossover designs with binary outcomes can be analysed using simple cluster summary methods, and sample size in unbalanced cluster size settings can be determined using relatively straightforward formulae. However, caution needs to be applied in situations with low prevalence outcomes and moderate to high intra-cluster correlations. © The Author(s) 2014.

  18. An adaptive two-stage sequential design for sampling rare and clustered populations

    USGS Publications Warehouse

    Brown, J.A.; Salehi, M.M.; Moradi, M.; Bell, G.; Smith, D.R.

    2008-01-01

    How to design an efficient large-area survey continues to be an interesting question for ecologists. In sampling large areas, as is common in environmental studies, adaptive sampling can be efficient because it ensures survey effort is targeted to subareas of high interest. In two-stage sampling, higher density primary sample units are usually of more interest than lower density primary units when populations are rare and clustered. Two-stage sequential sampling has been suggested as a method for allocating second stage sample effort among primary units. Here, we suggest a modification: adaptive two-stage sequential sampling. In this method, the adaptive part of the allocation process means the design is more flexible in how much extra effort can be directed to higher-abundance primary units. We discuss how best to design an adaptive two-stage sequential sample. © 2008 The Society of Population Ecology and Springer.
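    The general idea of the design can be sketched as follows: every primary unit (PSU) receives a small initial second-stage sample, and PSUs whose initial sample detects the rare, clustered species receive extra survey effort. The population, detection rule, and effort levels below are illustrative assumptions, not the authors' exact allocation scheme.

```python
# Sketch: adaptive allocation of second-stage effort in two-stage sampling
# of a rare, clustered population (simulated population, assumed rules).
import numpy as np

rng = np.random.default_rng(7)
n_psu, n_ssu = 100, 50
occupied = rng.random(n_psu) < 0.1      # rare: ~10% of PSUs occupied
# clustered: occupied PSUs hold many individuals, the rest hold none
counts = np.where(occupied[:, None], rng.poisson(3.0, (n_psu, n_ssu)), 0)

m0, m_extra = 5, 15                     # initial and follow-up effort per PSU
effort = np.full(n_psu, m0)
for i in range(n_psu):
    initial = rng.choice(n_ssu, m0, replace=False)
    if counts[i, initial].sum() > 0:    # detection triggers extra effort
        effort[i] += m_extra
print(effort.sum(), effort[occupied].mean(), effort[~occupied].mean())
```

    The point of the sketch is that survey effort concentrates in high-abundance PSUs, while empty PSUs receive only the small initial sample.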

  19. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali.

    PubMed

    Minetti, Andrea; Riera-Montes, Margarita; Nackers, Fabienne; Roederer, Thomas; Koudika, Marie Hortense; Sekkenes, Johanne; Taconet, Aurore; Fermon, Florence; Touré, Albouhary; Grais, Rebecca F; Checchi, Francesco

    2012-10-12

    Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local vaccination coverage (VC), using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes.
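    The kind of misclassification risk reported here can be estimated by simulation: draw c clusters of m children with between-cluster variation in coverage (beta-distributed, tuned to a chosen ICC), apply a decision rule on the total unvaccinated count, and see how often the area is classified on the wrong side of the target. The plan below (c = 10, m = 3, threshold d = 4, target 80%) and all parameters are illustrative assumptions, not the Mali survey's actual sampling plans.

```python
# Sketch: simulating the misclassification risk of a CLQAS sampling plan
# (assumed plan and parameters; beta-binomial clustering model).
import numpy as np

def misclass_risk(true_vc, icc, c, m, d, low=0.80, sims=10_000, rng=None):
    """P(misclassifying the area) at true coverage true_vc.
    Rule: accept coverage as adequate if unvaccinated count <= d."""
    rng = rng or np.random.default_rng(6)
    s = (1 - icc) / icc                  # beta a+b giving the chosen ICC
    p = rng.beta(true_vc * s, (1 - true_vc) * s, size=(sims, c))
    unvax = rng.binomial(m, 1 - p).sum(axis=1)
    reject = unvax > d                   # classify as below target
    # misclassification: rejecting an area whose coverage meets the target,
    # or accepting one whose coverage falls below it
    return reject.mean() if true_vc >= low else 1 - reject.mean()

risk = misclass_risk(true_vc=0.90, icc=0.10, c=10, m=3, d=4)
print(round(risk, 3))
```

    With clustering present, small plans such as 10 × 3 leave substantial misclassification probability, which is consistent with the abstract's caution about CLQAS as currently formulated.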

  20. An imbalance in cluster sizes does not lead to notable loss of power in cross-sectional, stepped-wedge cluster randomised trials with a continuous outcome.

    PubMed

    Kristunas, Caroline A; Smith, Karen L; Gray, Laura J

    2017-03-07

    The current methodology for sample size calculations for stepped-wedge cluster randomised trials (SW-CRTs) is based on the assumption of equal cluster sizes. However, as is often the case in cluster randomised trials (CRTs), the clusters in SW-CRTs are likely to vary in size, which in other designs of CRT leads to a reduction in power. The effect of an imbalance in cluster size on the power of SW-CRTs has not previously been reported, nor what an appropriate adjustment to the sample size calculation should be to allow for any imbalance. We aimed to assess the impact of an imbalance in cluster size on the power of a cross-sectional SW-CRT and recommend a method for calculating the sample size of a SW-CRT when there is an imbalance in cluster size. The effect of varying degrees of imbalance in cluster size on the power of SW-CRTs was investigated using simulations. The sample size was calculated using both the standard method and two proposed adjusted design effects (DEs), based on those suggested for CRTs with unequal cluster sizes. The data were analysed using generalised estimating equations with an exchangeable correlation matrix and robust standard errors. An imbalance in cluster size was not found to have a notable effect on the power of SW-CRTs. The two proposed adjusted DEs resulted in trials that were generally considerably over-powered. We recommend that the standard method of sample size calculation for SW-CRTs be used, provided that the assumptions of the method hold. However, it would be beneficial to investigate, through simulation, what effect the maximum likely amount of inequality in cluster sizes would be on the power of the trial and whether any inflation of the sample size would be required.

  1. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap.

    PubMed

    Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E

    2016-06-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.

  2. Multiple Imputation in Two-Stage Cluster Samples Using The Weighted Finite Population Bayesian Bootstrap

    PubMed Central

    Zhou, Hanzhi; Elliott, Michael R.; Raghunathan, Trivellore E.

    2017-01-01

    Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in “Delta-V,” a key crash severity measure. PMID:29226161
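    The core of the method is a weighted Polya-urn resampling step: starting from a sample of n units whose case weights sum to the population size N, the remaining N − n units of a synthetic population are drawn one at a time with probabilities that blend the weights and the counts drawn so far. The selection rule used below, w_i − 1 + l_i·(N − n)/n, follows one published formulation of the weighted finite-population Bayesian bootstrap and is an assumption of this sketch, as are all data values; the two-stage variants described in the abstract are not shown.

```python
# Sketch: weighted Polya-urn generation of one synthetic population
# (assumed selection rule and illustrative data; not the authors' code).
import numpy as np

def wfpbb_population(y, w, rng):
    n = len(y)
    N = int(round(w.sum()))             # implied population size
    l = np.zeros(n)                     # times each unit has been redrawn
    draws = []
    for _ in range(N - n):
        p = w - 1 + l * (N - n) / n     # assumed Polya-urn selection rule
        p = np.clip(p, 0, None)
        p /= p.sum()
        i = rng.choice(n, p=p)
        l[i] += 1
        draws.append(y[i])
    return np.concatenate([y, np.array(draws)])

rng = np.random.default_rng(5)
y = rng.normal(0, 1, 50)                # observed sample
w = np.full(50, 4.0)                    # case weights summing to N = 200
pop = wfpbb_population(y, w, rng)
print(len(pop), pop.mean())
```

    The synthetic population can then be treated as a simple random sample at the imputation stage, which is the point of the method: design features no longer need to be modelled directly in the imputation model.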

  3. Adaptive sampling in research on risk-related behaviors.

    PubMed

    Thompson, Steven K; Collins, Linda M

    2002-11-01

    This article introduces adaptive sampling designs to substance use researchers. Adaptive sampling is particularly useful when the population of interest is rare, unevenly distributed, hidden, or hard to reach. Examples of such populations are injection drug users, individuals at high risk for HIV/AIDS, and young adolescents who are nicotine dependent. In conventional sampling, the sampling design is based entirely on a priori information, and is fixed before the study begins. By contrast, in adaptive sampling, the sampling design adapts based on observations made during the survey; for example, drug users may be asked to refer other drug users to the researcher. In the present article several adaptive sampling designs are discussed. Link-tracing designs such as snowball sampling, random walk methods, and network sampling are described, along with adaptive allocation and adaptive cluster sampling. It is stressed that special estimation procedures taking the sampling design into account are needed when adaptive sampling has been used. These procedures yield estimates that are considerably better than conventional estimates. For rare and clustered populations adaptive designs can give substantial gains in efficiency over conventional designs, and for hidden populations link-tracing and other adaptive procedures may provide the only practical way to obtain a sample large enough for the study objectives.

  4. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design

    PubMed Central

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R.

    2018-01-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228–256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody–antigen complexes, using two design strategies—optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. 
For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody–antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters. PMID:29702641

  5. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design.

    PubMed

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Weitzner, Brian D; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R; Dunbrack, Roland L

    2018-04-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228-256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody-antigen complexes, using two design strategies-optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. 
For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody-antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters.

  6. Performance of small cluster surveys and the clustered LQAS design to estimate local-level vaccination coverage in Mali

    PubMed Central

    2012-01-01

    Background Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. Methods We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local vaccination coverage (VC), using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. Results VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Conclusions Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes. PMID:23057445

  7. An improved initialization center k-means clustering algorithm based on distance and density

    NASA Astrophysics Data System (ADS)

    Duan, Yanling; Liu, Qun; Xia, Shuyin

    2018-04-01

    To address the problem that the random initial cluster centers of the k-means algorithm leave clustering results sensitive to outlier data samples and unstable across repeated runs, a center initialization method based on larger distance and higher density is proposed. The reciprocal of the weighted average distance is used to represent sample density, and the data samples with larger distance and higher density are selected as the initial cluster centers to improve the clustering results. A clustering evaluation method based on distance and density is then designed to verify the feasibility and practicality of the algorithm; experimental results on UCI data sets show that the algorithm is stable and practical.
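    A minimal sketch of one plausible reading of this initialization: take density as the reciprocal of a sample's mean distance to the other points, pick the densest point as the first center, then repeatedly pick the point maximizing density × distance to the nearest already-chosen center. The scoring rule and data are illustrative assumptions, not the authors' implementation.

```python
# Sketch: distance-and-density initialization of k-means centers
# (assumed scoring rule; toy data with three well-separated clusters).
import numpy as np

def init_centers(X: np.ndarray, k: int) -> np.ndarray:
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise dists
    density = 1.0 / (d.mean(axis=1) + 1e-12)    # reciprocal mean distance
    centers = [int(np.argmax(density))]         # densest point first
    for _ in range(1, k):
        # score favors points that are both dense and far from chosen centers
        dist_to_centers = d[:, centers].min(axis=1)
        score = density * dist_to_centers
        centers.append(int(np.argmax(score)))
    return X[centers]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2))
               for loc in ((0, 0), (5, 5), (0, 5))])
C = init_centers(X, 3)
print(C)      # one initial center lands near each true cluster
```

    Because the selection is deterministic given the data, repeated runs start from the same centers, which addresses the instability the abstract describes.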

  8. Evaluation of Primary Immunization Coverage of Infants Under Universal Immunization Programme in an Urban Area of Bangalore City Using Cluster Sampling and Lot Quality Assurance Sampling Techniques

    PubMed Central

    K, Punith; K, Lalitha; G, Suman; BS, Pradeep; Kumar K, Jayanth

    2008-01-01

    Research Question: Is the LQAS technique better than the cluster sampling technique in terms of resources to evaluate the immunization coverage in an urban area? Objective: To assess and compare lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Study Design: Population-based cross-sectional study. Study Setting: Areas under Mathikere Urban Health Center. Study Subjects: Children aged 12 months to 23 months. Sample Size: 220 in cluster sampling, 76 in lot quality assurance sampling. Statistical Analysis: Percentages and proportions, Chi-square test. Results: (1) Using cluster sampling, the percentage of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, it was 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by the cluster sampling technique were not statistically different from the coverage value as obtained by the lot quality assurance sampling technique. Considering the time and resources required, it was found that lot quality assurance sampling is a better technique for evaluating primary immunization coverage in an urban area. PMID:19876474

  9. Procedures to handle inventory cluster plots that straddle two or more conditions

    Treesearch

    Jerold T. Hahn; Colin D. MacLean; Stanford L. Arner; William A. Bechtold

    1995-01-01

    We review the relative merits and field procedures for four basic plot designs to handle forest inventory plots that straddle two or more conditions, given that subplots will not be moved. A cluster design is recommended that combines fixed-area subplots and variable-radius plot (VRP) sampling. Each subplot in a cluster consists of a large fixed-area subplot for...

  10. Declustering of clustered preferential sampling for histogram and semivariogram inference

    USGS Publications Warehouse

    Olea, R.A.

    2007-01-01

    Measurements of attributes obtained more as a consequence of business ventures than sampling design frequently result in samplings that are preferential both in location and value, typically in the form of clusters along the pay. Preferential sampling requires preprocessing for the purpose of properly inferring characteristics of the parent population, such as the cumulative distribution and the semivariogram. Consideration of the distance to the nearest neighbor allows preparation of resampled sets that produce comparable results to those from previously proposed methods. Clustered sampling of size 140, taken from an exhaustive sampling, is employed to illustrate this approach. © International Association for Mathematical Geology 2007.
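    The nearest-neighbor idea can be sketched as a declustering weight: observations in a dense, preferentially sampled cluster of high values get small nearest-neighbor distances and hence small weights in histogram inference. The weighting rule (weight proportional to nearest-neighbor distance) and the simulated data are illustrative assumptions, not the resampling scheme of the paper.

```python
# Sketch: nearest-neighbour declustering of a preferentially sampled
# attribute (illustrative weights and simulated data).
import numpy as np

rng = np.random.default_rng(2)
# sparse background sampling plus a dense preferential cluster of high values
background = np.column_stack([rng.uniform(0, 10, 40), rng.uniform(0, 10, 40)])
cluster = np.column_stack([rng.normal(8, 0.2, 60), rng.normal(8, 0.2, 60)])
xy = np.vstack([background, cluster])
z = np.concatenate([rng.normal(1.0, 0.2, 40), rng.normal(3.0, 0.2, 60)])

d = np.linalg.norm(xy[:, None] - xy[None, :], axis=2)
np.fill_diagonal(d, np.inf)
nn = d.min(axis=1)             # nearest-neighbour distance per sample
w = nn / nn.sum()              # declustering weights

naive = z.mean()               # biased upwards by the preferential cluster
declustered = np.sum(w * z)
print(naive, declustered)      # declustered mean shifts toward the background
```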

  11. Observed intra-cluster correlation coefficients in a cluster survey sample of patient encounters in general practice in Australia

    PubMed Central

    Knox, Stephanie A; Chondros, Patty

    2004-01-01

    Background Cluster sample study designs are cost effective, however cluster samples violate the simple random sample assumption of independence of observations. Failure to account for the intra-cluster correlation of observations when sampling through clusters may lead to an under-powered study. Researchers therefore need estimates of intra-cluster correlation for a range of outcomes to calculate sample size. We report intra-cluster correlation coefficients observed within a large-scale cross-sectional study of general practice in Australia, where the general practitioner (GP) was the primary sampling unit and the patient encounter was the unit of inference. Methods Each year the Bettering the Evaluation and Care of Health (BEACH) study recruits a random sample of approximately 1,000 GPs across Australia. Each GP completes details of 100 consecutive patient encounters. Intra-cluster correlation coefficients were estimated for patient demographics, morbidity managed and treatments received. Intra-cluster correlation coefficients were estimated for descriptive outcomes and for associations between outcomes and predictors and were compared across two independent samples of GPs drawn three years apart. Results Between April 1999 and March 2000, a random sample of 1,047 Australian general practitioners recorded details of 104,700 patient encounters. Intra-cluster correlation coefficients for patient demographics ranged from 0.055 for patient sex to 0.451 for language spoken at home. Intra-cluster correlations for morbidity variables ranged from 0.005 for the management of eye problems to 0.059 for management of psychological problems. Intra-cluster correlation for the association between two variables was smaller than the descriptive intra-cluster correlation of each variable. When compared with the April 2002 to March 2003 sample (1,008 GPs) the estimated intra-cluster correlation coefficients were found to be consistent across samples. 
Conclusions The demonstrated precision and reliability of the estimated intra-cluster correlations indicate that these coefficients will be useful for calculating sample sizes in future general practice surveys that use the GP as the primary sampling unit. PMID:15613248
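    Intra-cluster correlations of the kind tabulated in this study are conventionally estimated with the one-way ANOVA estimator, shown below on simulated "GP cluster" data (the BEACH data themselves are not reproduced; the 100 GPs × 100 encounters shape and prevalence are assumptions for illustration).

```python
# Sketch: one-way ANOVA estimator of the intra-cluster correlation
# coefficient (ICC), applied to simulated GP-clustered binary outcomes.
import numpy as np

def anova_icc(groups: list[np.ndarray]) -> float:
    k = len(groups)
    n = np.array([len(g) for g in groups])
    N = n.sum()
    grand = np.concatenate(groups).mean()
    msb = sum(ni * (g.mean() - grand) ** 2 for ni, g in zip(n, groups)) / (k - 1)
    msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)
    n0 = (N - (n ** 2).sum() / N) / (k - 1)   # adjusted average cluster size
    return (msb - msw) / (msb + (n0 - 1) * msw)

rng = np.random.default_rng(3)
# 100 GPs, 100 encounters each, binary outcome with a GP-level effect
groups = []
for _ in range(100):
    p = np.clip(rng.normal(0.3, 0.05), 0, 1)  # GP-specific prevalence
    groups.append(rng.binomial(1, p, 100).astype(float))
print(anova_icc(groups))
```

    The estimated ICC then feeds directly into the design effect 1 + (m − 1)·ICC used for sample size calculation in surveys that cluster by GP.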

  12. Quality of reporting of pilot and feasibility cluster randomised trials: a systematic review

    PubMed Central

    Chan, Claire L; Leyrat, Clémence; Eldridge, Sandra M

    2017-01-01

    Objectives To systematically review the quality of reporting of pilot and feasibility cluster randomised trials (CRTs). In particular, to assess (1) the number of pilot CRTs conducted between 1 January 2011 and 31 December 2014, (2) whether objectives and methods are appropriate and (3) reporting quality. Methods We searched PubMed (2011–2014) for CRTs with ‘pilot’ or ‘feasibility’ in the title or abstract that were assessing some element of feasibility and showing evidence the study was in preparation for a main effectiveness/efficacy trial. Quality assessment criteria were based on the Consolidated Standards of Reporting Trials (CONSORT) extensions for pilot trials and CRTs. Results Eighteen pilot CRTs were identified. Forty-four per cent did not have feasibility as their primary objective, and many (50%) performed formal hypothesis testing for effectiveness/efficacy despite being underpowered. Most (83%) included ‘pilot’ or ‘feasibility’ in the title, and discussed implications for progression from the pilot to the future definitive trial (89%), but fewer reported reasons for the randomised pilot trial (39%), sample size rationale (44%) or progression criteria (17%). Most defined the cluster (100%), and number of clusters randomised (94%), but few reported how the cluster design affected sample size (17%), whether consent was sought from clusters (11%), or who enrolled clusters (17%). Conclusions That only 18 pilot CRTs were identified necessitates increased awareness of the importance of conducting and publishing pilot CRTs and improved reporting. Pilot CRTs should primarily be assessing feasibility, avoiding formal hypothesis testing for effectiveness/efficacy and reporting reasons for the pilot, sample size rationale and progression criteria, as well as enrolment of clusters, and how the cluster design affects design aspects. We recommend adherence to the CONSORT extensions for pilot trials and CRTs. PMID:29122791

  13. THE NORTH CAROLINA HERALD PILOT STUDY

    EPA Science Inventory



    The sampling design for the National Children's Study (NCS) calls for a population-based, multi-stage, clustered household sampling approach. The full sample is designed to be representative of both urban and rural births in the United States, 2007-2011. While other sur...

  14. A primer on stand and forest inventory designs

    Treesearch

    H. Gyde Lund; Charles E. Thomas

    1989-01-01

    Covers designs for the inventory of stands and forests in detail and with worked-out examples. For stands, random sampling, line transects, ricochet plot, systematic sampling, single plot, cluster, subjective sampling and complete enumeration are discussed. For forests inventory, the main categories are subjective sampling, inventories without prior stand mapping,...

  15. RECRUITING FOR A LONGITUDINAL STUDY OF CHILDREN'S HEALTH USING A HOUSEHOLD-BASED PROBABILITY SAMPLING APPROACH

    EPA Science Inventory

    The sampling design for the National Children's Study (NCS) calls for a population-based, multi-stage, clustered household sampling approach (visit our website for more information on the NCS: www.nationalchildrensstudy.gov). The full sample is designed to be representative of ...

  16. Evaluation of single and two-stage adaptive sampling designs for estimation of density and abundance of freshwater mussels in a large river

    USGS Publications Warehouse

    Smith, D.R.; Rogala, J.T.; Gray, B.R.; Zigler, S.J.; Newton, T.J.

    2011-01-01

    Reliable estimates of abundance are needed to assess consequences of proposed habitat restoration and enhancement projects on freshwater mussels in the Upper Mississippi River (UMR). Although there is general guidance on sampling techniques for population assessment of freshwater mussels, the actual performance of sampling designs can depend critically on the population density and spatial distribution at the project site. To evaluate various sampling designs, we simulated sampling of populations, which varied in density and degree of spatial clustering. Because of logistics and costs of large river sampling and spatial clustering of freshwater mussels, we focused on adaptive and non-adaptive versions of single and two-stage sampling. The candidate designs performed similarly in terms of precision (CV) and probability of species detection for fixed sample size. Both CV and species detection were determined largely by density, spatial distribution and sample size. However, designs did differ in the rate that occupied quadrats were encountered. Occupied units had a higher probability of selection using adaptive designs than conventional designs. We used two measures of cost: sample size (i.e. number of quadrats) and distance travelled between the quadrats. Adaptive and two-stage designs tended to reduce distance between sampling units, and thus performed better when distance travelled was considered. Based on the comparisons, we provide general recommendations on the sampling designs for the freshwater mussels in the UMR, and presumably other large rivers.

  17. Enhancing local health department disaster response capacity with rapid community needs assessments: validation of a computerized program for binary attribute cluster sampling.

    PubMed

    Groenewold, Matthew R

    2006-01-01

    Local health departments are among the first agencies to respond to disasters or other mass emergencies. However, they often lack the ability to handle large-scale events. Plans including locally developed and deployed tools may enhance local response. Simplified cluster sampling methods can be useful in assessing community needs after a sudden-onset, short duration event. Using an adaptation of the methodology used by the World Health Organization Expanded Programme on Immunization (EPI), a Microsoft Access-based application for two-stage cluster sampling of residential addresses in Louisville/Jefferson County Metro, Kentucky was developed. The sampling frame was derived from geographically referenced data on residential addresses and political districts available through the Louisville/Jefferson County Information Consortium (LOJIC). The program randomly selected 30 clusters, defined as election precincts, from within the area of interest, and then, randomly selected 10 residential addresses from each cluster. The program, called the Rapid Assessment Tools Package (RATP), was tested in terms of accuracy and precision using data on a dichotomous characteristic of residential addresses available from the local tax assessor database. A series of 30 samples were produced and analyzed with respect to their precision and accuracy in estimating the prevalence of the study attribute. Point estimates with 95% confidence intervals were calculated by determining the proportion of the study attribute values in each of the samples and compared with the population proportion. To estimate the design effect, corresponding simple random samples of 300 addresses were taken after each of the 30 cluster samples. The sample proportion fell within +/-10 absolute percentage points of the true proportion in 80% of the samples. In 93.3% of the samples, the point estimate fell within +/-12.5%, and 96.7% fell within +/-15%. 
All of the point estimates fell within +/-20% of the true proportion. Estimates of the design effect ranged from 0.926 to 1.436 (mean = 1.157, median = 1.170) for the 30 samples. Although prospective evaluation of its performance in field trials or a real emergency is required to confirm its utility, this study suggests that the RATP, a locally designed and deployed tool, may provide population-based estimates of community needs or the extent of event-related consequences that are precise enough to serve as the basis for the initial post-event decisions regarding relief efforts.
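A minimal sketch of this kind of two-stage selection and design-effect check, with an invented sampling frame (the RATP itself was a Microsoft Access application; the cluster count, cluster size, and prevalence below are illustrative only):

```python
import random

def two_stage_sample(frame, n_clusters=30, n_per_cluster=10, seed=1):
    """Stage 1: randomly select clusters (precincts); stage 2: randomly
    select addresses within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(list(frame), n_clusters)
    return [rng.sample(frame[c], n_per_cluster) for c in chosen]

def estimate(samples):
    """Point estimate and 95% CI, treating cluster means as replicates."""
    means = [sum(s) / len(s) for s in samples]
    k = len(means)
    p = sum(means) / k
    var = sum((m - p) ** 2 for m in means) / (k * (k - 1))
    half = 1.96 * var ** 0.5
    return p, (p - half, p + half), var

# Invented frame: 100 precincts of 50 addresses, attribute prevalence ~30%
rng = random.Random(0)
frame = {c: [1 if rng.random() < 0.3 else 0 for _ in range(50)]
         for c in range(100)}
p, ci, var_cl = estimate(two_stage_sample(frame))
var_srs = p * (1 - p) / 300        # variance under an SRS of n = 300
deff = var_cl / var_srs            # design effect, as estimated in the study
```

The comparison of `var_cl` with `var_srs` mirrors the study's pairing of each cluster sample with a simple random sample of 300 addresses.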

  18. Adaptive sampling in behavioral surveys.

    PubMed

    Thompson, S K

    1997-01-01

    Studies of populations such as drug users encounter difficulties because the members of the populations are rare, hidden, or hard to reach. Conventionally designed large-scale surveys detect relatively few members of the populations so that estimates of population characteristics have high uncertainty. Ethnographic studies, on the other hand, reach suitable numbers of individuals only through the use of link-tracing, chain referral, or snowball sampling procedures that often leave the investigators unable to make inferences from their sample to the hidden population as a whole. In adaptive sampling, the procedure for selecting people or other units to be in the sample depends on variables of interest observed during the survey, so the design adapts to the population as encountered. For example, when self-reported drug use is found among members of the sample, sampling effort may be increased in nearby areas. Types of adaptive sampling designs include ordinary sequential sampling, adaptive allocation in stratified sampling, adaptive cluster sampling, and optimal model-based designs. Graph sampling refers to situations with nodes (for example, people) connected by edges (such as social links or geographic proximity). An initial sample of nodes or edges is selected and edges are subsequently followed to bring other nodes into the sample. Graph sampling designs include network sampling, snowball sampling, link-tracing, chain referral, and adaptive cluster sampling. A graph sampling design is adaptive if the decision to include linked nodes depends on variables of interest observed on nodes already in the sample. Adjustment methods for nonsampling errors such as imperfect detection of drug users in the sample apply to adaptive as well as conventional designs.
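The adaptive cluster sampling idea described here, where sampling effort expands around units that meet a condition such as self-reported drug use, can be sketched on a toy grid (the grid, condition, and seed are invented for illustration):

```python
import random

def neighbors(cell, n):
    r, c = cell
    return [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < n and 0 <= c + dc < n]

def adaptive_cluster_sample(grid, n_initial=5, seed=3):
    """Start with a random sample of cells; whenever a sampled cell meets
    the condition (count > 0), also sample its neighbours, and keep
    expanding until no new qualifying cells remain."""
    n = len(grid)
    rng = random.Random(seed)
    frontier = rng.sample([(r, c) for r in range(n) for c in range(n)],
                          n_initial)
    sampled = set()
    while frontier:
        cell = frontier.pop()
        if cell in sampled:
            continue
        sampled.add(cell)
        if grid[cell[0]][cell[1]] > 0:      # condition met: adapt
            frontier.extend(neighbors(cell, n))
    return sampled

# Clustered toy population: positive counts concentrated in one corner
grid = [[3 if r < 2 and c < 2 else 0 for c in range(6)] for r in range(6)]
s = adaptive_cluster_sample(grid)
```

Because the expansion rule depends on observed values, the final sample size is random, which is exactly the planning difficulty the bounded designs discussed later in this listing try to address.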

  19. A nonparametric method to generate synthetic populations to adjust for complex sampling design features.

    PubMed

    Dong, Qi; Elliott, Michael R; Raghunathan, Trivellore E

    2014-06-01

Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered, unequal-probability-of-selection sample designs.

  20. A nonparametric method to generate synthetic populations to adjust for complex sampling design features

    PubMed Central

    Dong, Qi; Elliott, Michael R.; Raghunathan, Trivellore E.

    2017-01-01

Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered, unequal-probability-of-selection sample designs. PMID:29200608
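As a rough illustration of the finite population Bayesian bootstrap idea behind these two records, the weighted Polya-urn sketch below resamples a synthetic population with probability proportional to accumulated mass, starting each sampled unit at mass w - 1 (the other units it represents). This is a deliberate simplification: the published method's handling of strata, clusters, and posterior draws is not reproduced.

```python
import random

def wfpbb_synthetic_population(values, weights, N, seed=7):
    """Weighted Polya urn: draw N synthetic units; unit i starts with mass
    (w_i - 1) and gains one unit of mass each time it is redrawn."""
    rng = random.Random(seed)
    mass = [w - 1 for w in weights]
    pop = []
    for _ in range(N):
        r = rng.uniform(0, sum(mass))
        cum = 0.0
        for i, m in enumerate(mass):
            cum += m
            if r <= cum:
                pop.append(values[i])
                mass[i] += 1          # Polya step: selected unit gains mass
                break
    return pop

# Two strata sampled at unequal rates: design weights 10 vs. 2
values = [1] * 5 + [0] * 5
weights = [10] * 5 + [2] * 5
pop = wfpbb_synthetic_population(values, weights, N=50)
# the synthetic mean is pulled toward the weighted sample mean, not 0.5
```

The synthetic population can then be analyzed as if it were a simple random sample, which is the point of the method described above.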

  1. Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.

    PubMed

    Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B

    2017-04-01

Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo and logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models: (1) the population-average parameters have an important interpretation for public health applications, and (2) it avoids untestable assumptions on latent variable distributions and parametric assumptions about error distributions, thereby providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equations for stepped wedge cluster randomized trials, with parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.
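To make the sandwich-variance issue concrete, the sketch below computes, for an intercept-only marginal mean model, both the standard sandwich standard error and a Mancl-DeRouen-style small-sample correction (which for this simple model reduces to inflating each cluster's residual sum by 1/(1 - h_i)). The data are invented, and this is not the authors' implementation, just one of the correction ideas the abstract refers to.

```python
import math

def gee_mean_with_corrections(clusters):
    """Intercept-only marginal mean model. Standard sandwich variance:
    V = sum_i s_i**2 / n**2, with s_i the residual sum in cluster i.
    Mancl-DeRouen-style correction: inflate s_i by 1 / (1 - h_i), where
    the leverage is h_i = n_i / n for this simple model."""
    n = sum(len(c) for c in clusters)
    mu = sum(sum(c) for c in clusters) / n
    v_std = v_md = 0.0
    for c in clusters:
        s = sum(y - mu for y in c)
        h = len(c) / n
        v_std += s * s
        v_md += (s / (1 - h)) ** 2
    return mu, math.sqrt(v_std) / n, math.sqrt(v_md) / n

# Invented data: few, unbalanced clusters (the hard case discussed above)
clusters = [[1.2, 0.8, 1.1], [0.4, 0.6], [1.5, 1.3, 1.7, 1.4], [0.9]]
mu, se, se_md = gee_mean_with_corrections(clusters)
```

The corrected standard error is always at least as large as the uncorrected one, which counteracts the downward bias of the sandwich estimator with few clusters.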

  2. The Australian longitudinal study on male health sampling design and survey weighting: implications for analysis and interpretation of clustered data.

    PubMed

    Spittal, Matthew J; Carlin, John B; Currier, Dianne; Downes, Marnie; English, Dallas R; Gordon, Ian; Pirkis, Jane; Gurrin, Lyle

    2016-10-31

    The Australian Longitudinal Study on Male Health (Ten to Men) used a complex sampling scheme to identify potential participants for the baseline survey. This raises important questions about when and how to adjust for the sampling design when analyzing data from the baseline survey. We describe the sampling scheme used in Ten to Men focusing on four important elements: stratification, multi-stage sampling, clustering and sample weights. We discuss how these elements fit together when using baseline data to estimate a population parameter (e.g., population mean or prevalence) or to estimate the association between an exposure and an outcome (e.g., an odds ratio). We illustrate this with examples using a continuous outcome (weight in kilograms) and a binary outcome (smoking status). Estimates of a population mean or disease prevalence using Ten to Men baseline data are influenced by the extent to which the sampling design is addressed in an analysis. Estimates of mean weight and smoking prevalence are larger in unweighted analyses than weighted analyses (e.g., mean = 83.9 kg vs. 81.4 kg; prevalence = 18.0 % vs. 16.7 %, for unweighted and weighted analyses respectively) and the standard error of the mean is 1.03 times larger in an analysis that acknowledges the hierarchical (clustered) structure of the data compared with one that does not. For smoking prevalence, the corresponding standard error is 1.07 times larger. Measures of association (mean group differences, odds ratios) are generally similar in unweighted or weighted analyses and whether or not adjustment is made for clustering. The extent to which the Ten to Men sampling design is accounted for in any analysis of the baseline data will depend on the research question. 
When the goals of the analysis are to estimate the prevalence of a disease or risk factor in the population or the magnitude of a population-level exposure-outcome association, our advice is to adopt an analysis that respects the sampling design.
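The weighted-versus-unweighted contrast reported above amounts to a weighted mean; a toy illustration (numbers invented, not the Ten to Men data):

```python
def weighted_mean(y, w):
    """Survey-weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

# Invented data echoing the pattern above: weights down-weight the
# over-sampled (here, heavier) men, so the weighted mean is smaller
y = [85, 84, 83, 80, 79, 78]           # weight in kg
w = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5]     # sample weights
unweighted = sum(y) / len(y)           # 81.5
weighted = weighted_mean(y, w)         # 80.25
```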

  3. Atomically precise (catalytic) particles synthesized by a novel cluster deposition instrument

    DOE PAGES

    Yin, C.; Tyo, E.; Kuchta, K.; ...

    2014-05-06

Here, we report a new high vacuum instrument which is dedicated to the preparation of well-defined clusters supported on model and technologically relevant supports for catalytic and materials investigations. The instrument is based on deposition of size-selected metallic cluster ions that are produced by a high flux magnetron cluster source. Furthermore, we maximize the throughput of the apparatus by collecting and focusing ions utilizing a conical octupole ion guide and a linear ion guide. The size selection is achieved by a quadrupole mass filter. The new design of the sample holder provides for the preparation of multiple samples on supports of various sizes and shapes in one session. After cluster deposition onto the support of interest, samples will be taken out of the chamber for a variety of testing and characterization.

  4. Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes.

    PubMed

    Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H

    2015-11-30

    We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.
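A sketch of the recommended two-stage approach, weighting each cluster mean by the inverse of its theoretical variance. The variance components are taken as known here for illustration; the paper estimates them and constrains the between-cluster component to be positive, and the data below are invented.

```python
import math

def weighted_cluster_mean(clusters, s2_b, s2_w):
    """Inverse-variance-weighted mean of cluster means, with
    Var(ybar_i) = s2_b + s2_w / n_i (between + within / cluster size)."""
    ws = wm = 0.0
    for c in clusters:
        m = sum(c) / len(c)
        w = 1.0 / (s2_b + s2_w / len(c))   # larger clusters get more weight
        ws += w
        wm += w * m
    return wm / ws, 1.0 / ws               # estimate and its variance

# Invented unbalanced two-arm cluster-randomized data
arm_a = [[5.1, 4.9, 5.3], [4.2, 4.4], [5.0, 5.2, 5.1, 4.8]]
arm_b = [[6.0, 6.2], [5.8, 6.1, 5.9], [6.3]]
ma, va = weighted_cluster_mean(arm_a, s2_b=0.2, s2_w=0.1)
mb, vb = weighted_cluster_mean(arm_b, s2_b=0.2, s2_w=0.1)
z = (mb - ma) / math.sqrt(va + vb)         # two-stage test statistic
```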

  5. Estimating accuracy of land-cover composition from two-stage cluster sampling

    USGS Publications Warehouse

    Stehman, S.V.; Wickham, J.D.; Fattorini, L.; Wade, T.D.; Baffetta, F.; Smith, J.H.

    2009-01-01

Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias. © 2009 Elsevier Inc.
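Under the equal-probability special case (SRSWOR at both stages with equal weights), the three deviation-based accuracy measures reduce to simple averages over sampled units; a sketch with invented map/reference proportions:

```python
import math

def accuracy_measures(pairs):
    """MD, MAD and RMSE of map vs. reference class proportions across
    sampled units (equal-weight SRSWOR special case)."""
    devs = [m - r for m, r in pairs]
    n = len(devs)
    md = sum(devs) / n                              # mean deviation
    mad = sum(abs(d) for d in devs) / n             # mean absolute deviation
    rmse = math.sqrt(sum(d * d for d in devs) / n)  # root mean square error
    return md, mad, rmse

# (map, reference) proportion of one land-cover class in each sampled unit
pairs = [(0.30, 0.28), (0.55, 0.60), (0.10, 0.07), (0.42, 0.45)]
md, mad, rmse = accuracy_measures(pairs)
```

MD can be near zero even when errors are large (positive and negative deviations cancel), which is why MAD and RMSE are reported alongside it.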

  6. TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

    PubMed

    Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

    2017-12-01

    Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. 
Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
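Step (i) of TimesVector, clustering time-condition concatenated vectors, can be sketched with a tiny k-means; this is a simplification, and TimesVector's actual dimension reduction and post-processing steps are not reproduced here.

```python
import random

def kmeans(vecs, k, iters=20, seed=0):
    """Tiny k-means over time-condition concatenated expression vectors
    (TimesVector step (i), greatly simplified)."""
    rng = random.Random(seed)
    centers = rng.sample(vecs, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vecs:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(v, centers[j])))
            groups[j].append(v)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[j]
                   for j, g in enumerate(groups)]
    return groups

# Each gene: expression at 3 time points in condition A, then 3 in condition B
genes = [[1, 2, 3, 3, 2, 1], [1, 2, 3, 3, 2, 1.2],
         [5, 5, 5, 1, 1, 1], [5, 5, 4.8, 1, 1, 1.1]]
groups = kmeans(genes, k=2)
```

Concatenating the conditions into one vector is what lets a single clustering pass capture genes whose patterns differ *between* conditions, not just within one.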

  7. Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

    PubMed

    NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

    2017-08-01

    Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.

  8. Clustered lot quality assurance sampling: a tool to monitor immunization coverage rapidly during a national yellow fever and polio vaccination campaign in Cameroon, May 2009.

    PubMed

    Pezzoli, L; Tchio, R; Dzossa, A D; Ndjomo, S; Takeu, A; Anya, B; Ticha, J; Ronveaux, O; Lewis, R F

    2012-01-01

    We used the clustered lot quality assurance sampling (clustered-LQAS) technique to identify districts with low immunization coverage and guide mop-up actions during the last 4 days of a combined oral polio vaccine (OPV) and yellow fever (YF) vaccination campaign conducted in Cameroon in May 2009. We monitored 17 pre-selected districts at risk for low coverage. We designed LQAS plans to reject districts with YF vaccination coverage <90% and with OPV coverage <95%. In each lot the sample size was 50 (five clusters of 10) with decision values of 3 for assessing OPV and 7 for YF coverage. We 'rejected' 10 districts for low YF coverage and 14 for low OPV coverage. Hence we recommended a 2-day extension of the campaign. Clustered-LQAS proved to be useful in guiding the campaign vaccination strategy before the completion of the operations.
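The decision rule in a clustered LQAS plan is simple to state in code: with n = 50 and decision value d, the lot is rejected when the number of unvaccinated respondents exceeds d. A sketch (respondent data invented):

```python
def lqas_decision(sample, decision_value):
    """Clustered LQAS rule: 'reject' the lot (flag low coverage) when the
    number of unvaccinated respondents exceeds the decision value."""
    failures = sum(1 for vaccinated in sample if not vaccinated)
    return "reject" if failures > decision_value else "accept"

# n = 50 (five clusters of 10); decision value 7 for yellow fever coverage
sample = [True] * 41 + [False] * 9      # 9 unvaccinated out of 50
assert lqas_decision(sample, decision_value=7) == "reject"
assert lqas_decision([True] * 45 + [False] * 5, decision_value=7) == "accept"
```

The decision values (3 for OPV, 7 for YF) encode the different coverage targets (95% vs. 90%) at the common sample size of 50.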

  9. Cluster Randomized Test-Negative Design (CR-TND) Trials: A Novel and Efficient Method to Assess the Efficacy of Community Level Dengue Interventions.

    PubMed

    Anders, Katherine L; Cutcher, Zoe; Kleinschmidt, Immo; Donnelly, Christl A; Ferguson, Neil M; Indriani, Citra; O'Neill, Scott L; Jewell, Nicholas P; Simmons, Cameron P

    2018-05-07

    Cluster randomized trials are the gold standard for assessing efficacy of community-level interventions, such as vector control strategies against dengue. We describe a novel cluster randomized trial methodology with a test-negative design, which offers advantages over traditional approaches. It utilizes outcome-based sampling of patients presenting with a syndrome consistent with the disease of interest, who are subsequently classified as test-positive cases or test-negative controls on the basis of diagnostic testing. We use simulations of a cluster trial to demonstrate validity of efficacy estimates under the test-negative approach. This demonstrates that, provided study arms are balanced for both test-negative and test-positive illness at baseline and that other test-negative design assumptions are met, the efficacy estimates closely match true efficacy. We also briefly discuss analytical considerations for an odds ratio-based effect estimate arising from clustered data, and outline potential approaches to analysis. We conclude that application of the test-negative design to certain cluster randomized trials could increase their efficiency and ease of implementation.

  10. Statistical design and analysis plan for an impact evaluation of an HIV treatment and prevention intervention for female sex workers in Zimbabwe: a study protocol for a cluster randomised controlled trial.

    PubMed

    Hargreaves, James R; Fearon, Elizabeth; Davey, Calum; Phillips, Andrew; Cambiano, Valentina; Cowan, Frances M

    2016-01-05

    Pragmatic cluster-randomised trials should seek to make unbiased estimates of effect and be reported according to CONSORT principles, and the study population should be representative of the target population. This is challenging when conducting trials amongst 'hidden' populations without a sample frame. We describe a pair-matched cluster-randomised trial of a combination HIV-prevention intervention to reduce the proportion of female sex workers (FSW) with a detectable HIV viral load in Zimbabwe, recruiting via respondent driven sampling (RDS). We will cross-sectionally survey approximately 200 FSW at baseline and at endline to characterise each of 14 sites. RDS is a variant of chain referral sampling and has been adapted to approximate random sampling. Primary analysis will use the 'RDS-2' method to estimate cluster summaries and will adapt Hayes and Moulton's '2-step' method to adjust effect estimates for individual-level confounders and further adjust for cluster baseline prevalence. We will adapt CONSORT to accommodate RDS. In the absence of observable refusal rates, we will compare the recruitment process between matched pairs. We will need to investigate whether cluster-specific recruitment or the intervention itself affects the accuracy of the RDS estimation process, potentially causing differential biases. To do this, we will calculate RDS-diagnostic statistics for each cluster at each time point and compare these statistics within matched pairs and time points. Sensitivity analyses will assess the impact of potential biases arising from assumptions made by the RDS-2 estimation. We are not aware of any other completed pragmatic cluster RCTs that are recruiting participants using RDS. Our statistical design and analysis approach seeks to transparently document participant recruitment and allow an assessment of the representativeness of the study to the target population, a key aspect of pragmatic trials. 
The challenges we have faced in the design of this trial are likely to be shared in other contexts aiming to serve the needs of legally and/or socially marginalised populations for which no sampling frame exists and especially when the social networks of participants are both the target of intervention and the means of recruitment. The trial was registered at Pan African Clinical Trials Registry (PACTR201312000722390) on 9 December 2013.
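The RDS-2 estimator referenced in the analysis plan weights each respondent by the inverse of her reported network degree; a sketch with invented outcomes and degrees:

```python
def rds2_estimate(outcomes, degrees):
    """RDS-2 (Volz-Heckathorn) estimator: weight each respondent by the
    inverse of her reported network degree, offsetting the higher
    inclusion probability of well-connected participants."""
    num = sum(y / d for y, d in zip(outcomes, degrees))
    den = sum(1.0 / d for d in degrees)
    return num / den

# Invented data: detectable viral load (1/0) and network degree per FSW
y = [1, 0, 1, 1, 0, 0]
deg = [2, 10, 4, 5, 8, 20]
p_hat = rds2_estimate(y, deg)   # up-weights the low-degree respondents
naive = sum(y) / len(y)         # 0.5, the unadjusted sample proportion
```

Here the outcome is concentrated among low-degree respondents, so the RDS-2 estimate exceeds the naive proportion; the trial's cluster summaries would be such estimates, one per site.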

  11. A multicomponent matched filter cluster confirmation tool for eROSITA: initial application to the RASS and DES-SV data sets

    DOE PAGES

    Klein, M.; Mohr, J. J.; Desai, S.; ...

    2017-11-14

    We describe a multi-component matched filter cluster confirmation tool (MCMF) designed for the study of large X-ray source catalogs produced by the upcoming X-ray all-sky survey mission eROSITA. We apply the method to confirm a sample of 88 clusters with redshifts $0.05

  12. A multicomponent matched filter cluster confirmation tool for eROSITA: initial application to the RASS and DES-SV data sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klein, M.; Mohr, J. J.; Desai, S.

    We describe a multi-component matched filter cluster confirmation tool (MCMF) designed for the study of large X-ray source catalogs produced by the upcoming X-ray all-sky survey mission eROSITA. We apply the method to confirm a sample of 88 clusters with redshifts $0.05

  13. Effect of study design and setting on tuberculosis clustering estimates using Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR): a systematic review.

    PubMed

    Mears, Jessica; Abubakar, Ibrahim; Cohen, Theodore; McHugh, Timothy D; Sonnenberg, Pam

    2015-01-21

To systematically review the evidence for the impact of study design and setting on the interpretation of tuberculosis (TB) transmission using clustering derived from Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) strain typing. MEDLINE, EMBASE, CINAHL, Web of Science and Scopus were searched for articles published before 21st October 2014. Studies in humans that reported the proportion of clustering of TB isolates by MIRU-VNTR were included in the analysis. Univariable meta-regression analyses were conducted to assess the influence of study design and setting on the proportion of clustering. The search identified 27 eligible articles reporting clustering between 0% and 63%. The number of MIRU-VNTR loci typed, requiring consent to type patient isolates (as a proxy for sampling fraction), the TB incidence and the maximum cluster size explained 14%, 14%, 27% and 48% of between-study variation, respectively, and had a significant association with the proportion of clustering. Although MIRU-VNTR typing is being adopted worldwide there is a paucity of data on how study design and setting may influence estimates of clustering. We have highlighted study design variables for consideration in the design and interpretation of future studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  14. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

    PubMed Central

    Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

    2016-01-01

An outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known examples are the case-control design for a binary response, the case-cohort design for failure-time data, and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric: all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and establish its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design, together with the proposed estimator, provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association between PCB exposure and hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260

  15. An enhanced cluster analysis program with bootstrap significance testing for ecological community analysis

    USGS Publications Warehouse

    McKenna, J.E.

    2003-01-01

    The biosphere is filled with complex living patterns and important questions about biodiversity and community and ecosystem ecology are concerned with structure and function of multispecies systems that are responsible for those patterns. Cluster analysis identifies discrete groups within multivariate data and is an effective method of coping with these complexities, but often suffers from subjective identification of groups. The bootstrap testing method greatly improves objective significance determination for cluster analysis. The BOOTCLUS program makes cluster analysis that reliably identifies real patterns within a data set more accessible and easier to use than previously available programs. A variety of analysis options and rapid re-analysis provide a means to quickly evaluate several aspects of a data set. Interpretation is influenced by sampling design and a priori designation of samples into replicate groups, and ultimately relies on the researcher's knowledge of the organisms and their environment. However, the BOOTCLUS program provides reliable, objectively determined groupings of multivariate data.
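The bootstrap idea behind BOOTCLUS, judging whether a grouping is stronger than chance by resampling, can be sketched as follows. BOOTCLUS's actual algorithm and test statistics are not reproduced; the two-group support statistic below is an invented stand-in.

```python
import random

def bootstrap_cluster_support(data, grouping, n_boot=200, seed=5):
    """Resample sites with replacement and count how often mean
    within-group distance stays below mean between-group distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    rng = random.Random(seed)
    idx = list(range(len(data)))
    support = 0
    for _ in range(n_boot):
        boot = [rng.choice(idx) for _ in idx]
        within = between = cnt_w = cnt_b = 0.0
        for i in boot:
            for j in boot:
                if i == j:
                    continue
                d = dist(data[i], data[j])
                if grouping[i] == grouping[j]:
                    within += d
                    cnt_w += 1
                else:
                    between += d
                    cnt_b += 1
        if cnt_w and cnt_b and within / cnt_w < between / cnt_b:
            support += 1
    return support / n_boot

# Two clearly separated community samples (invented species abundances)
data = [[10, 0], [9, 1], [11, 0], [0, 10], [1, 9], [0, 11]]
support = bootstrap_cluster_support(data, [0, 0, 0, 1, 1, 1])
```

High support indicates the grouping is robust to resampling of the sites, which is the kind of objective significance determination the program provides.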

  16. Sample size determination for GEE analyses of stepped wedge cluster randomized trials.

    PubMed

    Li, Fan; Turner, Elizabeth L; Preisser, John S

    2018-06-19

    In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.
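The block-exchangeable within-cluster correlation structure described above has three parameters; the sketch below builds the matrix for a closed cohort and confirms it is a valid (positive-definite) correlation matrix. Parameter values are illustrative only.

```python
import numpy as np

def block_exchangeable(T, m, alpha0, alpha1, rho):
    """Correlation for one cluster observed over T periods with a closed
    cohort of m individuals: alpha0 = within-period, alpha1 = inter-period,
    rho = within-individual (same person, different periods)."""
    n = T * m
    R = np.full((n, n), alpha1)                 # different person & period
    for t in range(T):
        blk = slice(t * m, (t + 1) * m)
        R[blk, blk] = alpha0                    # same period, different person
    for i in range(m):
        idx = [t * m + i for t in range(T)]
        R[np.ix_(idx, idx)] = rho               # same individual over time
    np.fill_diagonal(R, 1.0)
    return R

R = block_exchangeable(T=3, m=4, alpha0=0.05, alpha1=0.02, rho=0.4)
eig = np.linalg.eigvalsh(R)    # for continuous responses, power depends
                               # only on eigenvalues of this matrix
```

Setting `rho == alpha1` recovers the cross-sectional design, matching the remark in the abstract that equating the latter two parameters accommodates cross-sectional sampling.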

  17. Multi-species attributes as the condition for adaptive sampling of rare species using two-stage sequential sampling with an auxiliary variable

    USGS Publications Warehouse

    Panahbehagh, B.; Smith, D.R.; Salehi, M.M.; Hornbach, D.J.; Brown, D.J.; Chan, F.; Marinova, D.; Anderssen, R.S.

    2011-01-01

    Assessing populations of rare species is challenging because of the large effort required to locate patches of occupied habitat and achieve precise estimates of density and abundance. The presence of a rare species has been shown to be correlated with presence or abundance of more common species. Thus, ecological community richness or abundance can be used to inform sampling of rare species. Adaptive sampling designs have been developed specifically for rare and clustered populations and have been applied to a wide range of rare species. However, adaptive sampling can be logistically challenging, in part, because variation in final sample size introduces uncertainty in survey planning. Two-stage sequential sampling (TSS), a recently developed design, allows for adaptive sampling, but avoids edge units and has an upper bound on final sample size. In this paper we present an extension of two-stage sequential sampling that incorporates an auxiliary variable (TSSAV), such as community attributes, as the condition for adaptive sampling. We develop a set of simulations to approximate sampling of endangered freshwater mussels to evaluate the performance of the TSSAV design. The performance measures that we are interested in are efficiency and the probability of sampling a unit occupied by the rare species. Efficiency measures the precision of the population estimate from the TSSAV design relative to a standard design, such as simple random sampling (SRS). The simulations indicate that the density and distribution of the auxiliary population is the most important determinant of the performance of the TSSAV design. Of the design factors, such as sample size, the fraction of the primary units sampled was most important. For the best scenarios, the odds of sampling the rare species were approximately 1.5 times higher for TSSAV compared to SRS, and efficiency was as high as 2 (i.e., variance from TSSAV was half that of SRS).
We have found that design performance, especially for adaptive designs, is often case-specific, and efficiency of adaptive designs is especially sensitive to spatial distribution. We therefore recommend simulations tailored to the application of interest for evaluating designs in preparation for sampling rare and clustered populations.

  18. Trap configuration and spacing influences parameter estimates in spatial capture-recapture models

    USGS Publications Warehouse

    Sun, Catherine C.; Fuller, Angela K.; Royle, J. Andrew

    2014-01-01

    An increasing number of studies employ spatial capture-recapture models to estimate population size, but there has been limited research on how different spatial sampling designs and trap configurations influence parameter estimators. Spatial capture-recapture models provide an advantage over non-spatial models by explicitly accounting for heterogeneous detection probabilities among individuals that arise due to the spatial organization of individuals relative to sampling devices. We simulated black bear (Ursus americanus) populations and spatial capture-recapture data to evaluate the influence of trap configuration and trap spacing on estimates of population size and a spatial scale parameter, sigma, that relates to home range size. We varied detection probability and home range size, and considered three trap configurations common to large-mammal mark-recapture studies: regular spacing, clustered, and a temporal sequence of different cluster configurations (i.e., trap relocation). We explored trap spacing and number of traps per cluster by varying the number of traps. The clustered arrangement performed well when detection rates were low, and provides for easier field implementation than the sequential trap arrangement. However, performance differences between trap configurations diminished as home range size increased. Our simulations suggest it is important to consider trap spacing relative to home range sizes, with traps ideally spaced no more than twice the spatial scale parameter. While spatial capture-recapture models can accommodate different sampling designs and still estimate parameters with accuracy and precision, our simulations demonstrate that aspects of sampling design, namely trap configuration and spacing, must consider study area size, ranges of individual movement, and home range sizes in the study population.
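
    The spacing guideline above (traps ideally no more than about twice the spatial scale parameter apart) can be explored with a minimal sketch. The half-normal detection function is the standard choice in spatial capture-recapture; the grid layout, baseline detection rate p0, and number of occasions below are illustrative assumptions, not the simulation settings of the study.

```python
import numpy as np

def capture_prob(centre, traps, sigma, p0=0.1, occasions=5):
    """P(an animal with this activity centre is caught at least once),
    using the half-normal detection function standard in SCR models."""
    d2 = np.sum((traps - centre) ** 2, axis=1)
    p = p0 * np.exp(-d2 / (2.0 * sigma ** 2))
    return 1.0 - np.prod((1.0 - p) ** occasions)

def square_grid(step, extent=10.0):
    """Regularly spaced trap coordinates over a square study area."""
    xs = np.arange(0.0, extent, step)
    return np.array([(x, y) for x in xs for y in xs])

sigma = 1.0                      # spatial scale (home-range) parameter
centre = np.array([5.2, 4.7])    # one hypothetical activity centre
p_tight = capture_prob(centre, square_grid(1.5 * sigma), sigma)   # spacing < 2*sigma
p_sparse = capture_prob(centre, square_grid(4.0 * sigma), sigma)  # spacing > 2*sigma
```

    With spacing beyond twice sigma, most activity centres sit far from every trap and the capture probability drops sharply, which is the mechanism behind the spacing recommendation.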

  19. BioCluster: tool for identification and clustering of Enterobacteriaceae based on biochemical data.

    PubMed

    Abdullah, Ahmed; Sabbir Alam, S M; Sultana, Munawar; Hossain, M Anwar

    2015-06-01

    Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of the sorting and similarity calculations would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and Improved Hierarchical Clustering (IHC), a modified algorithm developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in the results of 1-47 biochemical tests within the Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  20. Design of a Phase III cluster randomized trial to assess the efficacy and safety of a malaria transmission blocking vaccine.

    PubMed

    Delrieu, Isabelle; Leboulleux, Didier; Ivinson, Karen; Gessner, Bradford D

    2015-03-24

    Vaccines interrupting Plasmodium falciparum malaria transmission targeting sexual, sporogonic, or mosquito-stage antigens (SSM-VIMT) are currently under development to reduce malaria transmission. An international group of malaria experts was established to evaluate the feasibility and optimal design of a Phase III cluster randomized trial (CRT) that could support regulatory review and approval of an SSM-VIMT. The consensus design is a CRT with a sentinel population randomly selected from defined inner and buffer zones in each cluster, a cluster size sufficient to assess true vaccine efficacy in the inner zone, and inclusion of ongoing assessment of vaccine impact stratified by distance of residence from the cluster edge. Trials should be conducted first in areas of moderate transmission, where SSM-VIMT impact should be greatest. Sample size estimates suggest that such a trial is feasible, and within the range of previously supported trials of malaria interventions, although substantial issues to implementation exist. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

    PubMed

    Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

    2017-03-15

    Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known examples are the case-control design for a binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric, with all the underlying distributions of covariates modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample of the same size. The multivariate-ODS design, together with the proposed estimator, provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of polychlorinated biphenyl exposure with hearing loss in children born into the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Cardiorespiratory Fitness Levels among U.S. Youth Aged 12-15 Years: United States, 1999-2004 and 2012

    MedlinePlus

    ... use a complex, stratified, multistage probability cluster sampling design. NHANES data collection is based on a nationally ... conjunction with the 2012 NHANES and the survey design was based on the design for NHANES, with ...

  3. Designing a multi-objective, multi-support accuracy assessment of the 2001 National Land Cover Data (NLCD 2001) of the conterminous United States

    USGS Publications Warehouse

    Stehman, S.V.; Wickham, J.D.; Wade, T.G.; Smith, J.H.

    2008-01-01

    The database design and diverse application of NLCD 2001 pose significant challenges for accuracy assessment because numerous objectives are of interest, including accuracy of land-cover, percent urban imperviousness, percent tree canopy, land-cover composition, and net change. A multi-support approach is needed because these objectives require spatial units of different sizes for reference data collection and analysis. Determining a sampling design that meets the full suite of desirable objectives for the NLCD 2001 accuracy assessment requires reconciling potentially conflicting design features that arise from targeting the different objectives. Multi-stage cluster sampling provides the general structure to achieve a multi-support assessment, and the flexibility to target different objectives at different stages of the design. We describe the implementation of two-stage cluster sampling for the initial phase of the NLCD 2001 assessment, and identify gaps in existing knowledge where research is needed to allow full implementation of a multi-objective, multi-support assessment. © 2008 American Society for Photogrammetry and Remote Sensing.

  4. Optimal sampling design for estimating spatial distribution and abundance of a freshwater mussel population

    USGS Publications Warehouse

    Pooler, P.S.; Smith, D.R.

    2005-01-01

    We compared the ability of simple random sampling (SRS) and a variety of systematic sampling (SYS) designs to estimate abundance, quantify spatial clustering, and predict spatial distribution of freshwater mussels. Sampling simulations were conducted using data obtained from a census of freshwater mussels in a 40 x 33 m section of the Cacapon River near Capon Bridge, West Virginia, and from a simulated spatially random population generated to have the same abundance as the real population. Sampling units of 0.25 m² gave more accurate and precise abundance estimates and generally better spatial predictions than 1-m² sampling units. Systematic sampling with ≥2 random starts was more efficient than SRS. Estimates of abundance based on SYS were more accurate when the distance between sampling units across the stream was less than or equal to the distance between sampling units along the stream. Three measures for quantifying spatial clustering were examined: the Hopkins Statistic, the Clumping Index, and Morisita's Index. Morisita's Index was the most reliable, and the Hopkins Statistic was prone to false rejection of complete spatial randomness. SYS designs with units spaced equally across and up stream provided the most accurate predictions when estimating the spatial distribution by kriging. Our research indicates that SYS designs with sampling units equally spaced both across and along the stream would be appropriate for sampling freshwater mussels even if no information about the true underlying spatial distribution of the population were available to guide the design choice. © 2005 by The North American Benthological Society.
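
    Morisita's Index, which the study found most reliable, is simple to compute from quadrat counts. The sketch below is a generic implementation of the standard formula (not the authors' code): values near 1 indicate spatial randomness, values above 1 clustering, and values below 1 a more even spread.

```python
def morisita_index(counts):
    """Morisita's Index of dispersion for a list of quadrat counts:
    I = n * sum(x*(x-1)) / (N*(N-1)), with n quadrats and N individuals."""
    n = len(counts)
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two individuals")
    return n * sum(x * (x - 1) for x in counts) / (total * (total - 1))

clustered = [10, 0, 0, 0, 0]   # all mussels in one quadrat -> index of 5
even = [2, 2, 2, 2, 2]         # perfectly even counts -> index below 1
```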

  5. VizieR Online Data Catalog: 44 SZ-selected galaxy clusters ACT observations (Sifon+, 2016)

    NASA Astrophysics Data System (ADS)

    Sifon, C.; Battaglia, N.; Hasselfield, M.; Menanteau, F.; Barrientos, L. F.; Bond, J. R.; Crichton, D.; Devlin, M. J.; Dunner, R.; Hilton, M.; Hincks, A. D.; Hlozek, R.; Huffenberger, K. M.; Hughes, J. P.; Infante, L.; Kosowsky, A.; Marsden, D.; Marriage, T. A.; Moodley, K.; Niemack, M. D.; Page, L. A.; Spergel, D. N.; Staggs, S. T.; Trac, H.; Wollack, E. J.

    2017-11-01

    ACT is a 6-metre off-axis Gregorian telescope located at an altitude of 5200 m in the Atacama desert in Chile, designed to observe the CMB at arcminute resolution. Galaxy clusters were detected in the 148 GHz band by matched-filtering the maps with the pressure profile suggested by Arnaud et al. (2010A&A...517A..92A), fit to X-ray selected local (z<0.2) clusters, with varying cluster sizes, θ500, from 1.18 to 27 arcmin. Because of the complete overlap of ACT equatorial observations with Sloan Digital Sky Survey Data Release 8 (SDSS DR8; Aihara et al., 2011ApJS..193...29A) imaging, all cluster candidates were assessed with optical data (Menanteau et al., 2013ApJ...765...67M). We observed 20 clusters from the equatorial sample with the Gemini Multi-Object Spectrograph (GMOS) on the Gemini-South telescope, split in semesters 2011B (ObsID:GS-2011B-C-1, PI:Barrientos/Menanteau) and 2012A (ObsID:GS-2012A-C-1, PI:Menanteau), prioritizing clusters in the cosmological sample at 0.3

  6. Accounting for twin births in sample size calculations for randomised trials.

    PubMed

    Yelland, Lisa N; Sullivan, Thomas R; Collins, Carmel T; Price, David J; McPhee, Andrew J; Lee, Katherine J

    2018-05-04

    Including twins in randomised trials leads to non-independence or clustering in the data. Clustering has important implications for sample size calculations, yet few trials take this into account. Estimates of the intracluster correlation coefficient (ICC), or the correlation between outcomes of twins, are needed to assist with sample size planning. Our aims were to provide ICC estimates for infant outcomes, describe the information that must be specified in order to account for clustering due to twins in sample size calculations, and develop a simple tool for performing sample size calculations for trials including twins. ICCs were estimated for infant outcomes collected in four randomised trials that included twins. The information required to account for clustering due to twins in sample size calculations is described. A tool that calculates the sample size based on this information was developed in Microsoft Excel and in R as a Shiny web app. ICC estimates ranged between -0.12, indicating a weak negative relationship, and 0.98, indicating a strong positive relationship between outcomes of twins. Example calculations illustrate how the ICC estimates and sample size calculator can be used to determine the target sample size for trials including twins. Clustering among outcomes measured on twins should be taken into account in sample size calculations to obtain the desired power. Our ICC estimates and sample size calculator will be useful for designing future trials that include twins. Publication of additional ICCs is needed to further assist with sample size planning for future trials. © 2018 John Wiley & Sons Ltd.
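
    A rough sketch of how an ICC enters such a calculation via a design effect. The formula below (inflate the independence-based sample size by 1 + ICC x proportion of infants who are twins) is a common first-order approximation for clusters of size 1 and 2; it is illustrative and is not the authors' Excel/Shiny tool.

```python
import math

def twins_sample_size(n_independent, icc, prop_twin_infants):
    """Inflate a sample size computed under independence to account for
    clustering due to twins. prop_twin_infants is the fraction of
    *infants* (not births) who are twins; design effect is
    1 + icc * prop_twin_infants for clusters of size 1 and 2."""
    deff = 1.0 + icc * prop_twin_infants
    return math.ceil(n_independent * deff)

# e.g. 300 infants needed under independence, ICC 0.7, 40% twins
n = twins_sample_size(n_independent=300, icc=0.7, prop_twin_infants=0.4)
```

    Note that a negative ICC, as observed for some outcomes in the abstract, gives a design effect below 1, so ignoring twin clustering is not always conservative in the same direction.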

  7. Design and simulation study of the immunization Data Quality Audit (DQA).

    PubMed

    Woodard, Stacy; Archer, Linda; Zell, Elizabeth; Ronveaux, Olivier; Birmingham, Maureen

    2007-08-01

    The goal of the Data Quality Audit (DQA) is to assess whether the Global Alliance for Vaccines and Immunization-funded countries are adequately reporting the number of diphtheria-tetanus-pertussis immunizations given, on which the "shares" are awarded. Because this sampling design is a modified two-stage cluster sample (modified because a stratified, rather than a simple, random sample of health facilities is obtained from the selected clusters), the formula for the calculation of the standard error of the estimate is unknown. An approximated standard error has been proposed, and the first goal of this simulation is to assess the accuracy of that standard error. Results from the simulations based on hypothetical populations were found not to be representative of the actual DQAs that were conducted. Additional simulations were then conducted on the actual DQA data to better assess the precision of the DQA with both the original and the increased sample sizes.

  8. Survey methods for assessing land cover map accuracy

    USGS Publications Warehouse

    Nusser, S.M.; Klaas, E.E.

    2003-01-01

    The increasing availability of digital photographic materials has fueled efforts by agencies and organizations to generate land cover maps for states, regions, and the United States as a whole. Regardless of the information sources and classification methods used, land cover maps are subject to numerous sources of error. In order to understand the quality of the information contained in these maps, it is desirable to generate statistically valid estimates of accuracy rates describing misclassification errors. We explored a full sample survey framework for creating accuracy assessment study designs that balance statistical and operational considerations in relation to study objectives for a regional assessment of GAP land cover maps. We focused not only on appropriate sample designs and estimation approaches, but on aspects of the data collection process, such as gaining cooperation of land owners and using pixel clusters as an observation unit. The approach was tested in a pilot study to assess the accuracy of Iowa GAP land cover maps. A stratified two-stage cluster sampling design addressed sample size requirements for land covers and the need for geographic spread while minimizing operational effort. Recruitment methods used for private land owners yielded high response rates, minimizing a source of nonresponse error. Collecting data for a 9-pixel cluster centered on the sampled pixel was simple to implement, and provided better information on rarer vegetation classes as well as substantial gains in precision relative to observing data at a single pixel.

  9. Grouping methods for estimating the prevalences of rare traits from complex survey data that preserve confidentiality of respondents.

    PubMed

    Hyun, Noorie; Gastwirth, Joseph L; Graubard, Barry I

    2018-03-26

    Originally, 2-stage group testing was developed for efficiently screening individuals for a disease. In response to the HIV/AIDS epidemic, 1-stage group testing was adopted for estimating prevalences of a single trait or multiple traits by testing groups of size q, so that individuals were not tested. This paper extends the methodology of 1-stage group testing to surveys with sample-weighted complex multistage-cluster designs. Sample-weighted generalized estimating equations are used to estimate the prevalences of categorical traits while accounting for the error rates inherent in the tests. Two difficulties arise when using group testing in complex samples: (1) how does one weight the results of the test on each group, given that the sample weights will differ among observations in the same group? Furthermore, if the sample weights are related to positivity of the diagnostic test, then group-level weighting is needed to reduce bias in the prevalence estimation; (2) how does one form groups that will allow accurate estimation of the standard errors of prevalence estimates under multistage-cluster sampling, allowing for intracluster correlation of the test results? We study 5 different grouping methods to address the weighting and cluster sampling aspects of complex designed samples. Finite sample properties of the estimators of prevalences, variances, and confidence interval coverage for these grouping methods are studied using simulations. National Health and Nutrition Examination Survey data are used to illustrate the methods. Copyright © 2018 John Wiley & Sons, Ltd.
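
    For a single trait and equal-weight groups, the 1-stage group-testing model with imperfect sensitivity and specificity can be inverted in closed form. The sketch below illustrates that core identity only; it is a simplification of the sample-weighted estimating equations developed in the paper, and the parameter values are invented for the round-trip check.

```python
def group_prevalence(pos_rate, q, sens=1.0, spec=1.0):
    """Invert the group-testing model
        P(group +) = sens * (1 - (1-p)^q) + (1 - spec) * (1-p)^q
    to recover individual prevalence p from the group positivity rate
    (equal-weight groups of size q; no survey weights)."""
    neg_component = (sens - pos_rate) / (sens + spec - 1.0)
    return 1.0 - neg_component ** (1.0 / q)

# round trip: forward-simulate a positivity rate, then invert it
p, q, sens, spec = 0.05, 5, 0.95, 0.99
theta = sens * (1 - (1 - p) ** q) + (1 - spec) * (1 - p) ** q
```

    The complications the paper addresses begin exactly where this sketch stops: unequal sample weights within a group and intracluster correlation of test results.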

  10. [Design of the National Surveillance of Nutritional Indicators (MONIN), Peru 2007-2010].

    PubMed

    Campos-Sánchez, Miguel; Ricaldi-Sueldo, Rita; Miranda-Cuadros, Marianella

    2011-06-01

    To describe the design and methods of the national surveillance of nutritional indicators (MONIN) 2007-2010, carried out by INS/CENAN. MONIN was designed as a continuous (repeated cross-sectional) survey with stratified multi-stage random sampling, taking as its universe all under-five children and pregnant women residing in Peru, divided into 5 geographical strata and 6 trimesters (randomly permuted weeks, covering about 78% of the time between November 19, 2007 and April 2, 2010). The total sample was 3,827 children in 361 completed clusters. The dropout rate was 8.4% for clusters, 1.8% for houses, and 13.2% for households. Dropout was 4.2%, 13.3%, 21.2%, 55%, and 29% for anthropometry, hemoglobin, food intake, retinol, and ioduria measurements, respectively. The MONIN design is feasible and useful for the estimation of indicators of childhood malnutrition.

  11. Group sequential designs for stepped-wedge cluster randomised trials

    PubMed Central

    Grayling, Michael J; Wason, James MS; Mander, Adrian P

    2017-01-01

    Background/Aims: The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Methods: Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. Results: We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial’s type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. Conclusion: The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. 
In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial. PMID:28653550

  12. Group sequential designs for stepped-wedge cluster randomised trials.

    PubMed

    Grayling, Michael J; Wason, James MS; Mander, Adrian P

    2017-10-01

    The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial's type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. 
In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial.

  13. A multi purpose source chamber at the PLEIADES beamline at SOLEIL for spectroscopic studies of isolated species: cold molecules, clusters, and nanoparticles.

    PubMed

    Lindblad, Andreas; Söderström, Johan; Nicolas, Christophe; Robert, Emmanuel; Miron, Catalin

    2013-11-01

    This paper describes the philosophy and design goals behind the construction of a versatile sample environment: a source capable of producing beams of atoms, molecules, clusters, and nanoparticles in view of studying their interaction with short wavelength (vacuum ultraviolet and x-ray) synchrotron radiation. In the design, specific care was taken to (a) use standard components and (b) ensure modularity, i.e., the ability to switch swiftly between different experimental configurations. To demonstrate the efficiency of the design, proof-of-principle experiments were conducted by recording x-ray absorption and photoelectron spectra from isolated nanoparticles (SiO2) and free mixed clusters (Ar/Xe). The results from those experiments are showcased and briefly discussed.

  14. Spatial design and strength of spatial signal: Effects on covariance estimation

    USGS Publications Warehouse

    Irvine, Kathryn M.; Gitelman, Alix I.; Hoeting, Jennifer A.

    2007-01-01

    In a spatial regression context, scientists are often interested in a physical interpretation of components of the parametric covariance function. For example, spatial covariance parameter estimates in ecological settings have been interpreted to describe spatial heterogeneity or “patchiness” in a landscape that cannot be explained by measured covariates. In this article, we investigate the influence of the strength of spatial dependence on maximum likelihood (ML) and restricted maximum likelihood (REML) estimates of covariance parameters in an exponential-with-nugget model, and we also examine these influences under different sampling designs—specifically, lattice designs and more realistic random and cluster designs—at differing intensities of sampling (n=144 and 361). We find that neither ML nor REML estimates perform well when the range parameter and/or the nugget-to-sill ratio is large—ML tends to underestimate the autocorrelation function and REML produces highly variable estimates of the autocorrelation function. The best estimates of both the covariance parameters and the autocorrelation function come under the cluster sampling design and large sample sizes. As a motivating example, we consider a spatial model for stream sulfate concentration.
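
    The exponential-with-nugget model discussed above can be written down directly. A minimal sketch (parameter names are illustrative), expressed as the semivariogram, with the nugget-to-sill ratio that the abstract identifies as a driver of estimator performance:

```python
import numpy as np

def exp_nugget_semivariogram(h, nugget, psill, range_):
    """Semivariogram of the exponential-with-nugget model:
    gamma(0) = 0 and, for h > 0,
    gamma(h) = nugget + psill * (1 - exp(-h / range_)).
    The sill is nugget + psill, so the nugget-to-sill ratio is
    nugget / (nugget + psill)."""
    h = np.asarray(h, dtype=float)
    return np.where(h == 0.0, 0.0,
                    nugget + psill * (1.0 - np.exp(-h / range_)))

h = np.array([0.0, 0.5, 1.0, 2.0, 5.0, 20.0])
gamma = exp_nugget_semivariogram(h, nugget=0.2, psill=1.0, range_=2.0)
ratio = 0.2 / (0.2 + 1.0)   # nugget-to-sill ratio
```

    In the abstract's terms, difficulties arise when `range_` and/or `ratio` are large, i.e., when spatial dependence is either very long-range or mostly swamped by microscale noise.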

  15. Sample design effects in landscape genetics

    USGS Publications Warehouse

    Oyler-McCance, Sara J.; Fedy, Bradley C.; Landguth, Erin L.

    2012-01-01

    An important research gap in landscape genetics is the impact of different field sampling designs on the ability to detect the effects of landscape pattern on gene flow. We evaluated how five different sampling regimes (random, linear, systematic, cluster, and single study site) affected the probability of correctly identifying the generating landscape process of population structure. Sampling regimes were chosen to represent a suite of designs common in field studies. We used genetic data generated from a spatially explicit, individual-based program and simulated gene flow in a continuous population across a landscape with gradual spatial changes in resistance to movement. Additionally, we evaluated the sampling regimes using realistic and obtainable numbers of loci (10 and 20), numbers of alleles per locus (5 and 10), numbers of individuals sampled (10-300), and generational times after the landscape was introduced (20 and 400). For a simulated continuously distributed species, we found that random, linear, and systematic sampling regimes performed well with high sample sizes (>200), levels of polymorphism (10 alleles per locus), and numbers of molecular markers (20). The cluster and single study site sampling regimes were not able to correctly identify the generating process under any conditions and thus are not advisable strategies for scenarios similar to our simulations. Our research emphasizes the importance of sampling data at ecologically appropriate spatial and temporal scales and suggests careful consideration for sampling near landscape components that are likely to most influence the genetic structure of the species. In addition, simulating sampling designs a priori could help guide field data collection efforts.

  16. Design-based and model-based inference in surveys of freshwater mollusks

    USGS Publications Warehouse

    Dorazio, R.M.

    1999-01-01

    Well-known concepts in statistical inference and sampling theory are used to develop recommendations for planning and analyzing the results of quantitative surveys of freshwater mollusks. Two methods of inference commonly used in survey sampling (design-based and model-based) are described and illustrated using examples relevant in surveys of freshwater mollusks. The particular objectives of a survey and the type of information observed in each unit of sampling can be used to help select the sampling design and the method of inference. For example, the mean density of a sparsely distributed population of mollusks can be estimated with higher precision by using model-based inference or by using design-based inference with adaptive cluster sampling than by using design-based inference with conventional sampling. More experience with quantitative surveys of natural assemblages of freshwater mollusks is needed to determine the actual benefits of different sampling designs and inferential procedures.

  17. Career Decision Statuses among Portuguese Secondary School Students: A Cluster Analytical Approach

    ERIC Educational Resources Information Center

    Santos, Paulo Jorge; Ferreira, Joaquim Armando

    2012-01-01

    Career indecision is a complex phenomenon and an increasing number of authors have proposed that undecided individuals do not form a group with homogeneous characteristics. This study examines career decision statuses among a sample of 362 12th-grade Portuguese students. A cluster-analytical procedure, based on a battery of instruments designed to…

  18. Fusion And Inference From Multiple And Massive Disparate Distributed Dynamic Data Sets

    DTIC Science & Technology

    2017-07-01

    Developed a principled methodology for two-sample graph testing and designed a provably almost-surely perfect vertex clustering algorithm for block model graphs. Report sections include semi-supervised clustering methodology and robust hypothesis testing. Embedding graphs in a finite-dimensional Euclidean space allows the full arsenal of statistical and machine learning methodology for multivariate Euclidean data to be deployed.

  19. A comprehensive HST BVI catalogue of star clusters in five Hickson compact groups of galaxies

    NASA Astrophysics Data System (ADS)

    Fedotov, K.; Gallagher, S. C.; Durrell, P. R.; Bastian, N.; Konstantopoulos, I. S.; Charlton, J.; Johnson, K. E.; Chandar, R.

    2015-05-01

    We present a photometric catalogue of star cluster candidates in Hickson compact groups (HCGs) 7, 31, 42, 59, and 92, based on observations with the Advanced Camera for Surveys and the Wide Field Camera 3 on the Hubble Space Telescope. The catalogue contains precise cluster positions (right ascension and declination), magnitudes, and colours in the BVI filters. The number of detected sources ranges from 2200 to 5600 per group, from which we construct the high-confidence sample by applying a number of criteria designed to reduce foreground and background contaminants. Furthermore, the high-confidence cluster candidates for each of the 16 galaxies in our sample are split into two subpopulations: one that may contain young star clusters and one that is dominated by older globular clusters. The ratio of young star cluster to globular cluster candidates varies from group to group, from equal numbers to the extreme of HCG 31, which has a ratio of 8 to 1 due to a recent starburst induced by interactions in the group. We find that the number of blue clusters with MV < -9 correlates well with the current star formation rate in an individual galaxy, while the number of globular cluster candidates with MV < -7.8 correlates well (though with large scatter) with the stellar mass. Analyses of the high-confidence sample presented in this paper show that star clusters can be successfully used to infer the gross star formation history of the host groups and therefore determine their placement in a proposed evolutionary sequence for compact galaxy groups.

  20. Multilevel Factorial Experiments for Developing Behavioral Interventions: Power, Sample Size, and Resource Considerations†

    PubMed Central

    Dziak, John J.; Nahum-Shani, Inbal; Collins, Linda M.

    2012-01-01

    Factorial experimental designs have many potential advantages for behavioral scientists. For example, such designs may be useful in building more potent interventions, by helping investigators to screen several candidate intervention components simultaneously and decide which are likely to offer greater benefit before evaluating the intervention as a whole. However, sample size and power considerations may challenge investigators attempting to apply such designs, especially when the population of interest is multilevel (e.g., when students are nested within schools, or employees within organizations). In this article we examine the feasibility of factorial experimental designs with multiple factors in a multilevel, clustered setting (i.e., of multilevel multifactor experiments). We conduct Monte Carlo simulations to demonstrate how design elements such as the number of clusters, the number of lower-level units, and the intraclass correlation affect power. Our results suggest that multilevel, multifactor experiments are feasible for factor-screening purposes, because of the economical properties of complete and fractional factorial experimental designs. We also discuss resources for sample size planning and power estimation for multilevel factorial experiments. These results are discussed from a resource management perspective, in which the goal is to choose a design that maximizes the scientific benefit using the resources available for an investigation. PMID:22309956

  1. Multilevel factorial experiments for developing behavioral interventions: power, sample size, and resource considerations.

    PubMed

    Dziak, John J; Nahum-Shani, Inbal; Collins, Linda M

    2012-06-01

    Factorial experimental designs have many potential advantages for behavioral scientists. For example, such designs may be useful in building more potent interventions by helping investigators to screen several candidate intervention components simultaneously and to decide which are likely to offer greater benefit before evaluating the intervention as a whole. However, sample size and power considerations may challenge investigators attempting to apply such designs, especially when the population of interest is multilevel (e.g., when students are nested within schools, or when employees are nested within organizations). In this article, we examine the feasibility of factorial experimental designs with multiple factors in a multilevel, clustered setting (i.e., of multilevel, multifactor experiments). We conduct Monte Carlo simulations to demonstrate how design elements, such as the number of clusters, the number of lower-level units, and the intraclass correlation, affect power. Our results suggest that multilevel, multifactor experiments are feasible for factor-screening purposes because of the economical properties of complete and fractional factorial experimental designs. We also discuss resources for sample size planning and power estimation for multilevel factorial experiments. These results are discussed from a resource management perspective, in which the goal is to choose a design that maximizes the scientific benefit using the resources available for an investigation. (c) 2012 APA, all rights reserved
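    The influence of the intraclass correlation on power in a clustered experiment can be reproduced with a much smaller Monte Carlo than the paper's. A hedged sketch comparing two arms on cluster means (the cluster-means analysis, effect size, and normal critical value are simplifying assumptions, not the authors' model):

```python
import math
import random
import statistics

def simulated_power(n_clusters, cluster_size, icc, effect=0.3,
                    sims=400, seed=1):
    """Monte Carlo power for a two-arm comparison with clusters randomized,
    analysed on cluster means; total outcome variance is 1, split so the
    intraclass correlation equals `icc` (a simplified sketch)."""
    rng = random.Random(seed)
    sd_between, sd_within = math.sqrt(icc), math.sqrt(1 - icc)
    hits = 0
    for _ in range(sims):
        means = {0: [], 1: []}
        for arm in (0, 1):
            for _ in range(n_clusters):
                u = rng.gauss(arm * effect, sd_between)  # cluster-level effect
                ys = [u + rng.gauss(0, sd_within) for _ in range(cluster_size)]
                means[arm].append(statistics.mean(ys))
        diff = statistics.mean(means[1]) - statistics.mean(means[0])
        pooled = math.sqrt((statistics.variance(means[0])
                            + statistics.variance(means[1])) / 2)
        t = diff / (pooled * math.sqrt(2 / n_clusters))
        hits += abs(t) > 1.96  # normal approximation to the t critical value
    return hits / sims
```

    With 10 clusters of 20 per arm, raising the ICC from 0.01 to 0.3 sharply reduces power, which is why planning advice emphasises the number of clusters over the number of lower-level units.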

  2. Clustered lot quality assurance sampling to assess immunisation coverage: increasing rapidity and maintaining precision.

    PubMed

    Pezzoli, Lorenzo; Andrews, Nick; Ronveaux, Olivier

    2010-05-01

    Vaccination programmes targeting disease elimination aim to achieve very high coverage levels (e.g. 95%). We calculated the precision of different clustered lot quality assurance sampling (LQAS) designs in computer-simulated surveys to provide local health officers in the field with preset LQAS plans to simply and rapidly assess programmes with high coverage targets. We calculated sample size (N), decision value (d) and misclassification errors (alpha and beta) of several LQAS plans by running 10 000 simulations. We kept the upper coverage threshold (UT) at 90% or 95% and decreased the lower threshold (LT) progressively by 5%. We measured the proportion of simulations with ≤ d unvaccinated individuals if the coverage was set at the UT (pUT) to calculate beta (1-pUT) and the proportion of simulations with > d unvaccinated individuals if the coverage was LT% (pLT) to calculate alpha (1-pLT). We divided N into clusters (between 5 and 10) and recalculated the errors hypothesising that the coverage would vary across the clusters according to a binomial distribution with preset standard deviations of 0.05 and 0.1 from the mean lot coverage. We selected the plans fulfilling these criteria: alpha ≤ 5% and beta ≤ 20% in the unclustered design; alpha ≤ 10% and beta ≤ 25% when the lots were divided into five clusters. When the interval between UT and LT was larger than 10% (e.g. 15%), we were able to select precise LQAS plans dividing the lot into five clusters with N = 50 (5 x 10) and d = 4 to evaluate programmes with a 95% coverage target and d = 7 to evaluate programmes with a 90% target. These plans will considerably increase the feasibility and rapidity of conducting LQAS in the field.
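    The error calculations described above are straightforward to replicate. A sketch of the unclustered simulation (the plan parameters follow the abstract; the clustered extension with binomially varying coverage is omitted):

```python
import random

def lqas_errors(n, d, upper, lower, sims=10_000, seed=2):
    """Estimate misclassification errors of an LQAS plan (sample size n,
    decision value d) by simulation: a lot is 'accepted' when at most d
    sampled individuals are unvaccinated."""
    rng = random.Random(seed)

    def accept_rate(coverage):
        accepted = 0
        for _ in range(sims):
            unvaccinated = sum(rng.random() > coverage for _ in range(n))
            accepted += unvaccinated <= d
        return accepted / sims

    beta = 1 - accept_rate(upper)   # rejecting a lot truly at the UT
    alpha = accept_rate(lower)      # accepting a lot truly at the LT
    return alpha, beta
```

    For the abstract's N = 50, d = 4 plan with a 95% target and a lower threshold of 80%, both simulated errors fall comfortably inside the stated tolerances.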

  3. A multi purpose source chamber at the PLEIADES beamline at SOLEIL for spectroscopic studies of isolated species: Cold molecules, clusters, and nanoparticles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lindblad, Andreas; Söderström, Johan; Nicolas, Christophe

    2013-11-15

    This paper describes the philosophy and design goals regarding the construction of a versatile sample environment: a source capable of producing beams of atoms, molecules, clusters, and nanoparticles in view of studying their interaction with short wavelength (vacuum ultraviolet and x-ray) synchrotron radiation. In the design, specific care has been taken to (a) use standard components and (b) ensure modularity, i.e., that swift switching between different experimental configurations is possible. To demonstrate the efficiency of the design, proof-of-principle experiments have been conducted by recording x-ray absorption and photoelectron spectra from isolated nanoparticles (SiO2) and free mixed clusters (Ar/Xe). The results from those experiments are showcased and briefly discussed.

  4. National traffic speeds survey I: 2007 : traffic tech.

    DOT National Transportation Integrated Search

    2012-09-01

    The speed survey reviewed in this edition of "Traffic Tech" was designed as a geographic cluster : sample of primary sampling units (PSUs), which can be : a city, county, or group of two or three counties. PSUs : were chosen to represent a range of c...

  5. Satellite quenching time-scales in clusters from projected phase space measurements matched to simulated orbits

    NASA Astrophysics Data System (ADS)

    Oman, Kyle A.; Hudson, Michael J.

    2016-12-01

    We measure the star formation quenching efficiency and time-scale in cluster environments. Our method uses N-body simulations to estimate the probability distribution of possible orbits for a sample of observed Sloan Digital Sky Survey galaxies in and around clusters based on their position and velocity offsets from their host cluster. We study the relationship between their star formation rates and their likely orbital histories via a simple model in which star formation is quenched once a delay time after infall has elapsed. Our orbit library method is designed to isolate the environmental effect on the star formation rate due to a galaxy's present-day host cluster from `pre-processing' in previous group hosts. We find that quenching of satellite galaxies of all stellar masses in our sample (10^{9}-10^{11.5} M_{⊙}) by massive (> 10^{13} M_{⊙}) clusters is essentially 100 per cent efficient. Our fits show that all galaxies quench on their first infall, approximately at or within a Gyr of their first pericentric passage. There is little variation in the onset of quenching from galaxy to galaxy: the spread in this time is at most ~2 Gyr at fixed M*. Higher mass satellites quench earlier, with very little dependence on host cluster mass in the range probed by our sample.

  6. Russian consumers' motives for food choice.

    PubMed

    Honkanen, Pirjo; Frewer, Lynn

    2009-04-01

    Knowledge about food choice motives which have potential to influence consumer consumption decisions is important when designing food and health policies, as well as marketing strategies. Russian consumers' food choice motives were studied in a survey (1081 respondents across four cities), with the purpose of identifying consumer segments based on these motives. These segments were then profiled using consumption, attitudinal and demographic variables. Face-to-face interviews were used to sample the data, which were analysed with two-step cluster analysis (SPSS). Three clusters emerged, representing 21.5%, 45.8% and 32.7% of the sample. The clusters were similar in terms of the order of motivations, but differed in motivational level. Sensory factors and availability were the most important motives for food choice in all three clusters, followed by price. This may reflect the turbulence which Russia has recently experienced politically and economically. Cluster profiles differed in relation to socio-demographic factors, consumption patterns and attitudes towards health and healthy food.

  7. Comparison of Precision of Biomass Estimates in Regional Field Sample Surveys and Airborne LiDAR-Assisted Surveys in Hedmark County, Norway

    NASA Technical Reports Server (NTRS)

    Naesset, Erik; Gobakken, Terje; Bollandsas, Ole Martin; Gregoire, Timothy G.; Nelson, Ross; Stahl, Goeran

    2013-01-01

    Airborne scanning LiDAR (Light Detection and Ranging) has emerged as a promising tool to provide auxiliary data for sample surveys aiming at estimation of above-ground tree biomass (AGB), with potential applications in REDD forest monitoring. For larger geographical regions such as counties, states or nations, it is not feasible to collect airborne LiDAR data continuously ("wall-to-wall") over the entire area of interest. Two-stage cluster survey designs have therefore been demonstrated by which LiDAR data are collected along selected individual flight-lines treated as clusters and with ground plots sampled along these LiDAR swaths. Recently, analytical AGB estimators and associated variance estimators that quantify the sampling variability have been proposed. Empirical studies employing these estimators have shown a seemingly equal or even larger uncertainty of the AGB estimates obtained with extensive use of LiDAR data to support the estimation as compared to pure field-based estimates employing estimators appropriate under simple random sampling (SRS). However, comparison of uncertainty estimates under SRS and sophisticated two-stage designs is complicated by large differences in the designs and assumptions. In this study, probability-based principles to estimation and inference were followed. We assumed designs of a field sample and a LiDAR-assisted survey of Hedmark County (HC) (27,390 km2), Norway, considered to be more comparable than those assumed in previous studies. The field sample consisted of 659 systematically distributed National Forest Inventory (NFI) plots and the airborne scanning LiDAR data were collected along 53 parallel flight-lines flown over the NFI plots. We compared AGB estimates based on the field survey only assuming SRS against corresponding estimates assuming two-phase (double) sampling with LiDAR and employing model-assisted estimators. 
We also compared AGB estimates based on the field survey only assuming two-stage sampling (the NFI plots being grouped in clusters) against corresponding estimates assuming two-stage sampling with the LiDAR and employing model-assisted estimators. For each of the two comparisons, the standard errors of the AGB estimates were consistently lower for the LiDAR-assisted designs. The overall reduction of the standard errors in the LiDAR-assisted estimation was around 40-60% compared to the pure field survey. We conclude that the previously proposed two-stage model-assisted estimators are inappropriate for surveys with unequal lengths of the LiDAR flight-lines and new estimators are needed. Some options for design of LiDAR-assisted sample surveys under REDD are also discussed, which capitalize on the flexibility offered when the field survey is designed as an integrated part of the overall survey design as opposed to previous LiDAR-assisted sample surveys in the boreal and temperate zones which have been restricted by the current design of an existing NFI.

  8. The Importance and Role of Intracluster Correlations in Planning Cluster Trials

    PubMed Central

    Preisser, John S.; Reboussin, Beth A.; Song, Eun-Young; Wolfson, Mark

    2008-01-01

    There is increasing recognition of the critical role of intracluster correlations of health behavior outcomes in cluster intervention trials. This study examines the estimation, reporting, and use of intracluster correlations in planning cluster trials. We use an estimating equations approach to estimate the intracluster correlations corresponding to the multiple-time-point nested cross-sectional design. Sample size formulae incorporating 2 types of intracluster correlations are examined for the purpose of planning future trials. The traditional intracluster correlation is the correlation among individuals within the same community at a specific time point. A second type is the correlation among individuals within the same community at different time points. For a “time × condition” analysis of a pretest–posttest nested cross-sectional trial design, we show that statistical power considerations based upon a posttest-only design generally are not an adequate substitute for sample size calculations that incorporate both types of intracluster correlations. Estimation, reporting, and use of intracluster correlations are illustrated for several dichotomous measures related to underage drinking collected as part of a large nonrandomized trial to enforce underage drinking laws in the United States from 1998 to 2004. PMID:17879427

  9. Substance use disorders in a sample of Canadian patients with chronic mental illness.

    PubMed

    Toner, B B; Gillies, L A; Prendergast, P; Cote, F H; Browne, C

    1992-03-01

    In a study designed to investigate the pattern of substance use disorders among a group of chronic mentally ill patients in Toronto, 102 patients completed the Structured Clinical Interview for DSM-III-R and a modified substance-use-disorder module of the Diagnostic Interview Schedule. Forty percent of the sample met criteria for substance use disorders, and 49 percent for personality disorder. Among patients with personality disorder, all those with a personality disorder in cluster B (that is, with antisocial, borderline, histrionic, or narcissistic personality disorder) had a substance use disorder, while the majority of patients in cluster A and cluster C were not substance abusers. In the overall sample, the group with substance use disorders was significantly younger than the group without. In contrast to findings of previous studies, women met criteria for substance use disorders as often as men did.

  10. Baseline adjustments for binary data in repeated cross-sectional cluster randomized trials.

    PubMed

    Nixon, R M; Thompson, S G

    2003-09-15

    Analysis of covariance models, which adjust for a baseline covariate, are often used to compare treatment groups in a controlled trial in which individuals are randomized. Such analysis adjusts for any baseline imbalance and usually increases the precision of the treatment effect estimate. We assess the value of such adjustments in the context of a cluster randomized trial with repeated cross-sectional design and a binary outcome. In such a design, a new sample of individuals is taken from the clusters at each measurement occasion, so that baseline adjustment has to be at the cluster level. Logistic regression models are used to analyse the data, with cluster level random effects to allow for different outcome probabilities in each cluster. We compare the estimated treatment effect and its precision in models that incorporate a covariate measuring the cluster level probabilities at baseline and those that do not. In two data sets, taken from a cluster randomized trial in the treatment of menorrhagia, the value of baseline adjustment is only evident when the number of subjects per cluster is large. We assess the generalizability of these findings by undertaking a simulation study, and find that increased precision of the treatment effect requires both large cluster sizes and substantial heterogeneity between clusters at baseline, but baseline imbalance arising by chance in a randomized study can always be effectively adjusted for. Copyright 2003 John Wiley & Sons, Ltd.

  11. Adaptive cluster sampling: An efficient method for assessing inconspicuous species

    Treesearch

    Andrea M. Silletti; Joan Walker

    2003-01-01

    Restorationists typically evaluate the success of a project by estimating the population sizes of species that have been planted or seeded. Because a total census is rarely feasible, they must rely on sampling methods for population estimates. However, traditional random sampling designs may be inefficient for species that, for one reason or another, are challenging to...
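    Adaptive cluster sampling starts from an ordinary random sample and then expands around any unit that satisfies the condition of interest, here presence of the species. A hedged sketch on a gridded plot (the grid representation, 4-neighbour rule, and presence condition are illustrative assumptions):

```python
import random

def adaptive_cluster_sample(grid, n_initial, seed=3):
    """Adaptive cluster sampling on a dict {(row, col): count} grid: draw a
    simple random sample of cells and, whenever a sampled cell contains the
    species (count > 0), add its 4 neighbours, repeating until no new
    occupied cells are found (illustrative sketch)."""
    rng = random.Random(seed)
    sampled = set(rng.sample(list(grid), n_initial))
    frontier = [cell for cell in sampled if grid[cell] > 0]
    while frontier:
        r, c = frontier.pop()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in grid and nb not in sampled:
                sampled.add(nb)
                if grid[nb] > 0:
                    frontier.append(nb)
    return sampled
```

    Because effort concentrates automatically around occupied patches, the design tends to beat simple random sampling for rare, clumped species, which is the efficiency argument of the abstract.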

  12. A cluster-randomised, controlled trial to assess the impact of a workplace osteoporosis prevention intervention on the dietary and physical activity behaviours of working women: study protocol

    PubMed Central

    2013-01-01

    Background Osteoporosis is a debilitating disease and its risk can be reduced through adequate calcium consumption and physical activity. This protocol paper describes a workplace-based intervention targeting behaviour change in premenopausal women working in sedentary occupations. Method/Design A cluster-randomised design was used, comparing the efficacy of a tailored intervention to standard care. Workplaces were the clusters and units of randomisation and intervention. Sample size calculations incorporated the cluster design. The final number of clusters was determined to be 16, based on a cluster size of 20 and calcium intake parameters (effect size 250 mg, ICC 0.5 and standard deviation 290 mg), as this outcome required the highest number of clusters. Sixteen workplaces were recruited from a pool of 97 workplaces and randomly assigned to intervention and control arms (eight in each). Women meeting specified inclusion criteria were then recruited to participate. Workplaces in the intervention arm received three participatory workshops and organisation-wide educational activities. Workplaces in the control/standard care arm received print resources. Intervention workshops were guided by self-efficacy theory and included participatory activities such as goal setting, problem solving, local food sampling, exercise trials, group discussion and behaviour feedback. Outcome measures were calcium intake (milligrams/day) and physical activity level (duration: minutes/week), measured at baseline, four weeks and six months post-intervention. Discussion This study addresses the current lack of evidence for behaviour change interventions focussing on osteoporosis prevention. It addresses missed opportunities of using workplaces as a platform to target high-risk individuals with sedentary occupations. The intervention was designed to modify behaviour levels to bring about risk reduction.
It is the first to address dietary and physical activity components, each with unique intervention strategies, in the context of osteoporosis prevention. The intervention used locally relevant behavioural strategies previously shown to support good outcomes in other countries. The combination of these elements has not been incorporated in similar studies in the past, supporting the study hypothesis that the intervention will be more efficacious than standard practice in osteoporosis prevention through improvements in calcium intake and physical activity. PMID:23627684

  13. Portable ultrahigh-vacuum sample storage system for polarization-dependent total-reflection fluorescence x-ray absorption fine structure spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Watanabe, Yoshihide, E-mail: e0827@mosk.tytlabs.co.jp; Nishimura, Yusaku F.; Suzuki, Ryo

    A portable ultrahigh-vacuum sample storage system was designed and built to investigate the detailed geometric structures of mass-selected metal clusters on oxide substrates by polarization-dependent total-reflection fluorescence x-ray absorption fine structure spectroscopy (PTRF-XAFS). This ultrahigh-vacuum (UHV) sample storage system provides the handover of samples between two different sample manipulating systems. The sample storage system is adaptable for public transportation, facilitating experiments using air-sensitive samples in synchrotron radiation or other quantum beam facilities. The samples were transferred by the developed portable UHV transfer system via public transportation over a distance of more than 400 km. The performance of the transfer system was demonstrated by a successful PTRF-XAFS study of Pt4 clusters deposited on a TiO2(110) surface.

  14. A two-stage cluster sampling method using gridded population data, a GIS, and Google Earth(TM) imagery in a population-based mortality survey in Iraq.

    PubMed

    Galway, Lp; Bell, Nathaniel; Sae, Al Shatari; Hagopian, Amy; Burnham, Gilbert; Flaxman, Abraham; Weiss, Wiliam M; Rajaratnam, Julie; Takaro, Tim K

    2012-04-27

    Mortality estimates can measure and monitor the impacts of conflict on a population, guide humanitarian efforts, and help to better understand the public health impacts of conflict. Vital statistics registration and surveillance systems are rarely functional in conflict settings, posing the challenge of estimating mortality using retrospective population-based surveys. We present a two-stage cluster sampling method for application in population-based mortality surveys. The sampling method utilizes gridded population data and a geographic information system (GIS) to select clusters in the first sampling stage and Google Earth TM imagery and sampling grids to select households in the second sampling stage. The sampling method was implemented in a household mortality study in Iraq in 2011. Factors affecting feasibility and methodological quality are described. Sampling is a challenge in retrospective population-based mortality studies, and alternatives that improve on the conventional approaches are needed. The sampling strategy presented here was designed to generate a representative sample of the Iraqi population while reducing the potential for bias and considering the context-specific challenges of the study setting. This sampling strategy, or variations on it, is adaptable and should be considered and tested in other conflict settings.

  15. A two-stage cluster sampling method using gridded population data, a GIS, and Google EarthTM imagery in a population-based mortality survey in Iraq

    PubMed Central

    2012-01-01

    Background Mortality estimates can measure and monitor the impacts of conflict on a population, guide humanitarian efforts, and help to better understand the public health impacts of conflict. Vital statistics registration and surveillance systems are rarely functional in conflict settings, posing the challenge of estimating mortality using retrospective population-based surveys. Results We present a two-stage cluster sampling method for application in population-based mortality surveys. The sampling method utilizes gridded population data and a geographic information system (GIS) to select clusters in the first sampling stage and Google Earth TM imagery and sampling grids to select households in the second sampling stage. The sampling method was implemented in a household mortality study in Iraq in 2011. Factors affecting feasibility and methodological quality are described. Conclusion Sampling is a challenge in retrospective population-based mortality studies, and alternatives that improve on the conventional approaches are needed. The sampling strategy presented here was designed to generate a representative sample of the Iraqi population while reducing the potential for bias and considering the context-specific challenges of the study setting. This sampling strategy, or variations on it, is adaptable and should be considered and tested in other conflict settings. PMID:22540266
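    The first stage described above, selecting clusters with probability proportional to gridded population, can be sketched with a cumulative-sum lookup; the second stage here simply draws household indices within each selected cell (the real survey overlaid sampling grids on Google Earth imagery, which is not reproduced here):

```python
import random
from bisect import bisect_left
from itertools import accumulate

def two_stage_sample(cell_populations, n_clusters, households_per_cluster,
                     seed=4):
    """Stage 1: draw grid cells with probability proportional to population
    (with replacement; repeated draws collapse). Stage 2: draw an equal
    number of household indices within each selected cell. An illustrative
    sketch; selected cells are assumed to have positive population."""
    rng = random.Random(seed)
    cumulative = list(accumulate(cell_populations))
    total = cumulative[-1]
    chosen = {bisect_left(cumulative, rng.uniform(0, total))
              for _ in range(n_clusters)}
    return {cell: [rng.randrange(cell_populations[cell])
                   for _ in range(households_per_cluster)]
            for cell in chosen}
```

    Weighting the first stage by gridded population keeps the household sample approximately self-weighting, which is the property the design relies on to be representative.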

  16. Nurses' beliefs about nursing diagnosis: A study with cluster analysis.

    PubMed

    D'Agostino, Fabio; Pancani, Luca; Romero-Sánchez, José Manuel; Lumillo-Gutierrez, Iris; Paloma-Castro, Olga; Vellone, Ercole; Alvaro, Rosaria

    2018-06-01

    To identify clusters of nurses in relation to their beliefs about nursing diagnosis among two populations (Italian and Spanish); to investigate differences among clusters of nurses in each population considering the nurses' socio-demographic data, attitudes towards nursing diagnosis, intentions to make nursing diagnoses and actual behaviours in making nursing diagnoses. Nurses' beliefs concerning nursing diagnosis can influence its use in practice, but this relationship is still unclear. A cross-sectional design. A convenience sample of nurses in Italy and Spain was enrolled. Data were collected between 2014 and 2015 using a socio-demographic questionnaire and scales measuring behavioural, normative and control beliefs, attitudes, intentions and behaviours. The sample included 499 nurses (272 Italian and 227 Spanish). Of these, 66.5% of the Italian and 90.7% of the Spanish sample were female. The mean age was 36.5 years in the Italian sample and 45.2 years in the Spanish sample. Six clusters of nurses were identified in Spain and four in Italy. Three clusters were similar across the two populations. Similar significant associations between age, years of work, attitudes towards nursing diagnosis, intentions to make nursing diagnoses, behaviours in making nursing diagnoses and cluster membership were identified in each population. Belief profiles identified unique subsets of nurses with distinct characteristics. Categorizing nurses by belief patterns may help administrators and educators to tailor interventions aimed at improving nursing diagnosis use in practice. © 2018 John Wiley & Sons Ltd.

  17. A pilot cluster randomized controlled trial of structured goal-setting following stroke.

    PubMed

    Taylor, William J; Brown, Melanie; William, Levack; McPherson, Kathryn M; Reed, Kirk; Dean, Sarah G; Weatherall, Mark

    2012-04-01

    To determine the feasibility, the cluster design effect and the variance and minimal clinically important difference in the primary outcome in a pilot study of a structured approach to goal-setting. A cluster randomized controlled trial. Inpatient rehabilitation facilities. People who were admitted to inpatient rehabilitation following stroke who had sufficient cognition to engage in structured goal-setting and complete the primary outcome measure. Structured goal elicitation using the Canadian Occupational Performance Measure. Quality of life at 12 weeks using the Schedule for Individualised Quality of Life (SEIQOL-DW), Functional Independence Measure, Short Form 36 and Patient Perception of Rehabilitation (measuring satisfaction with rehabilitation). Assessors were blinded to the intervention. Four rehabilitation services and 41 patients were randomized. We found high values of the intraclass correlation for the outcome measures (ranging from 0.03 to 0.40) and high variance of the SEIQOL-DW (SD 19.6) in relation to the minimal clinically important difference of 2.1, leading to impractically large sample size requirements for a cluster randomized design. A cluster randomized design is not a practical means of avoiding contamination effects in studies of inpatient rehabilitation goal-setting. Other techniques for coping with contamination effects are necessary.
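    The abstract's conclusion can be checked with the standard formula for a continuous outcome: individuals per arm under individual randomization, inflated by the design effect and divided into clusters. A sketch using the reported SEIQOL-DW numbers (the alpha and power constants are conventional choices, not stated in the abstract):

```python
import math

def clusters_per_arm(sd, mid, icc, cluster_size):
    """Clusters needed per arm to detect a difference of `mid` in a
    continuous outcome at two-sided alpha = 0.05 with 80% power, given the
    outcome SD, intraclass correlation, and cluster size."""
    z = 1.96 + 0.8416                               # z_{alpha/2} + z_{power}
    n_individual = 2 * z ** 2 * (sd / mid) ** 2     # per arm, no clustering
    design_effect = 1 + (cluster_size - 1) * icc
    return math.ceil(n_individual * design_effect / cluster_size)
```

    Plugging in SD 19.6, a minimal clinically important difference of 2.1, an ICC of 0.4 and 10 patients per cluster gives on the order of six hundred clusters per arm, consistent with the authors' verdict that the requirement is impractically large.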

  18. Nonlinear inversion of electrical resistivity imaging using pruning Bayesian neural networks

    NASA Astrophysics Data System (ADS)

    Jiang, Fei-Bo; Dai, Qian-Wei; Dong, Li

    2016-06-01

    Conventional artificial neural networks used to solve the electrical resistivity imaging (ERI) inversion problem suffer from overfitting and local minima. To solve these problems, we propose a pruning Bayesian neural network (PBNN) nonlinear inversion method and a sample design method based on the K-medoids clustering algorithm. In the sample design method, the training samples of the neural network are designed according to the prior information provided by the K-medoids clustering results; thus, the training process of the neural network is well guided. The proposed PBNN, based on Bayesian regularization, is used to select the hidden layer structure by assessing the effect of each hidden neuron on the inversion results. Then, the hyperparameter α_k, which is based on the generalized mean, is chosen to guide the pruning process according to the prior distribution of the training samples under the small-sample condition. The proposed algorithm is more efficient than other common adaptive regularization methods in geophysics. The inversion of synthetic data and field data suggests that the proposed method suppresses noise in the neural network training stage and enhances generalization. The inversion results with the proposed method are better than those of the BPNN, RBFNN, and RRBFNN inversion methods as well as conventional least squares inversion.
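    The K-medoids step used above to design representative training samples can be sketched with a plain Voronoi-iteration variant (the abstract gives no implementation details, so the squared-Euclidean distance and update rule here are generic assumptions):

```python
import random

def k_medoids(points, k, iters=50, seed=5):
    """Voronoi-iteration K-medoids: assign points to the nearest medoid,
    then move each medoid to the cluster member minimizing total in-cluster
    distance; repeat until stable (an illustrative sketch)."""
    rng = random.Random(seed)

    def dist(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    medoids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, medoids[i]))].append(p)
        new = [min(c, key=lambda m: sum(dist(m, p) for p in c)) if c
               else medoids[i] for i, c in enumerate(clusters)]
        if new == medoids:
            break
        medoids = new
    return medoids
```

    Because medoids are always actual data points, each one can serve directly as a representative training sample, which is the role clustering plays in the paper's sample design step.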

  19. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR)

    PubMed Central

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-01-01

    Background: Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods: We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application: We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. 
Conclusion: This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100
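The deliberate cluster-selection step can be sketched as a small combinatorial search: among candidate clusters, pick the subset whose pooled distribution of the chosen determinants best matches the city-wide distribution. All data below are hypothetical, and the L1 distance with brute-force enumeration is an illustrative choice, not the authors' procedure:

```python
from itertools import combinations

def choose_clusters(clusters, population, n_pick):
    """Pick the combination of clusters whose pooled covariate
    proportions best match the population (smallest L1 distance).
    Each cluster is (size, {covariate: count})."""
    keys = population.keys()

    def l1(combo):
        total = sum(size for size, _ in combo)
        d = 0.0
        for key in keys:
            p_combo = sum(counts[key] for _, counts in combo) / total
            d += abs(p_combo - population[key])
        return d

    return min(combinations(clusters, n_pick), key=l1)

# Hypothetical clusters: (size, counts of two determinants).
clusters = [
    (100, {"literate": 90, "lao": 95}),
    (100, {"literate": 40, "lao": 60}),
    (100, {"literate": 70, "lao": 80}),
    (100, {"literate": 20, "lao": 30}),
]
population = {"literate": 0.55, "lao": 0.70}  # assumed city-wide proportions
best = choose_clusters(clusters, population, n_pick=2)
```

Brute force is fine for a handful of candidate clusters; a real survey with many clusters would need a greedy or heuristic search instead.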

  20. Sampling in health geography: reconciling geographical objectives and probabilistic methods. An example of a health survey in Vientiane (Lao PDR).

    PubMed

    Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard

    2007-06-01

    Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. 
This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy.

  1. Cancer detection based on Raman spectra super-paramagnetic clustering

    NASA Astrophysics Data System (ADS)

    González-Solís, José Luis; Guizar-Ruiz, Juan Ignacio; Martínez-Espinosa, Juan Carlos; Martínez-Zerega, Brenda Esmeralda; Juárez-López, Héctor Alfonso; Vargas-Rodríguez, Héctor; Gallegos-Infante, Luis Armando; González-Silva, Ricardo Armando; Espinoza-Padilla, Pedro Basilio; Palomares-Anda, Pascual

    2016-08-01

    The clustering of Raman spectra of serum samples is analyzed using the super-paramagnetic clustering technique based on the Potts spin model. We investigated the clustering of biochemical networks by using Raman data that define edge lengths in the network, and where the interactions are functions of the Raman spectra's individual band intensities. For this study, we used two groups of 58 and 102 control Raman spectra and the intensities of 160, 150 and 42 Raman spectra of serum samples from breast cancer, cervical cancer and leukemia patients, respectively. The spectra were collected from patients at different hospitals in Mexico. By using the super-paramagnetic clustering technique, we identified the most natural and compact clusters, allowing us to discriminate between the control and cancer patients. Of special interest was the leukemia case, where the nearly hierarchical structure observed allowed identification of each patient's leukemia type. The goal of this study is to apply a model of statistical physics, such as the super-paramagnetic model, to find these natural clusters that allow us to design a cancer detection method. To the best of our knowledge, this is the first report of preliminary results evaluating the usefulness of super-paramagnetic clustering in the discipline of spectroscopy, where it is used for classification of spectra.

  2. ADAPTIVE MATCHING IN RANDOMIZED TRIALS AND OBSERVATIONAL STUDIES

    PubMed Central

    van der Laan, Mark J.; Balzer, Laura B.; Petersen, Maya L.

    2014-01-01

    SUMMARY In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, the observed data can neither be represented as the realization of n independent random variables, nor, contrary to current practice, as the realization of n/2 independent random variables (treating the matched pair as the independent sampling unit). In this paper we study estimation of the average causal effect of a treatment under experimental designs in which treatment allocation potentially depends on the pre-intervention covariates of all units included in the sample. We define efficient targeted minimum loss based estimators for this general design, present a theorem that establishes the desired asymptotic normality of these estimators and allows for asymptotically valid statistical inference, and discuss implementation of these estimators. We further investigate the relative asymptotic efficiency of this design compared with a design in which unit-specific treatment assignment depends only on the units’ covariates. Our findings have practical implications for the optimal design and analysis of pair matched cluster randomized trials, as well as for observational studies in which treatment decisions may depend on characteristics of the entire sample. PMID:25097298
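The pair-matched design analyzed here can be illustrated with a toy matching-and-randomization step. The communities and covariate values are hypothetical, and greedy adjacent pairing after a sort is a simplification of matching on a full baseline similarity metric:

```python
import random

def pair_match_randomize(units, covariate, seed=0):
    """Greedy pair matching on a baseline covariate (sort, then pair
    adjacent units), followed by random treatment/control allocation
    within each matched pair."""
    rng = random.Random(seed)
    ordered = sorted(units, key=covariate)  # similar units become adjacent
    assignment = {}
    for a, b in zip(ordered[::2], ordered[1::2]):
        treat, control = rng.sample([a, b], 2)  # randomize within the pair
        assignment[treat] = "treatment"
        assignment[control] = "control"
    return assignment

# Hypothetical communities with a baseline disease-prevalence covariate.
prevalence = {"A": 0.12, "B": 0.31, "C": 0.14, "D": 0.29, "E": 0.20, "F": 0.22}
arms = pair_match_randomize(list(prevalence), covariate=lambda c: prevalence[c])
```

The paper's point is that the resulting data are neither n independent units nor n/2 independent pairs, because the pairing itself depends on the covariates of the whole sample.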

  3. Estimating the intra-cluster correlation coefficient for evaluating an educational intervention program to improve rabies awareness and dog bite prevention among children in Sikkim, India: A pilot study.

    PubMed

    Auplish, Aashima; Clarke, Alison S; Van Zanten, Trent; Abel, Kate; Tham, Charmaine; Bhutia, Thinlay N; Wilks, Colin R; Stevenson, Mark A; Firestone, Simon M

    2017-05-01

    Educational initiatives targeting at-risk populations have long been recognized as a mainstay of ongoing rabies control efforts. Cluster-based studies are often utilized to assess levels of knowledge, attitudes and practices of a population in response to education campaigns. The design of cluster-based studies requires estimates of intra-cluster correlation coefficients obtained from previous studies. This study estimates the school-level intra-cluster correlation coefficient (ICC) for rabies knowledge change following an educational intervention program. A cross-sectional survey was conducted with 226 students from 7 schools in Sikkim, India, using cluster sampling. In order to assess knowledge uptake, rabies education sessions with pre- and post-session questionnaires were administered. Paired differences of proportions were estimated for questions answered correctly. A mixed effects logistic regression model was developed to estimate school-level and student-level ICCs and to test for associations between gender, age, school location and educational level. The school- and student-level ICCs for rabies knowledge and awareness were 0.04 (95% CI: 0.01, 0.19) and 0.05 (95% CI: 0.02, 0.09), respectively. These ICCs suggest that design effect multipliers of 5.45 at the school level and 1.05 at the student level will be required when estimating sample sizes and designing future cluster randomized trials. There was a good baseline level of rabies knowledge (mean pre-session score 71%); however, key knowledge gaps were identified in understanding appropriate behavior around scared dogs, potential sources of rabies and how to correctly order post-rabies-exposure precaution steps. After adjusting for the effect of gender, age, school location and education level, school and individual post-session test scores improved by 19%, with similar performance amongst boys and girls attending schools in urban and rural regions. 
The proportion of participants that were able to correctly order post-exposure precautionary steps following educational intervention increased by 87%. The ICC estimates presented in this study will aid in designing cluster-based studies evaluating educational interventions as part of disease control programs. This study demonstrates the likely benefits of educational intervention incorporating bite prevention and rabies education. Copyright © 2017 Elsevier B.V. All rights reserved.
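The abstract's ICCs come from a mixed-effects model; a simpler, classical way to estimate an ICC from clustered scores, shown here as a sketch, is the one-way ANOVA estimator (balanced clusters assumed for simplicity; the per-school scores below are hypothetical, not the study's data):

```python
def anova_icc(groups):
    """One-way ANOVA estimator of the intra-cluster correlation:
    ICC = (MSB - MSW) / (MSB + (m0 - 1) * MSW), where MSB/MSW are the
    between/within mean squares and m0 is the common cluster size."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    msb, msw = ssb / (k - 1), ssw / (n - k)
    m0 = n / k  # balanced clusters assumed
    return (msb - msw) / (msb + (m0 - 1) * msw)

# Hypothetical per-school knowledge scores (three schools, four students each).
scores = [[72, 68, 75, 70], [81, 85, 83, 84], [60, 62, 58, 61]]
icc = anova_icc(scores)
design_effect = 1 + (4 - 1) * icc  # the multiplier a cluster design requires
```

The design effect multipliers quoted in the abstract are exactly this 1 + (m − 1)·ICC quantity evaluated at each level of clustering.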

  4. The Method of Randomization for Cluster-Randomized Trials: Challenges of Including Patients with Multiple Chronic Conditions

    PubMed Central

    Esserman, Denise; Allore, Heather G.; Travison, Thomas G.

    2016-01-01

    Cluster-randomized clinical trials (CRT) are trials in which the unit of randomization is not a participant but a group (e.g. healthcare systems or community centers). They are suitable when the intervention applies naturally to the cluster (e.g. healthcare policy); when lack of independence among participants may occur (e.g. nursing home hygiene); or when it is most ethical to apply an intervention to all within a group (e.g. school-level immunization). Because participants in the same cluster receive the same intervention, CRT may approximate clinical practice, and may produce generalizable findings. However, when not properly designed or interpreted, CRT may yield biased results. CRT designs have features that add complexity to statistical estimation and inference. Chief among these is the cluster-level correlation in response measurements induced by the randomization. A critical consideration is the experimental unit of inference; often it is desirable to consider intervention effects at the level of the individual rather than the cluster. Finally, given that the number of clusters available may be limited, simple forms of randomization may not achieve balance between intervention and control arms at either the cluster- or participant-level. In non-clustered clinical trials, balance of key factors may be easier to achieve because the sample can be homogenous by exclusion of participants with multiple chronic conditions (MCC). CRTs, which are often pragmatic, may eschew such restrictions. Failure to account for imbalance may introduce bias and reduce validity. This article focuses on the complexities of randomization in the design of CRTs, such as the inclusion of patients with MCC, and imbalances in covariate factors across clusters. PMID:27478520

  5. Homogeneity tests of clustered diagnostic markers with applications to the BioCycle Study

    PubMed Central

    Tang, Liansheng Larry; Liu, Aiyi; Schisterman, Enrique F.; Zhou, Xiao-Hua; Liu, Catherine Chun-ling

    2014-01-01

    Diagnostic trials often require the use of a homogeneity test among several markers. Such a test may be necessary to determine the power both during the design phase and in the initial analysis stage. However, no formal method is available for the power and sample size calculation when the number of markers is greater than two and marker measurements are clustered in subjects. This article presents two procedures for testing the accuracy among clustered diagnostic markers. The first procedure is a test of homogeneity among continuous markers based on a global null hypothesis of the same accuracy. The result under the alternative provides the explicit distribution for the power and sample size calculation. The second procedure is a simultaneous pairwise comparison test based on weighted areas under the receiver operating characteristic curves. This test is particularly useful if a global difference among markers is found by the homogeneity test. We apply our procedures to the BioCycle Study designed to assess and compare the accuracy of hormone and oxidative stress markers in distinguishing women with ovulatory menstrual cycles from those without. PMID:22733707

  6. HIFLUGCS: X-ray luminosity-dynamical mass relation and its implications for mass calibrations with the SPIDERS and 4MOST surveys

    NASA Astrophysics Data System (ADS)

    Zhang, Yu-Ying; Reiprich, Thomas H.; Schneider, Peter; Clerc, Nicolas; Merloni, Andrea; Schwope, Axel; Borm, Katharina; Andernach, Heinz; Caretta, César A.; Wu, Xiang-Ping

    2017-03-01

    We present the relation of X-ray luminosity versus dynamical mass for 63 nearby clusters of galaxies in a flux-limited sample, the HIghest X-ray FLUx Galaxy Cluster Sample (HIFLUGCS, consisting of 64 clusters). The luminosity measurements are obtained based on 1.3 Ms of clean XMM-Newton data and ROSAT pointed observations. The masses are estimated using optical spectroscopic redshifts of 13647 cluster galaxies in total. We classify clusters into disturbed and undisturbed based on a combination of the X-ray luminosity concentration and the offset between the brightest cluster galaxy and X-ray flux-weighted center. Given sufficient numbers (i.e., ≥45) of member galaxies when the dynamical masses are computed, the luminosity versus mass relations agree between the disturbed and undisturbed clusters. The cool-core clusters still dominate the scatter in the luminosity versus mass relation even when a core-corrected X-ray luminosity is used, which indicates that the scatter of this scaling relation mainly reflects the structure formation history of the clusters. As shown by the clusters with only few spectroscopically confirmed members, the dynamical masses can be underestimated and thus lead to a biased scaling relation. To investigate the potential of spectroscopic surveys to follow up high-redshift galaxy clusters or groups observed in X-ray surveys for the identifications and mass calibrations, we carried out Monte Carlo resampling of the cluster galaxy redshifts and calibrated the uncertainties of the redshift and dynamical mass estimates when only reduced numbers of galaxy redshifts per cluster are available. The resampling considers the SPIDERS and 4MOST configurations, designed for the follow-up of the eROSITA clusters, and was carried out for each cluster in the sample at the actual cluster redshift as well as at the assigned input cluster redshifts of 0.2, 0.4, 0.6, and 0.8. 
To follow up very distant clusters or groups, we also carried out the mass calibration based on the resampling with only ten redshifts per cluster, and redshift calibration based on the resampling with only five and ten redshifts per cluster, respectively. Our results demonstrate the power of combining upcoming X-ray and optical spectroscopic surveys for mass calibration of clusters. The scatter in the dynamical mass estimates for the clusters with at least ten members is within 50%.

  7. The Distance to the Coma Cluster from the Tully-Fisher Relation

    NASA Astrophysics Data System (ADS)

    Herter, T.; Vogt, N. P.; Haynes, M. P.; Giovanelli, R.

    1993-12-01

    As part of a survey to determine the distances to nearby (z < 0.04) Abell clusters via application of the Tully-Fisher (TF) relation, we have obtained 21 cm HI line widths, optical rotation curves and photometric I-band CCD images of galaxies within and near the Coma cluster. Because spiral galaxies within the cluster itself are HI deficient and thus are detected marginally or not at all in HI, distance determinations using only the radio TF relation exclude true cluster members. Our sample includes eight HI deficient galaxies within 1.5 degrees of the cluster center, for which optical velocity widths are derived from their Hα and [NII] rotation curves. The 21 cm line widths have been extracted using a new algorithm designed to optimize the measurement for TF applications, taking into account the effects of spectral resolution and smoothing. The optical width is constructed from the velocity histogram, and is therefore a global value akin to the HI width. A correction for turbulent broadening of the HI is derived from comparison of the optical and HI widths. Using a combined sample of 260 galaxies in 11 clusters and an additional 30 field objects at comparable distances, we have performed a calibration of the radio and optical analogs of the TF relation. Preliminary results show a clear linear relationship with a small offset between optical and radio widths, and good agreement in deriving Tully-Fisher distances to clusters. Our Coma sample consists of 28 galaxies with optical widths and 42 with HI line widths, with an overlapping set of 20 galaxies. We will present the data on the Coma cluster, and discuss the results of our analysis.

  8. Cluster analysis of molecular simulation trajectories for systems where both conformation and orientation of the sampled states are important.

    PubMed

    Abramyan, Tigran M; Snyder, James A; Thyparambil, Aby A; Stuart, Steven J; Latour, Robert A

    2016-08-05

    Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
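One common objective criterion for comparing candidate partitions, of the kind such validation procedures rely on, is the mean silhouette width. A self-contained sketch follows; the silhouette is one of many validation indices, and the adsorbed-protein "states" and combined orientation/conformation distance below are hypothetical:

```python
import math

def silhouette(points, labels, dist):
    """Mean silhouette width: for each point, a = mean distance to its
    own cluster, b = smallest mean distance to another cluster, and the
    width is (b - a) / max(a, b)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            continue  # singleton clusters contribute 0 by convention
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(sum(dist(points[i], points[j]) for j in js) / len(js)
                for other, js in groups.items() if other != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical states as (orientation angle, conformational RMSD) pairs,
# with plain Euclidean distance standing in for a real combined metric.
states = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
d = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
tight = silhouette(states, [0, 0, 0, 1, 1, 1], d)
mixed = silhouette(states, [0, 1, 0, 1, 0, 1], d)
```

Evaluating this score over a range of cluster counts, or over partitions from different algorithms, gives the kind of objective comparison the abstract describes (requires at least two clusters).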

  9. An Australian Version of the Neighborhood Environment Walkability Scale: Validity Evidence

    ERIC Educational Resources Information Center

    Cerin, Ester; Leslie, Eva; Owen, Neville; Bauman, Adrian

    2008-01-01

    This study examined validity evidence for the Australian version of the Neighborhood Environment Walkability Scale (NEWS-AU). A stratified two-stage cluster sampling design was used to recruit 2,650 adults from Adelaide (Australia). The sample was drawn from residential addresses within eight high-walkable and eight low-walkable suburbs matched…

  10. Groundwater source contamination mechanisms: Physicochemical profile clustering, risk factor analysis and multivariate modelling

    NASA Astrophysics Data System (ADS)

    Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.

    2014-04-01

    An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.

  11. Linking Teacher Competences to Organizational Citizenship Behaviour: The Role of Empowerment

    ERIC Educational Resources Information Center

    Kasekende, Francis; Munene, John C.; Otengei, Samson Omuudu; Ntayi, Joseph Mpeera

    2016-01-01

    Purpose: The purpose of this paper is to examine relationship between teacher competences and organizational citizenship behavior (OCB) with empowerment as a mediating factor. Design/methodology/approach: The study took a cross-sectional descriptive and analytical design. Using cluster and random sampling procedures, data were obtained from 383…

  12. Anomalous diameter distribution shifts estimated from FIA inventories through time

    Treesearch

    Francis A. Roesch; Paul C. Van Deusen

    2010-01-01

    In the past decade, the United States Department of Agriculture Forest Service’s Forest Inventory and Analysis Program (FIA) has replaced regionally autonomous, periodic, state-wide forest inventories using various probability proportional to tree size sampling designs with a nationally consistent annual forest inventory design utilizing systematically spaced clusters...

  13. Clustering of longitudinal data by using an extended baseline: A new method for treatment efficacy clustering in longitudinal data.

    PubMed

    Schramm, Catherine; Vial, Céline; Bachoud-Lévi, Anne-Catherine; Katsahian, Sandrine

    2018-01-01

    Heterogeneity in treatment efficacy is a major concern in clinical trials. Clustering may help to identify the treatment responders and the non-responders. In the context of longitudinal cluster analyses, sample size and variability of the times of measurements are the main issues with the current methods. Here, we propose a new two-step method for the Clustering of Longitudinal data by using an Extended Baseline. The first step relies on a piecewise linear mixed model for repeated measurements with a treatment-time interaction. The second step clusters the random predictions and considers several parametric (model-based) and non-parametric (partitioning, ascendant hierarchical clustering) algorithms. A simulation study compares all options of the clustering of longitudinal data by using an extended baseline method with the latent-class mixed model. The clustering of longitudinal data by using an extended baseline method with the two model-based algorithms was the most robust option. The clustering of longitudinal data by using an extended baseline method with all the non-parametric algorithms failed when there were unequal variances of treatment effect between clusters or when the subgroups had unbalanced sample sizes. The latent-class mixed model failed when the between-patient slope variability was high. Two real data sets on neurodegenerative disease and on obesity illustrate the clustering of longitudinal data by using an extended baseline method and show how clustering may help to identify the marker(s) of the treatment response. The application of the clustering of longitudinal data by using an extended baseline method in exploratory analysis as the first stage before setting up stratified designs can provide a better estimation of treatment effect in future clinical trials.
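The two-step idea (summarize each patient's trajectory, then cluster the summaries) can be caricatured with per-subject least-squares slopes and a simple cutpoint. This stands in for, and greatly simplifies, the piecewise linear mixed model and model-based clustering the authors actually use; all measurements and the threshold are hypothetical:

```python
def fit_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    tbar = sum(times) / n
    vbar = sum(values) / n
    num = sum((t - tbar) * (v - vbar) for t, v in zip(times, values))
    den = sum((t - tbar) ** 2 for t in times)
    return num / den

def two_step_cluster(subjects, threshold):
    """Step 1: summarize each subject's trajectory by its slope.
    Step 2: split into responders/non-responders at a cutpoint (a crude
    stand-in for clustering the random effects of a mixed model)."""
    slopes = {sid: fit_slope(ts, vs) for sid, (ts, vs) in subjects.items()}
    labels = {sid: ("responder" if s < threshold else "non-responder")
              for sid, s in slopes.items()}
    return labels, slopes

# Hypothetical repeated measurements: (weeks, symptom score), lower = better.
subjects = {
    "s1": ([0, 4, 8, 12], [10, 8, 6, 4]),    # steadily improving
    "s2": ([0, 4, 8, 12], [10, 10, 9, 10]),  # essentially flat
    "s3": ([0, 4, 8, 12], [9, 7, 5, 3]),     # steadily improving
}
labels, slopes = two_step_cluster(subjects, threshold=-0.2)
```

The real method replaces the raw slopes with random-effect predictions from a mixed model, which is what lets it handle irregular measurement times and small samples.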

  14. Joint scaling properties of Sunyaev-Zel'dovich and optical richness observables in an optically-selected galaxy cluster sample

    NASA Astrophysics Data System (ADS)

    Greer, Christopher Holland

    Galaxy cluster abundance measurements are an important tool used to study the universe as a whole. The advent of multiple large-area galaxy cluster surveys across multiple wavelengths ensures that cluster measurements will play a key role in understanding the dark energy currently thought to be accelerating the universe. The main systematic limitation at the moment is the understanding of the observable-mass relation. Recent theoretical work has shown that combining samples of clusters from surveys at different wavelengths can mitigate this systematic limitation. Precise measurements of the scatter in the observable-mass relation can lead to further improvements. We present Combined Array for Research in Millimeter-wave Astronomy (CARMA) observations of the Sunyaev-Zel'dovich (SZ) signal for 28 galaxy clusters selected from the Sloan Digital Sky Survey (SDSS) maxBCG catalog. This cluster sample represents a complete, volume-limited sample of the richest galaxy clusters in the SDSS between redshifts 0.2 ≤ z ≤ 0.3, as measured by the RedMaPPer algorithm being developed for the Dark Energy Survey (DES; Rykoff et al. 2012). We develop a formalism that uses the cluster abundance in tandem with the galaxy richness measurements from SDSS and the SZ signal measurements from CARMA to calibrate the SZ and optical observable-mass relations. We find that the scatter in richness at fixed mass is σ_log λ|M = 0.24 (+0.09, −0.07) using the SZ signal calculated by integrating a cluster pressure profile to a radius of 1 Mpc at the redshift of the cluster. We also calculate the SZ signal at R500 and find that the choice of scaling relation used to determine R500 has a non-trivial effect on the constraints of the observable-mass relationship. Finally, we investigate the source of disagreement between the positions of the SZ signal and SDSS Brightest Cluster Galaxies (BCGs). 
Improvements to the richness calculator that account for blue BCGs in the cores of cool-core X-ray clusters, as well as multiple BCGs in merger situations, will help reduce σ_log λ|M further. This work is the first independent calibration of the RedMaPPer algorithm that is being designed for the Dark Energy Survey.

  15. Exploring cluster Monte Carlo updates with Boltzmann machines

    NASA Astrophysics Data System (ADS)

    Wang, Lei

    2017-11-01

    Boltzmann machines are physics-informed generative models with broad applications in machine learning. They model the probability distribution of an input data set with latent variables and generate new samples accordingly. Applied back to physics, Boltzmann machines are ideal recommender systems to accelerate the Monte Carlo simulation of physical systems due to their flexibility and effectiveness. More intriguingly, we show that the generative sampling of the Boltzmann machines can even give different cluster Monte Carlo algorithms. The latent representation of the Boltzmann machines can be designed to mediate complex interactions and identify clusters of the physical system. We demonstrate these findings with concrete examples of the classical Ising model with and without four-spin plaquette interactions. In the future, automatic searches in the algorithm space parametrized by Boltzmann machines may discover more innovative Monte Carlo updates.
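The classical baseline such learned updates generalize is the Wolff cluster algorithm for the Ising model. A minimal sketch of the standard algorithm (not the paper's Boltzmann-machine sampler; lattice size, coupling, and seed are arbitrary choices):

```python
import math
import random

def wolff_update(spins, L, beta, rng):
    """One Wolff cluster update for the 2-D Ising model (J = 1):
    grow a cluster of aligned spins, adding each aligned neighbor with
    probability p = 1 - exp(-2*beta), then flip the whole cluster."""
    p_add = 1.0 - math.exp(-2.0 * beta)
    seed = rng.randrange(L * L)
    s0 = spins[seed]
    cluster, stack = {seed}, [seed]
    while stack:
        i = stack.pop()
        x, y = i % L, i // L
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            j = (nx % L) + (ny % L) * L  # periodic boundaries
            if j not in cluster and spins[j] == s0 and rng.random() < p_add:
                cluster.add(j)
                stack.append(j)
    for i in cluster:
        spins[i] = -spins[i]
    return len(cluster)

L, beta = 8, 1.0  # coupling chosen well inside the ordered phase
rng = random.Random(1)
spins = [rng.choice([-1, 1]) for _ in range(L * L)]
for _ in range(200):
    wolff_update(spins, L, beta, rng)
magnetization = abs(sum(spins)) / (L * L)
```

The paper's observation is that a Boltzmann machine's latent units can play the role the bond variables play here, mediating the cluster growth for more general interactions such as the four-spin plaquette term.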

  16. THE RED-SEQUENCE CLUSTER SURVEY-2 (RCS-2): SURVEY DETAILS AND PHOTOMETRIC CATALOG CONSTRUCTION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbank, David G.; Gladders, M. D.; Yee, H. K. C.

    2011-03-15

    The second Red-sequence Cluster Survey (RCS-2) is a ~1000 deg², multi-color imaging survey using the square-degree imager, MegaCam, on the Canada-France-Hawaii Telescope. It is designed to detect clusters of galaxies over the redshift range 0.1 ≲ z ≲ 1. The primary aim is to build a statistically complete, large (~10⁴) sample of clusters, covering a sufficiently long redshift baseline to be able to place constraints on cosmological parameters via the evolution of the cluster mass function. Other main science goals include building a large sample of high surface brightness, strongly gravitationally lensed arcs associated with these clusters, and an unprecedented sample of several tens of thousands of galaxy clusters and groups, spanning a large range of halo mass, with which to study the properties and evolution of their member galaxies. This paper describes the design of the survey and the methodology for acquiring, reducing, and calibrating the data for the production of high-precision photometric catalogs. We describe the method for calibrating our griz imaging data using the colors of the stellar locus and overlapping Two Micron All Sky Survey photometry. This yields an absolute accuracy of <0.03 mag on any color and ~0.05 mag in the r-band magnitude, verified with respect to the Sloan Digital Sky Survey (SDSS). Our astrometric calibration is accurate to ≪0.3″ from comparison with SDSS positions. RCS-2 reaches average 5σ point-source limiting magnitudes of griz = [24.4, 24.3, 23.7, 22.8], approximately 1-2 mag deeper than the SDSS. Due to the queue-scheduled nature of the observations, the data are highly uniform and taken in excellent seeing, mostly FWHM ≲ 0.7″ in the r band. In addition to the main science goals just described, these data form the basis for a number of other planned and ongoing projects (including the WiggleZ survey), making RCS-2 an important next-generation imaging survey.

  17. GalWeight: A New and Effective Weighting Technique for Determining Galaxy Cluster and Group Membership

    NASA Astrophysics Data System (ADS)

    Abdullah, Mohamed H.; Wilson, Gillian; Klypin, Anatoly

    2018-07-01

    We introduce GalWeight, a new technique for assigning galaxy cluster membership. This technique is specifically designed to simultaneously maximize the number of bona fide cluster members while minimizing the number of contaminating interlopers. The GalWeight technique can be applied to both massive galaxy clusters and poor galaxy groups. Moreover, it is effective in identifying members in both the virial and infall regions with high efficiency. We apply the GalWeight technique to MDPL2 and Bolshoi N-body simulations, and find that it is >98% accurate in correctly assigning cluster membership. We show that GalWeight compares very favorably against four well-known existing cluster membership techniques (shifting gapper, den Hartog, caustic, SIM). We also apply the GalWeight technique to a sample of 12 Abell clusters (including the Coma cluster) using observations from the Sloan Digital Sky Survey. We conclude by discussing GalWeight’s potential for other astrophysical applications.

  18. Objective sampling design in a highly heterogeneous landscape - characterizing environmental determinants of malaria vector distribution in French Guiana, in the Amazonian region.

    PubMed

    Roux, Emmanuel; Gaborit, Pascal; Romaña, Christine A; Girod, Romain; Dessay, Nadine; Dusfour, Isabelle

    2013-12-01

    Sampling design is a key issue when establishing species inventories and characterizing habitats within highly heterogeneous landscapes. Sampling efforts in such environments may be constrained and many field studies only rely on subjective and/or qualitative approaches to design collection strategy. The region of Cacao, in French Guiana, provides an excellent study site to understand the presence and abundance of Anopheles mosquitoes, their species dynamics and the transmission risk of malaria across various environments. We propose an objective methodology to define a stratified sampling design. Following thorough environmental characterization, a factorial analysis of mixed groups allows the data to be reduced and non-collinear principal components to be identified while balancing the influences of the different environmental factors. Such components defined new variables which could then be used in a robust k-means clustering procedure. Then, we identified five clusters that corresponded to our sampling strata and selected sampling sites in each stratum. We validated our method by comparing the species overlap of entomological collections from selected sites and the environmental similarities of the same sites. The Morisita index was significantly correlated (Pearson linear correlation) with environmental similarity based on i) the balanced environmental variable groups considered jointly (p = 0.001) and ii) land cover/use (p < 0.001). The Jaccard index was significantly correlated with land cover/use-based environmental similarity (p = 0.001). The results validate our sampling approach. Land cover/use maps (based on high spatial resolution satellite images) were shown to be particularly useful when studying the presence, density and diversity of Anopheles mosquitoes at local scales and in very heterogeneous landscapes.
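The stratification workflow in this record (reduce the environmental variables to uncorrelated components, cluster them, then select sites per stratum) can be sketched as follows. This is a minimal stand-in, not the authors' pipeline: it assumes synthetic standardized component scores in place of the factorial analysis of mixed groups, and uses plain Lloyd's k-means rather than a robust variant.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Hypothetical standardized environmental scores standing in for the
# principal components produced by the factorial analysis of mixed groups.
rng = np.random.default_rng(42)
scores = rng.normal(size=(200, 3))

strata, _ = kmeans(scores, k=5)      # five strata, as in the study
# One candidate sampling site per (non-empty) stratum
sites = [int(np.flatnonzero(strata == j)[0])
         for j in range(5) if np.any(strata == j)]
print(len(sites))
```

In the study itself, each stratum would then be mapped back onto the landscape and field collection sites chosen within it.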

  19. Objective sampling design in a highly heterogeneous landscape - characterizing environmental determinants of malaria vector distribution in French Guiana, in the Amazonian region

    PubMed Central

    2013-01-01

    Background Sampling design is a key issue when establishing species inventories and characterizing habitats within highly heterogeneous landscapes. Sampling efforts in such environments may be constrained and many field studies only rely on subjective and/or qualitative approaches to design collection strategy. The region of Cacao, in French Guiana, provides an excellent study site to understand the presence and abundance of Anopheles mosquitoes, their species dynamics and the transmission risk of malaria across various environments. We propose an objective methodology to define a stratified sampling design. Following thorough environmental characterization, a factorial analysis of mixed groups allows the data to be reduced and non-collinear principal components to be identified while balancing the influences of the different environmental factors. Such components defined new variables which could then be used in a robust k-means clustering procedure. Then, we identified five clusters that corresponded to our sampling strata and selected sampling sites in each stratum. Results We validated our method by comparing the species overlap of entomological collections from selected sites and the environmental similarities of the same sites. The Morisita index was significantly correlated (Pearson linear correlation) with environmental similarity based on i) the balanced environmental variable groups considered jointly (p = 0.001) and ii) land cover/use (p < 0.001). The Jaccard index was significantly correlated with land cover/use-based environmental similarity (p = 0.001). Conclusions The results validate our sampling approach. Land cover/use maps (based on high spatial resolution satellite images) were shown to be particularly useful when studying the presence, density and diversity of Anopheles mosquitoes at local scales and in very heterogeneous landscapes. PMID:24289184

  20. A cluster-randomised, controlled trial to assess the impact of a workplace osteoporosis prevention intervention on the dietary and physical activity behaviours of working women: study protocol.

    PubMed

    Tan, Ai May; Lamontagne, Anthony D; Sarmugam, Rani; Howard, Peter

    2013-04-29

    Osteoporosis is a debilitating disease and its risk can be reduced through adequate calcium consumption and physical activity. This protocol paper describes a workplace-based intervention targeting behaviour change in premenopausal women working in sedentary occupations. A cluster-randomised design was used, comparing the efficacy of a tailored intervention to standard care. Workplaces were the clusters and units of randomisation and intervention. Sample size calculations incorporated the cluster design. The final number of clusters was determined to be 16, based on a cluster size of 20 and calcium intake parameters (effect size 250 mg, ICC 0.5 and standard deviation 290 mg), as these required the highest number of clusters. Sixteen workplaces were recruited from a pool of 97 workplaces and randomly assigned to intervention and control arms (eight in each). Women meeting specified inclusion criteria were then recruited to participate. Workplaces in the intervention arm received three participatory workshops and organisation-wide educational activities. Workplaces in the control/standard care arm received print resources. Intervention workshops were guided by self-efficacy theory and included participatory activities such as goal setting, problem solving, local food sampling, exercise trials, group discussion and behaviour feedback. Outcome measures were calcium intake (milligrams/day) and physical activity level (duration: minutes/week), measured at baseline, four weeks and six months post intervention. This study addresses the current lack of evidence for behaviour change interventions focussing on osteoporosis prevention. It addresses missed opportunities of using workplaces as a platform to target high-risk individuals with sedentary occupations. The intervention was designed to modify behaviour levels to bring about risk reduction.
    It is the first to address dietary and physical activity components, each with unique intervention strategies, in the context of osteoporosis prevention. The intervention used locally relevant behavioural strategies previously shown to support good outcomes in other countries. The combination of these elements has not been incorporated in similar studies in the past, supporting the study hypothesis that the intervention will be more efficacious than standard practice in osteoporosis prevention through improvements in calcium intake and physical activity.

  1. Intraclass correlation and design effect in BMI, physical activity and diet: a cross-sectional study of 56 countries.

    PubMed

    Masood, Mohd; Reidpath, Daniel D

    2016-01-07

    Measuring the intraclass correlation coefficient (ICC) and design effect (DE) may help to modify the public health interventions for body mass index (BMI), physical activity and diet according to geographic targeting of interventions in different countries. The purpose of this study was to quantify the level of clustering and DE in BMI, physical activity and diet in 56 low-income, middle-income and high-income countries. Cross-sectional study design. Multicountry national survey data. The World Health Survey (WHS), 2003, data were used to examine clustering in BMI, physical activity in metabolic equivalent of task (MET) and diet in fruits and vegetables intake (FVI) from low-income, middle-income and high-income countries. Multistage sampling in the WHS used geographical clusters as primary sampling units (PSU). These PSUs were used as a clustering or grouping variable in this analysis. Multilevel intercept only regression models were used to calculate the ICC and DE for each country. The median ICC (0.039) and median DE (1.82) for BMI were low; however, FVI had a higher median ICC (0.189) and median DE (4.16). For MET, the median ICC was 0.141 and median DE was 4.59. In some countries, however, the ICC and DE for BMI were large. For instance, South Africa had the highest ICC (0.39) and DE (11.9) for BMI, whereas Uruguay had the highest ICC (0.434) for MET and Ethiopia had the highest ICC (0.471) for FVI. This study shows that across a wide range of countries, there was low area level clustering for BMI, whereas MET and FVI showed high area level clustering. These results suggested that the country level clustering effect should be considered in developing preventive approaches for BMI, as well as improving physical activity and healthy diets for each country. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
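The relationship between the ICC and DE values reported here follows the standard Kish formula, DE = 1 + (m − 1) × ICC, where m is the average cluster (PSU) size. A small sketch, using an illustrative cluster size rather than the actual WHS one:

```python
def design_effect(icc, avg_cluster_size):
    """Kish design effect for cluster sampling: DE = 1 + (m - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc

def effective_sample_size(n, icc, avg_cluster_size):
    """Actual sample size deflated by the design effect."""
    return n / design_effect(icc, avg_cluster_size)

# Illustrative only: an ICC of 0.039 with clusters of ~21 respondents
# nearly doubles the variance relative to simple random sampling.
print(round(design_effect(0.039, 21), 2))             # 1.78
print(round(effective_sample_size(1000, 0.039, 21)))  # 562
```

The same arithmetic explains why the high-ICC outcomes (MET, FVI) carry much larger design effects than BMI at comparable cluster sizes.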

  2. Validity, Reliability and Difficulty Indices for Instructor-Built Exam Questions

    ERIC Educational Resources Information Center

    Jandaghi, Gholamreza; Shaterian, Fatemeh

    2008-01-01

    The purpose of the research is to determine college instructors' skill in designing exam questions in the chemistry subject. The statistical population was all chemistry exam sheets for two semesters in one academic year, from which a sample of 364 exam sheets was drawn using multistage cluster sampling. Two experts assessed the sheets and by…

  3. Application of adaptive cluster sampling to low-density populations of freshwater mussels

    USGS Publications Warehouse

    Smith, D.R.; Villella, R.F.; Lemarie, D.P.

    2003-01-01

    Freshwater mussels appear to be promising candidates for adaptive cluster sampling because they are benthic macroinvertebrates that cluster spatially and are frequently found at low densities. We applied adaptive cluster sampling to estimate density of freshwater mussels at 24 sites along the Cacapon River, WV, where a preliminary timed search indicated that mussels were present at low density. Adaptive cluster sampling increased yield of individual mussels and detection of uncommon species; however, it did not improve precision of density estimates. Because finding uncommon species, collecting individuals of those species, and estimating their densities are important conservation activities, additional research is warranted on application of adaptive cluster sampling to freshwater mussels. However, at this time we do not recommend routine application of adaptive cluster sampling to freshwater mussel populations. The ultimate, and currently unanswered, question is how to tell when adaptive cluster sampling should be used, i.e., when is a population sufficiently rare and clustered for adaptive cluster sampling to be efficient and practical? A cost-effective procedure needs to be developed to identify biological populations for which adaptive cluster sampling is appropriate.
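The core adaptive rule — keep expanding a sampled unit's neighbourhood whenever the condition of interest is met — can be sketched on a grid. This is a toy illustration under assumed details (rook neighbourhoods, a simple presence condition), not the authors' field protocol:

```python
import numpy as np
from collections import deque

def adaptive_cluster_sample(grid, initial, condition=lambda y: y > 0):
    """Expand an initial sample of grid cells: whenever a sampled cell
    satisfies the condition (e.g. mussels present), its four rook
    neighbours are added, and so on until no new cell qualifies."""
    nrow, ncol = grid.shape
    sampled = set()
    queue = deque(initial)
    while queue:
        r, c = queue.popleft()
        if (r, c) in sampled:
            continue
        sampled.add((r, c))
        if condition(grid[r, c]):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol and (rr, cc) not in sampled:
                    queue.append((rr, cc))
    return sampled

# A patchy, low-density population: one small aggregation in a 5x5 grid
pop = np.zeros((5, 5), dtype=int)
pop[1, 1], pop[1, 2], pop[2, 1] = 4, 2, 1
units = adaptive_cluster_sample(pop, initial=[(1, 1), (4, 4)])
print(len(units))  # the initial unit at (1, 1) grows into a network
```

An empty initial unit such as (4, 4) stays a single sampled cell, which is why yield rises only when the initial sample intersects an aggregation.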

  4. Statistical strategy for inventorying and monitoring the ecosystem resources of the Mexican States of Jalisco and Colima at multiple scales and resolution levels

    Treesearch

    H. T. Schreuder; M. S. Williams; C. Aguirre-Bravo; P. L. Patterson

    2003-01-01

    The sampling strategy is presented for the initial phase of the natural resources pilot project in the Mexican States of Jalisco and Colima. The sampling design used is ground-based cluster sampling with poststratification based on Landsat Thematic Mapper imagery. The data collected will serve as a basis for additional data collection, mapping, and spatial modeling...

  5. X-ray versus infrared selection of distant galaxy clusters: A case study using the XMM-LSS and SpARCS cluster samples

    NASA Astrophysics Data System (ADS)

    Willis, J. P.; Ramos-Ceja, M. E.; Muzzin, A.; Pacaud, F.; Yee, H. K. C.; Wilson, G.

    2018-04-01

    We present a comparison of two samples of z > 0.8 galaxy clusters selected using different wavelength-dependent techniques and examine the physical differences between them. We consider 18 clusters from the X-ray selected XMM-LSS distant cluster survey and 92 clusters from the optical-MIR selected SpARCS cluster survey. Both samples are selected from the same approximately 9 square degree sky area and we examine them using common XMM-Newton, Spitzer-SWIRE and CFHT Legacy Survey data. Clusters from each sample are compared employing aperture measures of X-ray and MIR emission. We divide the SpARCS distant cluster sample into three sub-samples: a) X-ray bright, b) X-ray faint, MIR bright, and c) X-ray faint, MIR faint clusters. We determine that X-ray and MIR selected clusters display very similar surface brightness distributions of galaxy MIR light. In addition, the average location and amplitude of the galaxy red sequence as measured from stacked colour histograms is very similar in the X-ray and MIR-selected samples. The sub-sample of X-ray faint, MIR bright clusters displays a distribution of BCG-barycentre position offsets which extends to higher values than all other samples. This observation indicates that such clusters may exist in a more disturbed state compared to the majority of the distant cluster population sampled by XMM-LSS and SpARCS. This conclusion is supported by stacked X-ray images for the X-ray faint, MIR bright cluster sub-sample that display weak, centrally-concentrated X-ray emission, consistent with a population of growing clusters accreting from an extended envelope of material.

  6. Biases in the OSSOS Detection of Large Semimajor Axis Trans-Neptunian Objects

    NASA Astrophysics Data System (ADS)

    Gladman, Brett; Shankman, Cory; OSSOS Collaboration

    2017-10-01

    The accumulating but small set of large semimajor axis trans-Neptunian objects (TNOs) shows an apparent clustering in the orientations of their orbits. This clustering must either be representative of the intrinsic distribution of these TNOs, or else have arisen as a result of observation biases and/or statistically expected variations for such a small set of detected objects. The clustered TNOs were detected across different and independent surveys, which has led to claims that the detections are therefore free of observational bias. This apparent clustering has led to the so-called “Planet 9” hypothesis that a super-Earth currently resides in the distant solar system and causes this clustering. The Outer Solar System Origins Survey (OSSOS) is a large program that ran on the Canada-France-Hawaii Telescope from 2013 to 2017, discovering more than 800 new TNOs. One of the primary design goals of OSSOS was the careful determination of observational biases that would manifest within the detected sample. We demonstrate the striking and non-intuitive biases that exist for the detection of TNOs with large semimajor axes. The eight large semimajor axis OSSOS detections are an independent data set, of comparable size to the conglomerate samples used in previous studies. We conclude that the orbital distribution of the OSSOS sample is consistent with being detected from a uniform underlying angular distribution.

  7. OSSOS. VI. Striking Biases in the Detection of Large Semimajor Axis Trans-Neptunian Objects

    NASA Astrophysics Data System (ADS)

    Shankman, Cory; Kavelaars, J. J.; Bannister, Michele T.; Gladman, Brett J.; Lawler, Samantha M.; Chen, Ying-Tung; Jakubik, Marian; Kaib, Nathan; Alexandersen, Mike; Gwyn, Stephen D. J.; Petit, Jean-Marc; Volk, Kathryn

    2017-08-01

    The accumulating but small set of large semimajor axis trans-Neptunian objects (TNOs) shows an apparent clustering in the orientations of their orbits. This clustering must either be representative of the intrinsic distribution of these TNOs, or else have arisen as a result of observation biases and/or statistically expected variations for such a small set of detected objects. The clustered TNOs were detected across different and independent surveys, which has led to claims that the detections are therefore free of observational bias. This apparent clustering has led to the so-called “Planet 9” hypothesis that a super-Earth currently resides in the distant solar system and causes this clustering. The Outer Solar System Origins Survey (OSSOS) is a large program that ran on the Canada–France–Hawaii Telescope from 2013 to 2017, discovering more than 800 new TNOs. One of the primary design goals of OSSOS was the careful determination of observational biases that would manifest within the detected sample. We demonstrate the striking and non-intuitive biases that exist for the detection of TNOs with large semimajor axes. The eight large semimajor axis OSSOS detections are an independent data set, of comparable size to the conglomerate samples used in previous studies. We conclude that the orbital distribution of the OSSOS sample is consistent with being detected from a uniform underlying angular distribution.

  8. X-ray versus infrared selection of distant galaxy clusters: a case study using the XMM-LSS and SpARCS cluster samples

    NASA Astrophysics Data System (ADS)

    Willis, J. P.; Ramos-Ceja, M. E.; Muzzin, A.; Pacaud, F.; Yee, H. K. C.; Wilson, G.

    2018-07-01

    We present a comparison of two samples of z > 0.8 galaxy clusters selected using different wavelength-dependent techniques and examine the physical differences between them. We consider 18 clusters from the X-ray-selected XMM Large Scale Structure (LSS) distant cluster survey and 92 clusters from the optical-mid-infrared (MIR)-selected Spitzer Adaptation of the Red Sequence Cluster Survey (SpARCS). Both samples are selected from the same approximately 9 sq deg sky area and we examine them using common XMM-Newton, Spitzer Wide-Area Infrared Extragalactic (SWIRE) survey, and Canada-France-Hawaii Telescope Legacy Survey data. Clusters from each sample are compared employing aperture measures of X-ray and MIR emission. We divide the SpARCS distant cluster sample into three sub-samples: (i) X-ray bright, (ii) X-ray faint, MIR bright, and (iii) X-ray faint, MIR faint clusters. We determine that X-ray- and MIR-selected clusters display very similar surface brightness distributions of galaxy MIR light. In addition, the average location and amplitude of the galaxy red sequence as measured from stacked colour histograms is very similar in the X-ray- and MIR-selected samples. The sub-sample of X-ray faint, MIR bright clusters displays a distribution of brightest cluster galaxy-barycentre position offsets which extends to higher values than all other samples. This observation indicates that such clusters may exist in a more disturbed state compared to the majority of the distant cluster population sampled by XMM-LSS and SpARCS. This conclusion is supported by stacked X-ray images for the X-ray faint, MIR bright cluster sub-sample that display weak, centrally concentrated X-ray emission, consistent with a population of growing clusters accreting from an extended envelope of material.

  9. Design of partially supervised classifiers for multispectral image data

    NASA Technical Reports Server (NTRS)

    Jeon, Byeungwoo; Landgrebe, David

    1993-01-01

    A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.

  10. Two-Phase and Graph-Based Clustering Methods for Accurate and Efficient Segmentation of Large Mass Spectrometry Images.

    PubMed

    Dexter, Alex; Race, Alan M; Steven, Rory T; Barnes, Jennifer R; Hulme, Heather; Goodwin, Richard J A; Styles, Iain B; Bunch, Josephine

    2017-11-07

    Clustering is widely used in MSI to segment anatomical features and differentiate tissue types, but existing approaches are both CPU and memory-intensive, limiting their application to small, single data sets. We propose a new approach that uses a graph-based algorithm with a two-phase sampling method that overcomes this limitation. We demonstrate the algorithm on a range of sample types and show that it can segment anatomical features that are not identified using commonly employed algorithms in MSI, and we validate our results on synthetic MSI data. We show that the algorithm is robust to fluctuations in data quality by successfully clustering data with a designed-in variance using data acquired with varying laser fluence. Finally, we show that this method is capable of generating accurate segmentations of large MSI data sets acquired on the newest generation of MSI instruments and evaluate these results by comparison with histopathology.

  11. Impact of Sampling Density on the Extent of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2014-01-01

    Identifying and monitoring HIV clusters could be useful in tracking the leading edge of HIV transmission in epidemics. Currently, greater specificity in the definition of HIV clusters is needed to reduce confusion in the interpretation of HIV clustering results. We address sampling density as one of the key aspects of HIV cluster analysis. The proportion of viral sequences in clusters was estimated at sampling densities from 1.0% to 70%. A set of 1,248 HIV-1C env gp120 V1C5 sequences from a single community in Botswana was utilized in simulation studies. Matching numbers of HIV-1C V1C5 sequences from the LANL HIV Database were used as comparators. HIV clusters were identified by phylogenetic inference under bootstrapped maximum likelihood and pairwise distance cut-offs. Sampling density below 10% was associated with stochastic HIV clustering with broad confidence intervals. HIV clustering increased linearly at sampling density >10%, and was accompanied by narrowing confidence intervals. Patterns of HIV clustering were similar at bootstrap thresholds 0.7 to 1.0, but the extent of HIV clustering decreased with higher bootstrap thresholds. The origin of sampling (local concentrated vs. scattered global) had a substantial impact on HIV clustering at sampling densities ≥10%. Pairwise distances at 10% were estimated as a threshold for cluster analysis of HIV-1 V1C5 sequences. The node bootstrap support distribution provided additional evidence for 10% sampling density as the threshold for HIV cluster analysis. The detectability of HIV clusters is substantially affected by sampling density. A minimal genotyping density of 10% and sampling density of 50–70% are suggested for HIV-1 V1C5 cluster analysis. PMID:25275430
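The pairwise-distance criterion used here amounts to single-linkage clustering at a fixed genetic-distance cutoff: any two sequences within the cutoff join the same cluster. A minimal sketch with toy distances (hypothetical, not real HIV-1C V1C5 data), using union-find:

```python
def clusters_at_cutoff(dist, n, cutoff):
    """Single-linkage clusters: sequences i, j join a cluster whenever
    their pairwise distance is at or below the cutoff (union-find)."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for (i, j), dij in dist.items():
        if dij <= cutoff:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) > 1]  # clusters of >= 2

# Toy pairwise distances for five sequences (illustrative values only)
d = {(0, 1): 0.05, (0, 2): 0.08, (1, 2): 0.06, (3, 4): 0.30,
     (0, 3): 0.40, (1, 4): 0.45}
print(clusters_at_cutoff(d, 5, cutoff=0.10))  # one cluster: [[0, 1, 2]]
```

Raising the cutoff merges more sequences into clusters, which is one reason the apparent extent of clustering is so sensitive to the chosen threshold and to sampling density.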

  12. How Professionalized Is College Teaching? Norms and the Ideal of Service. ASHE Annual Meeting Paper.

    ERIC Educational Resources Information Center

    Braxton, John M.; Bayer, Alan E.

    This study examined the behavioral expectations and norms for college and university faculty particularly whether they varied with respect to the level of commitment to teaching at different institutions and in different disciplines. A cluster sampling design was used to select a random sample of the population of faculty in biology, history,…

  13. How Much Videos Win over Audios in Listening Instruction for EFL Learners

    ERIC Educational Resources Information Center

    Yasin, Burhanuddin; Mustafa, Faisal; Permatasari, Rizki

    2017-01-01

    This study aims at comparing the benefits of using videos instead of audios for improving students' listening skills. This experimental study used a pre-test and post-test control group design. The sample, selected by cluster random sampling resulted in the selection of 32 second year high school students for each group. The instruments used were…

  14. A practical Bayesian stepped wedge design for community-based cluster-randomized clinical trials: The British Columbia Telehealth Trial.

    PubMed

    Cunanan, Kristen M; Carlin, Bradley P; Peterson, Kevin A

    2016-12-01

    Many clinical trial designs are impractical for community-based clinical intervention trials. Stepped wedge trial designs provide practical advantages, but few descriptions exist of their clinical implementational features, statistical design efficiencies, and limitations. Enhance efficiency of stepped wedge trial designs by evaluating the impact of design characteristics on statistical power for the British Columbia Telehealth Trial. The British Columbia Telehealth Trial is a community-based, cluster-randomized, controlled clinical trial in rural and urban British Columbia. To determine the effect of an Internet-based telehealth intervention on healthcare utilization, 1000 subjects with an existing diagnosis of congestive heart failure or type 2 diabetes will be enrolled from 50 clinical practices. Hospital utilization is measured using a composite of disease-specific hospital admissions and emergency visits. The intervention comprises online telehealth data collection and counseling provided to support a disease-specific action plan developed by the primary care provider. The planned intervention is sequentially introduced across all participating practices. We adopt a fully Bayesian, Markov chain Monte Carlo-driven statistical approach, wherein we use simulation to determine the effect of cluster size, sample size, and crossover interval choice on type I error and power to evaluate differences in hospital utilization. For our Bayesian stepped wedge trial design, simulations suggest moderate decreases in power when crossover intervals from control to intervention are reduced from every 3 to 2 weeks, and dramatic decreases in power as the numbers of clusters decrease. Power and type I error performance were not notably affected by the addition of nonzero cluster effects or a temporal trend in hospitalization intensity. 
Stepped wedge trial designs that intervene in small clusters across longer periods can provide enhanced power to evaluate comparative effectiveness, while offering practical implementation advantages in geographic stratification, temporal change, use of existing data, and resource distribution. Current population estimates were used; however, models may not reflect actual event rates during the trial. In addition, temporal or spatial heterogeneity can bias treatment effect estimates. © The Author(s) 2016.
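The kind of power simulation described can be illustrated with a simplified frequentist analogue. This is not the trial's Bayesian Markov chain Monte Carlo model: it assumes equally sized crossover waves, a continuous outcome, and a naive OLS analysis with step fixed effects (which understates the standard error by ignoring within-cluster correlation); all parameter values are made up for illustration.

```python
import numpy as np

def sw_power(n_clusters=12, n_steps=4, m=20, effect=0.5, sd_c=0.3,
             sd_e=1.0, n_sims=200, seed=1):
    """Monte Carlo power for a stepped wedge design: clusters cross from
    control to intervention in equal waves, one wave per step."""
    rng = np.random.default_rng(seed)
    periods = n_steps + 1
    # Crossover period for each cluster: wave 1 crosses at t=1, etc.
    cross = np.repeat(np.arange(1, n_steps + 1), n_clusters // n_steps)
    hits = 0
    for _ in range(n_sims):
        rows, y = [], []
        u = rng.normal(0.0, sd_c, n_clusters)   # random cluster effects
        for c in range(n_clusters):
            for t in range(periods):
                trt = 1.0 if t >= cross[c] else 0.0
                for _ in range(m):
                    rows.append(np.concatenate(([trt], np.eye(periods)[t])))
                    y.append(trt * effect + u[c] + rng.normal(0.0, sd_e))
        X, yv = np.array(rows), np.array(y)
        beta, res, *_ = np.linalg.lstsq(X, yv, rcond=None)
        sigma2 = res[0] / (len(yv) - X.shape[1])
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
        hits += abs(beta[0]) / se > 1.96
    return hits / n_sims

print(sw_power(n_sims=100))
```

Shrinking `n_clusters` or `n_steps` in this sketch reproduces the qualitative finding that power falls sharply as the number of clusters decreases.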

  15. Assessing map accuracy in a remotely sensed, ecoregion-scale cover map

    USGS Publications Warehouse

    Edwards, T.C.; Moisen, Gretchen G.; Cutler, D.R.

    1998-01-01

    Landscape- and ecoregion-based conservation efforts increasingly use a spatial component to organize data for analysis and interpretation. A challenge particular to remotely sensed cover maps generated from these efforts is how best to assess the accuracy of the cover maps, especially when they can exceed 1000s of km² in size. Here we develop and describe a methodological approach for assessing the accuracy of large-area cover maps, using as a test case the 21.9 million ha cover map developed for Utah Gap Analysis. As part of our design process, we first reviewed the effect of intracluster correlation and a simple cost function on the relative efficiency of cluster sample designs to simple random designs. Our design ultimately combined clustered and subsampled field data stratified by ecological modeling unit and accessibility (hereafter a mixed design). We next outline estimation formulas for simple map accuracy measures under our mixed design and report results for eight major cover types and the three ecoregions mapped as part of the Utah Gap Analysis. Overall accuracy of the map was 83.2% (SE=1.4). Within ecoregions, accuracy ranged from 78.9% to 85.0%. Accuracy by cover type varied, ranging from a low of 50.4% for barren to a high of 90.6% for man-modified. In addition, we examined gains in efficiency of our mixed design compared with a simple random sample approach. In regard to precision, our mixed design was more precise than a simple random design, given fixed sample costs. We close with a discussion of the logistical constraints facing attempts to assess the accuracy of large-area, remotely sensed cover maps.
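For overall-accuracy figures like those quoted, the stratified estimator has a simple closed form: weight each stratum's observed proportion correct by its share of the map. A sketch with hypothetical strata, not the actual Utah Gap Analysis numbers:

```python
import math

def stratified_accuracy(strata):
    """Overall map accuracy and its standard error under stratified random
    sampling. Each stratum contributes (W, n, p): area share W, sample
    size n, and observed proportion of correctly classified sites p."""
    acc = sum(W * p for W, n, p in strata)
    var = sum(W**2 * p * (1 - p) / (n - 1) for W, n, p in strata)
    return acc, math.sqrt(var)

# Hypothetical strata: (area share, sample size, proportion correct)
strata = [(0.5, 100, 0.90), (0.3, 80, 0.80), (0.2, 60, 0.70)]
acc, se = stratified_accuracy(strata)
print(round(acc, 3), round(se, 3))
```

Clustered or mixed designs like the one in this record would further inflate the variance term by an amount depending on the intracluster correlation.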

  16. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    PubMed Central

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. 
Interim recommendations include avoidance of cluster merges where possible, discontinuation of clusters following heterogeneous merges, allowance for potential loss of clusters and additional variability in cluster size in the original sample size calculation, and use of appropriate ICC estimates that reflect cluster size. PMID:24884591
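The interaction between cluster-size variability and power noted above can be illustrated with the standard design effect for unequal cluster sizes, DEFF = 1 + ((cv^2 + 1) * m_bar - 1) * ICC, where m_bar is the mean cluster size and cv its coefficient of variation. The sketch below is a simplified illustration (the formula and numbers are assumptions, not the paper's simulation code): merging two clusters reduces the effective sample size because it both shrinks the number of clusters and inflates cluster-size variability.

```python
# Sketch (not the paper's code): effective sample size under cluster
# merging, using the design effect for variable cluster sizes
# DEFF = 1 + ((cv^2 + 1) * m_bar - 1) * ICC
from statistics import mean, pstdev

def effective_n(cluster_sizes, icc):
    """Total n deflated by the design effect implied by the cluster sizes."""
    m_bar = mean(cluster_sizes)
    cv = pstdev(cluster_sizes) / m_bar  # coefficient of variation of size
    deff = 1 + ((cv ** 2 + 1) * m_bar - 1) * icc
    return sum(cluster_sizes) / deff

sizes = [20] * 10                 # ten equal clusters of 20
merged = [20] * 8 + [40]          # two clusters merged into one of 40
print(effective_n(sizes, 0.05))   # larger effective n
print(effective_n(merged, 0.05))  # smaller: fewer, more variable clusters
```

The total number of participants is identical in both scenarios; only the clustering structure changes, which is exactly the mechanism for the power loss described in the abstract.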

  17. A comparison of adaptive sampling designs and binary spatial models: A simulation study using a census of Bromus inermis

    USGS Publications Warehouse

    Irvine, Kathryn M.; Thornton, Jamie; Backus, Vickie M.; Hohmann, Matthew G.; Lehnhoff, Erik A.; Maxwell, Bruce D.; Michels, Kurt; Rew, Lisa

    2013-01-01

Commonly in environmental and ecological studies, species distribution data are recorded as presence or absence throughout a spatial domain of interest. Field-based studies typically collect observations by sampling a subset of the spatial domain. We consider the effects of six different adaptive and two non-adaptive sampling designs and the choice of three binary models on both predictions to unsampled locations and parameter estimation of the regression coefficients (species–environment relationships). Our simulation study is unique compared to others to date in that we virtually sample a true known spatial distribution of a nonindigenous plant species, Bromus inermis. The census of B. inermis provides a good example of a species distribution that is both sparsely (1.9% prevalence) and patchily distributed. We find that modeling the spatial correlation using a random effect with an intrinsic Gaussian conditionally autoregressive prior distribution was equivalent or superior to Bayesian autologistic regression in terms of predicting to unsampled areas when strip adaptive cluster sampling was used to survey B. inermis. However, inferences about the relationships between B. inermis presence and environmental predictors differed between the two spatial binary models. The strip adaptive cluster designs we investigated provided a significant advantage in terms of Markov chain Monte Carlo chain convergence when trying to model a sparsely distributed species across a large area. In general, there was little difference in the choice of neighborhood, although the adaptive king neighborhood was preferred when transects were randomly placed throughout the spatial domain.
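The core mechanic of adaptive cluster sampling can be sketched in a few lines. This is a simplified version with a simple random initial sample and rook adjacency, not the strip designs evaluated in the paper, and the toy grid is an assumption: whenever a sampled cell contains the species, its neighbours are sampled too, so sampling effort concentrates on occupied patches.

```python
# Sketch (simplified, not the paper's strip designs): adaptive cluster
# sampling on a grid. Start from a simple random sample of cells; when a
# sampled cell contains the species, also sample its 4 rook neighbours,
# repeating until no occupied cell has an unsampled neighbour.
import random

def adaptive_cluster_sample(presence, n_initial, rng):
    rows, cols = len(presence), len(presence[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    frontier = rng.sample(cells, n_initial)  # initial simple random sample
    sampled = set()
    while frontier:
        r, c = frontier.pop()
        if (r, c) in sampled:
            continue
        sampled.add((r, c))
        if presence[r][c]:  # occupied: adaptively add rook neighbours
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in sampled:
                    frontier.append((nr, nc))
    return sampled

# Tiny patchy "census": a single 2x2 occupied patch in a 6x6 domain
grid = [[0] * 6 for _ in range(6)]
for r, c in [(2, 2), (2, 3), (3, 2), (3, 3)]:
    grid[r][c] = 1
units = adaptive_cluster_sample(grid, n_initial=5, rng=random.Random(1))
print(len(units))
```

If the initial sample hits the patch, the whole patch (plus its edge cells) is swept in, which is what makes the design efficient for sparse, patchy species like B. inermis.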

  18. Task shifting of frontline community health workers for cardiovascular risk reduction: design and rationale of a cluster randomised controlled trial (DISHA study) in India.

    PubMed

    Jeemon, Panniyammakal; Narayanan, Gitanjali; Kondal, Dimple; Kahol, Kashvi; Bharadwaj, Ashok; Purty, Anil; Negi, Prakash; Ladhani, Sulaiman; Sanghvi, Jyoti; Singh, Kuldeep; Kapoor, Deksha; Sobti, Nidhi; Lall, Dorothy; Manimunda, Sathyaprakash; Dwivedi, Supriya; Toteja, Gurudyal; Prabhakaran, Dorairaj

    2016-03-15

Effective task-shifting interventions targeted at reducing the global cardiovascular disease (CVD) epidemic in low- and middle-income countries (LMICs) are urgently needed. DISHA is a cluster randomised controlled trial conducted across 10 sites (5 in phase 1 and 5 in phase 2) in India, comprising 120 clusters. At each site, 12 clusters were randomly selected from a district. A cluster is defined as a small village with 250-300 households and well-defined geographical boundaries. The clusters were then randomly allocated to intervention and control in a 1:1 allocation sequence. If any of the intervention and control clusters were <10 km apart, one was dropped and replaced with another randomly selected cluster from the same district. The study included a representative baseline cross-sectional survey, development of a structured intervention model, delivery of the intervention for a minimum period of 18 months by trained frontline health workers (mainly Anganwadi workers and ASHA workers) and a post-intervention survey in a representative sample. The study staff had no information on intervention allocation until the completion of the baseline survey. To ensure comparability of data across sites, the DISHA study follows a common protocol and manual of operation with standardized measurement techniques. Our study is the largest community-based cluster randomised trial in low- and middle-income country settings designed to test the effectiveness of 'task shifting' interventions involving frontline health workers for cardiovascular risk reduction. CTRI/2013/10/004049. Registered 7 October 2013.
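The 1:1 cluster allocation step described above can be sketched as follows. The village identifiers are hypothetical, and this shows only the shuffle-and-split allocation, not the distance-based replacement rule:

```python
# Sketch of a 1:1 cluster allocation (village IDs are hypothetical):
# shuffle the 12 selected clusters at a site and assign half to each arm.
import random

def allocate_clusters(cluster_ids, rng):
    ids = list(cluster_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"intervention": ids[:half], "control": ids[half:]}

arms = allocate_clusters([f"village-{i:02d}" for i in range(1, 13)],
                         rng=random.Random(2013))
print(len(arms["intervention"]), len(arms["control"]))  # 6 6
```

In the actual trial, any intervention-control pair closer than 10 km would additionally trigger replacement of one cluster to limit contamination.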

  19. Dietary Supplement Use Among U.S. Adults Has Increased Since NHANES III (1988-1994)

    MedlinePlus

    ... uses a complex, stratified, multistage probability cluster sampling design and oversamples in order to increase precision in estimates for certain groups. NHANES III was one in a series of periodic surveys conducted in two cycles during ...

  20. Forecasting the brittle failure of heterogeneous, porous geomaterials

    NASA Astrophysics Data System (ADS)

    Vasseur, Jérémie; Wadsworth, Fabian; Heap, Michael; Main, Ian; Lavallée, Yan; Dingwell, Donald

    2017-04-01

Heterogeneity develops in magmas during ascent and is dominated by the development of crystal and, importantly, bubble populations or pore-network clusters, which grow, interact, localize, coalesce, outgas and resorb. Pore-scale heterogeneity is also ubiquitous in sedimentary basin fill during diagenesis. As a first step, we construct numerical simulations in 3D in which randomly generated heterogeneous and polydisperse spheres are placed in volumes and are permitted to overlap with one another, designed to represent the random growth and interaction of bubbles in a liquid volume. We use these simulated geometries to show that statistical predictions of the inter-bubble lengthscales and evolving bubble surface area or cluster densities can be made based on fundamental percolation theory. As a second step, we take a range of well constrained random heterogeneous rock samples including sandstones, andesites, synthetic partially sintered glass bead samples, and intact glass samples and subject them to a variety of stress loading conditions at a range of temperatures until failure. We record in real time the evolution of the number of acoustic events that precede failure and show that in all scenarios, the acoustic event rate accelerates toward failure, consistent with previous findings. Applying tools designed to forecast the failure time based on these precursory signals, we constrain the absolute error on the forecast time. We find that for all sample types, the error associated with an accurate forecast of failure scales non-linearly with the lengthscale between the pore clusters in the material. Moreover, using a simple micromechanical model for the deformation of porous elastic bodies, we show that the ratio between the equilibrium sub-critical crack length emanating from the pore clusters relative to the inter-pore lengthscale, provides a scaling for the error on forecast accuracy.
Thus for the first time we provide a potential quantitative correction for forecasting the failure of porous brittle solids that build the Earth's crust.
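A common forecasting tool of the kind referred to above is the inverse event-rate method: if the precursory event rate accelerates hyperbolically toward failure, 1/rate decays linearly in time, and extrapolating a least-squares line to 1/rate = 0 forecasts the failure time. This is a standard approach, assumed here for illustration, not necessarily the authors' exact tool:

```python
# Sketch (a standard inverse-rate forecast, assumed for illustration):
# fit a least-squares line to 1/rate versus time and extrapolate to
# 1/rate = 0 to estimate the failure time.
def forecast_failure_time(times, rates):
    inv = [1.0 / r for r in rates]
    n = len(times)
    tbar = sum(times) / n
    ybar = sum(inv) / n
    slope = (sum((t - tbar) * (y - ybar) for t, y in zip(times, inv))
             / sum((t - tbar) ** 2 for t in times))
    intercept = ybar - slope * tbar
    return -intercept / slope  # time at which 1/rate reaches zero

# Synthetic run with true failure at t = 100: rate = 1 / (100 - t)
ts = [10, 30, 50, 70, 90]
rs = [1.0 / (100 - t) for t in ts]
print(forecast_failure_time(ts, rs))  # ~ 100
```

The paper's contribution is then a scaling for the *error* of such forecasts in terms of the inter-pore-cluster lengthscale.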

  1. Private Universities in Kenya Seek Alternative Ways to Manage Change in Teacher Education Curriculum in Compliance with the Commission for University Education Reforms

    ERIC Educational Resources Information Center

    Amimo, Catherine Adhiambo

    2016-01-01

    This study investigated management of change in teacher education curriculum in Private universities in Kenya. The study employed a concurrent mixed methods design that is based on the use of both quantitative and qualitative approaches. A multi-stage sampling process which included purposive, convenience, cluster, and snowball sampling methods…

  2. Accounting for multiple births in randomised trials: a systematic review.

    PubMed

    Yelland, Lisa Nicole; Sullivan, Thomas Richard; Makrides, Maria

    2015-03-01

    Multiple births are an important subgroup to consider in trials aimed at reducing preterm birth or its consequences. Including multiples results in a unique mixture of independent and clustered data, which has implications for the design, analysis and reporting of the trial. We aimed to determine how multiple births were taken into account in the design and analysis of recent trials involving preterm infants, and whether key information relevant to multiple births was reported. We conducted a systematic review of multicentre randomised trials involving preterm infants published between 2008 and 2013. Information relevant to multiple births was extracted. Of the 56 trials included in the review, 6 (11%) excluded multiples and 24 (43%) failed to indicate whether multiples were included. Among the 26 trials that reported multiples were included, only one (4%) accounted for clustering in the sample size calculations and eight (31%) took the clustering into account in the analysis of the primary outcome. Of the 20 trials that randomised infants, 12 (60%) failed to report how infants from the same birth were randomised. Information on multiple births is often poorly reported in trials involving preterm infants, and clustering due to multiple births is rarely taken into account. Since ignoring clustering could result in inappropriate recommendations for clinical practice, clustering should be taken into account in the design and analysis of future neonatal and perinatal trials including infants from a multiple birth. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
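The sample size implication of ignoring this clustering can be sketched with the usual design effect, 1 + (m - 1) * ICC, applied with the average infants-per-birth implied by the twin proportion. The numbers below are illustrative assumptions, not values from the review:

```python
# Sketch (illustrative numbers, not from the review): inflating a
# sample size for clustering due to multiple births, using the design
# effect 1 + (m_bar - 1) * icc with m_bar = average infants per birth.
def inflated_n(n_independent, twin_proportion, icc):
    m_bar = 1 + twin_proportion  # e.g. 20% twin births -> 1.2 infants/birth
    deff = 1 + (m_bar - 1) * icc
    return round(n_independent * deff)

print(inflated_n(300, twin_proportion=0.20, icc=0.7))  # 342
```

Even a modest twin proportion with the high within-birth correlation typical of neonatal outcomes produces a non-trivial inflation, which is why ignoring it in the sample size calculation risks an underpowered trial.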

  3. Variability in body size and shape of UK offshore workers: A cluster analysis approach.

    PubMed

    Stewart, Arthur; Ledingham, Robert; Williams, Hector

    2017-01-01

Male UK offshore workers have enlarged dimensions compared with UK norms, and knowledge of the specific sizes and shapes typifying their physiques will assist a range of functions related to health and ergonomics. A representative sample of the UK offshore workforce (n = 588) underwent 3D photonic scanning, from which 19 extracted dimensional measures were used in k-means cluster analysis to characterise physique groups. Of the 11 resulting clusters, four somatotype groups were expressed: one cluster was muscular and lean, four had greater muscularity than adiposity, three had equal adiposity and muscularity and three had greater adiposity than muscularity. Some clusters appeared constitutionally similar to others, differing only in absolute size. These cluster centroids represent an evidence base for future designs in apparel and other applications where body size and proportions affect functional performance. They also constitute phenotypic evidence providing insight into the 'offshore culture' which may underpin the enlarged dimensions of offshore workers. Copyright © 2016 Elsevier Ltd. All rights reserved.
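The k-means procedure behind such a physique analysis is easily sketched. The two synthetic dimensions and k = 2 below are illustrative stand-ins for the study's 19 scanned measures and 11 clusters:

```python
# Minimal k-means sketch on synthetic 2-D "physique" data (stature,
# girth); the data and k are illustrative, not the paper's.
import random

def kmeans(points, k, rng, iters=50):
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest centroid
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[i])))
            groups[j].append(p)
        centroids = [  # recompute centroids; keep old one if group empty
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

rng = random.Random(0)
lean = [(170 + rng.gauss(0, 2), 85 + rng.gauss(0, 2)) for _ in range(30)]
stout = [(175 + rng.gauss(0, 2), 110 + rng.gauss(0, 2)) for _ in range(30)]
cents, groups = kmeans(lean + stout, k=2, rng=rng)
print(sorted(len(g) for g in groups))
```

The resulting centroids play the role of the "cluster centroids" the abstract proposes as an evidence base for apparel design.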

  4. A Database of Young Star Clusters for Five Hundred Galaxies

    NASA Astrophysics Data System (ADS)

    Evans, Jessica; Whitmore, B. C.; Lindsay, K.; Chandar, R.; Larsen, S.

    2009-01-01

    The study of young massive stellar clusters has faced a series of observational challenges, such as the use of inconsistent data sets and low number statistics. To rectify these shortcomings, this project will use the source lists developed as part of the Hubble Legacy Archive to obtain a large, uniform database of super star clusters in nearby star-forming galaxies in order to address two fundamental astronomical questions: 1) To what degree is the cluster luminosity (and mass) function of star clusters universal? 2) What fraction of super star clusters are "missing" in optical studies (i.e., are hidden by dust)? The archive's recent data release (Data Release 2 - September, 2008) will help us achieve the large sample necessary (N 50 galaxies for multi-wavelength, N 500 galaxies for ACS F814W). The uniform data set will comprise of ACS, WFPC2, and NICMOS data, with DAOphot used for object detection. This database will also support comparisons with new Monte-Carlo simulations that have independently been developed in the past few years, and will be used to test the Whitmore, Chandar, Fall (2007) framework designed to understand the demographics of star clusters in all star forming galaxies. The catalogs will increase the number of galaxies with measured mass and luminosity functions by an order of magnitude, and will provide a powerful new tool for comparative studies, both ours and the community's. The poster will describe our preliminary investigation for the first 30 galaxies in the sample.

  5. PCR detection of uncultured rumen bacteria.

    PubMed

    Rosero, Jaime A; Strosová, Lenka; Mrázek, Jakub; Fliegerová, Kateřina; Kopečný, Jan

    2012-07-01

16S rRNA sequences of ruminal uncultured bacterial clones from public databases were phylogenetically examined. The sequences were found to form two unique clusters not affiliated with any known bacterial species: a cluster of unidentified sequences of free-floating rumen fluid uncultured bacteria (FUB) and a cluster of unidentified sequences of bacteria associated with rumen epithelium (AUB). A set of PCR primers targeting the 16S rRNA of ruminal free uncultured bacteria and rumen epithelium-adhering uncultured bacteria was designed based on these sequences. FUB primers were used for relative quantification of uncultured bacteria in ovine rumen samples. The effort to increase the population size of the FUB group was successful in sulfate-reducing broth and in culture media supplemented with cellulose.

  6. Micro-scale Spatial Clustering of Cholera Risk Factors in Urban Bangladesh.

    PubMed

    Bi, Qifang; Azman, Andrew S; Satter, Syed Moinuddin; Khan, Azharul Islam; Ahmed, Dilruba; Riaj, Altaf Ahmed; Gurley, Emily S; Lessler, Justin

    2016-02-01

Close interpersonal contact likely drives spatial clustering of cases of cholera and diarrhea, but spatial clustering of risk factors may also drive this pattern. Few studies have focused specifically on how exposures for disease cluster at small spatial scales. Improving our understanding of the micro-scale clustering of risk factors for cholera may help to target interventions and power studies with cluster designs. We selected sets of spatially matched households (matched-sets) near cholera case households between April and October 2013 in a cholera-endemic urban neighborhood of Tongi Township in Bangladesh. We collected data on exposures to suspected cholera risk factors at the household and individual level. We used intra-class correlation coefficients (ICCs) to characterize clustering of exposures within matched-sets and households, and assessed whether clustering depended on the geographical extent of the matched-sets. Clustering over larger spatial scales was explored by assessing the relationship between matched-sets. We also explored whether different exposures tended to appear together in individuals, households, and matched-sets. Household-level exposures, including drinking municipally supplied water (ICC = 0.97, 95%CI = 0.96, 0.98), type of latrine (ICC = 0.88, 95%CI = 0.71, 1.00), and intermittent access to drinking water (ICC = 0.96, 95%CI = 0.87, 1.00), exhibited strong clustering within matched-sets. As the geographic extent of matched-sets increased, the concordance of exposures within matched-sets decreased. Concordance between matched-sets of exposures related to water supply was elevated at distances of up to approximately 400 meters. Household-level hygiene practices were correlated with infrastructure shown to increase cholera risk. Co-occurrence of different individual-level exposures appeared to mostly reflect the differing domestic roles of study participants.
Strong spatial clustering of exposures at a small spatial scale in a cholera endemic population suggests a possible role for highly targeted interventions. Studies with cluster designs in areas with strong spatial clustering of exposures should increase sample size to account for the correlation of these exposures.
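The ICCs reported above can be estimated with the one-way ANOVA estimator. The sketch below applies it to synthetic binary household exposures grouped into matched-sets (the data are illustrative, not the study's):

```python
# Sketch (synthetic data): one-way ANOVA estimator of the ICC for a
# binary exposure measured on households grouped into matched-sets.
def anova_icc(groups):
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    m0 = (n - sum(len(g) ** 2 for g in groups) / n) / (k - 1)  # avg group size
    return (ms_between - ms_within) / (ms_between + (m0 - 1) * ms_within)

concordant = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
mixed = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
print(anova_icc(concordant))  # → 1.0: exposure fully clusters within sets
print(anova_icc(mixed))       # negative: exposures discordant within sets
```

An ICC near 1, as observed for municipal water supply, means households within a matched-set are nearly interchangeable on that exposure, which sharply inflates the required sample size of a cluster-designed study.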

  7. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    PubMed

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations of multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially, the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data were then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high- and low-density clusters stratified by Annual Biting Rates (ABR). These data were overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS.
Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process in AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual-adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established, while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l. habitats based on spatiotemporal field-sampled count data.
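The Durbin-Watson statistic used above for residual autocorrelation is d = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 indicate no first-order autocorrelation, values well below 2 indicate positive autocorrelation, and values well above 2 indicate negative autocorrelation. A minimal sketch on made-up residual series:

```python
# Sketch: Durbin-Watson statistic on illustrative residual series.
# d near 2 -> no first-order autocorrelation; d << 2 -> positive;
# d >> 2 -> negative autocorrelation.
def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    return num / sum(e ** 2 for e in residuals)

alternating = [1, -1, 1, -1, 1, -1]  # sign-flipping: negative autocorrelation
trending = [1, 1, 1, -1, -1, -1]     # runs of same sign: positive
print(durbin_watson(alternating))    # well above 2
print(durbin_watson(trending))       # well below 2
```

In the study this test motivates the autoregressive error treatment in AUTOREG rather than assuming independent residuals.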

  8. Deconvoluting simulated metagenomes: the performance of hard- and soft- clustering algorithms applied to metagenomic chromosome conformation capture (3C)

    PubMed Central

    DeMaere, Matthew Z.

    2016-01-01

Background Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios, however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development. PMID:27843713
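The B-cubed validation measure referenced above can be sketched in its standard hard-clustering form (the paper uses an adaptation of the *extended* measure, which also handles soft assignments; this simplified version and the toy labels are illustrative): for each item, precision is the fraction of its cluster sharing its true label, and recall is the fraction of its label group placed in its cluster.

```python
# Sketch: standard (hard-clustering) B-cubed precision and recall,
# the measure the paper's extended variant generalises.
def bcubed(assignment, truth):
    n = len(assignment)
    precision = recall = 0.0
    for i in range(n):
        same_cluster = [j for j in range(n) if assignment[j] == assignment[i]]
        same_label = [j for j in range(n) if truth[j] == truth[i]]
        correct = sum(1 for j in same_cluster if truth[j] == truth[i])
        precision += correct / len(same_cluster)
        recall += correct / len(same_label)
    return precision / n, recall / n

truth = ["A", "A", "A", "B", "B", "B"]   # true genome of each contig
perfect = [0, 0, 0, 1, 1, 1]             # ideal binning
lumped = [0, 0, 0, 0, 0, 0]              # everything in one cluster
print(bcubed(perfect, truth))  # (1.0, 1.0)
print(bcubed(lumped, truth))   # low precision, perfect recall
```

Lumping two strains into one bin preserves recall but destroys precision, which is exactly the failure mode the simulations probe at >95% sequence identity.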

  9. A Bayesian hierarchical model for mortality data from cluster-sampling household surveys in humanitarian crises.

    PubMed

    Heudtlass, Peter; Guha-Sapir, Debarati; Speybroeck, Niko

    2018-05-31

The crude death rate (CDR) is one of the defining indicators of humanitarian emergencies. When data from vital registration systems are not available, it is common practice to estimate the CDR from household surveys with cluster-sampling design. However, sample sizes are often too small to compare mortality estimates to emergency thresholds, at least in a frequentist framework. Several authors have proposed Bayesian methods for health surveys in humanitarian crises. Here, we develop an approach specifically for mortality data and cluster-sampling surveys. We describe a Bayesian hierarchical Poisson-Gamma mixture model with generic (weakly informative) priors that could be used as a default in the absence of any specific prior knowledge, and compare Bayesian and frequentist CDR estimates using five different mortality datasets. We provide an interpretation of the Bayesian estimates in the context of an emergency threshold and demonstrate how to interpret parameters at the cluster level and ways in which informative priors can be introduced. With the same set of weakly informative priors, Bayesian CDR estimates are equivalent to frequentist estimates, for all practical purposes. The probability that the CDR surpasses the emergency threshold can be derived directly from the posterior of the mean of the mixing distribution. All observations in the datasets contribute to the estimation of cluster-level estimates, through the hierarchical structure of the model. In a context of sparse data, Bayesian mortality assessments have advantages over frequentist ones even when using only weakly informative priors. More informative priors offer a formal and transparent way of combining new data with existing data and expert knowledge and can help to improve decision-making in humanitarian crises by complementing frequentist estimates.
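The threshold probability described above can be illustrated with a collapsed (non-hierarchical) Gamma-Poisson model, which is a simplification of the paper's hierarchical mixture; the prior parameters and data below are assumptions. With deaths ~ Poisson(rate × exposure) and a Gamma(a, b) prior on the rate per 10,000 person-days, the posterior is Gamma(a + deaths, b + exposure/10,000), and P(CDR > 1 per 10,000 per day) can be read off by Monte Carlo:

```python
# Sketch (collapsed Gamma-Poisson, not the paper's full hierarchical
# model; priors and data are illustrative): posterior probability that
# the CDR exceeds the emergency threshold of 1 death/10,000/day.
import random

def prob_cdr_exceeds(deaths, person_days, threshold=1.0,
                     a=0.5, b=0.5, draws=100_000, seed=0):
    rng = random.Random(seed)
    shape = a + deaths
    rate = b + person_days / 10_000  # exposure in units of 10,000 person-days
    exceed = sum(rng.gammavariate(shape, 1.0 / rate) > threshold
                 for _ in range(draws))
    return exceed / draws

# 40 deaths over ~300,000 person-days -> CDR around 1.3 per 10,000/day
print(prob_cdr_exceeds(deaths=40, person_days=300_000))
```

Reporting this exceedance probability directly, rather than a point estimate with a wide confidence interval, is the practical advantage the abstract highlights for sparse survey data.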

  10. Incorporating the sampling design in weighting adjustments for panel attrition

    PubMed Central

    Chen, Qixuan; Gelman, Andrew; Tracy, Melissa; Norris, Fran H.; Galea, Sandro

    2015-01-01

We review weighting adjustment methods for panel attrition and suggest approaches for incorporating design variables, such as strata, clusters and baseline sample weights. Design information can typically be included in attrition analysis using multilevel models or decision tree methods such as the CHAID algorithm. We use simulation to show that these weighting approaches can effectively reduce bias in the survey estimates that would occur from omitting the effect of design factors on attrition while keeping the resulting weights stable. We provide a step-by-step illustration of creating weighting adjustments for panel attrition in the Galveston Bay Recovery Study, a survey of residents in a community following a disaster, and provide suggestions to analysts in decision making about weighting approaches. PMID:26239405
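A minimal version of such an adjustment divides each respondent's baseline weight by the weighted response rate of their design cell, so respondents stand in for attriters from the same cell. The field names and data below are hypothetical, and real adjustments would use modeled response propensities (e.g. multilevel or CHAID cells) rather than a single stratum variable:

```python
# Sketch (hypothetical fields/data): cell-based attrition adjustment.
# Within each design cell (here, a stratum), divide the baseline weight
# of each respondent by the cell's weighted response rate.
def attrition_adjust(records):
    by_stratum = {}
    for r in records:
        by_stratum.setdefault(r["stratum"], []).append(r)
    adjusted = []
    for members in by_stratum.values():
        total = sum(r["base_weight"] for r in members)
        kept = sum(r["base_weight"] for r in members if r["responded"])
        rate = kept / total  # weighted response rate in this design cell
        for r in members:
            if r["responded"]:
                adjusted.append({**r, "weight": r["base_weight"] / rate})
    return adjusted

records = [
    {"stratum": "urban", "base_weight": 2.0, "responded": True},
    {"stratum": "urban", "base_weight": 2.0, "responded": False},
    {"stratum": "rural", "base_weight": 1.0, "responded": True},
    {"stratum": "rural", "base_weight": 1.0, "responded": True},
]
adjusted = attrition_adjust(records)
print(sum(r["weight"] for r in adjusted))  # 6.0: total weight preserved
```

The check that the adjusted weights sum to the original total within each cell is one simple stability diagnostic of the kind the paper's step-by-step illustration walks through.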

  11. Evaluation of Nine Consensus Indices in Delphi Foresight Research and Their Dependency on Delphi Survey Characteristics: A Simulation Study and Debate on Delphi Design and Interpretation.

    PubMed

    Birko, Stanislav; Dove, Edward S; Özdemir, Vural

    2015-01-01

    The extent of consensus (or the lack thereof) among experts in emerging fields of innovation can serve as antecedents of scientific, societal, investor and stakeholder synergy or conflict. Naturally, how we measure consensus is of great importance to science and technology strategic foresight. The Delphi methodology is a widely used anonymous survey technique to evaluate consensus among a panel of experts. Surprisingly, there is little guidance on how indices of consensus can be influenced by parameters of the Delphi survey itself. We simulated a classic three-round Delphi survey building on the concept of clustered consensus/dissensus. We evaluated three study characteristics that are pertinent for design of Delphi foresight research: (1) the number of survey questions, (2) the sample size, and (3) the extent to which experts conform to group opinion (the Group Conformity Index) in a Delphi study. Their impacts on the following nine Delphi consensus indices were then examined in 1000 simulations: Clustered Mode, Clustered Pairwise Agreement, Conger's Kappa, De Moivre index, Extremities Version of the Clustered Pairwise Agreement, Fleiss' Kappa, Mode, the Interquartile Range and Pairwise Agreement. The dependency of a consensus index on the Delphi survey characteristics was expressed from 0.000 (no dependency) to 1.000 (full dependency). The number of questions (range: 6 to 40) in a survey did not have a notable impact whereby the dependency values remained below 0.030. The variation in sample size (range: 6 to 50) displayed the top three impacts for the Interquartile Range, the Clustered Mode and the Mode (dependency = 0.396, 0.130, 0.116, respectively). 
The Group Conformity Index, a construct akin to measuring stubbornness/flexibility of experts' opinions, greatly impacted all nine Delphi consensus indices (dependency = 0.200 to 0.504), except the Extremity CPWA and the Interquartile Range that were impacted only beyond the first decimal point (dependency = 0.087 and 0.083, respectively). Scholars in technology design, foresight research and future(s) studies might consider these new findings in strategic planning of Delphi studies, for example, in rational choice of consensus indices and sample size, or accounting for confounding factors such as experts' variable degrees of conformity (stubbornness/flexibility) in modifying their opinions.
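One of the indices evaluated above, the Interquartile Range, is straightforward to compute directly. The sketch below uses hypothetical 9-point Likert responses for a single Delphi question, and the IQR <= 1 consensus cut-off is a common convention assumed here, not a value from the paper:

```python
# Sketch (hypothetical responses; IQR <= 1 cut-off is an assumed
# convention): the Interquartile Range consensus index for one
# Delphi question on a 9-point Likert scale.
from statistics import quantiles

def iqr_consensus(responses, cutoff=1.0):
    q1, _, q3 = quantiles(responses, n=4, method="inclusive")
    return (q3 - q1), (q3 - q1) <= cutoff

print(iqr_consensus([7, 7, 8, 8, 8, 9]))  # tight panel: consensus reached
print(iqr_consensus([1, 3, 5, 6, 8, 9]))  # spread panel: no consensus
```

Note that with only 6 respondents the IQR is sensitive to single responses, consistent with the simulation finding that this index depends strongly on sample size.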

  12. Evaluation of Nine Consensus Indices in Delphi Foresight Research and Their Dependency on Delphi Survey Characteristics: A Simulation Study and Debate on Delphi Design and Interpretation

    PubMed Central

    Birko, Stanislav; Dove, Edward S.; Özdemir, Vural

    2015-01-01

    The extent of consensus (or the lack thereof) among experts in emerging fields of innovation can serve as antecedents of scientific, societal, investor and stakeholder synergy or conflict. Naturally, how we measure consensus is of great importance to science and technology strategic foresight. The Delphi methodology is a widely used anonymous survey technique to evaluate consensus among a panel of experts. Surprisingly, there is little guidance on how indices of consensus can be influenced by parameters of the Delphi survey itself. We simulated a classic three-round Delphi survey building on the concept of clustered consensus/dissensus. We evaluated three study characteristics that are pertinent for design of Delphi foresight research: (1) the number of survey questions, (2) the sample size, and (3) the extent to which experts conform to group opinion (the Group Conformity Index) in a Delphi study. Their impacts on the following nine Delphi consensus indices were then examined in 1000 simulations: Clustered Mode, Clustered Pairwise Agreement, Conger’s Kappa, De Moivre index, Extremities Version of the Clustered Pairwise Agreement, Fleiss’ Kappa, Mode, the Interquartile Range and Pairwise Agreement. The dependency of a consensus index on the Delphi survey characteristics was expressed from 0.000 (no dependency) to 1.000 (full dependency). The number of questions (range: 6 to 40) in a survey did not have a notable impact whereby the dependency values remained below 0.030. The variation in sample size (range: 6 to 50) displayed the top three impacts for the Interquartile Range, the Clustered Mode and the Mode (dependency = 0.396, 0.130, 0.116, respectively). 
The Group Conformity Index, a construct akin to measuring stubbornness/flexibility of experts’ opinions, greatly impacted all nine Delphi consensus indices (dependency = 0.200 to 0.504), except the Extremity CPWA and the Interquartile Range that were impacted only beyond the first decimal point (dependency = 0.087 and 0.083, respectively). Scholars in technology design, foresight research and future(s) studies might consider these new findings in strategic planning of Delphi studies, for example, in rational choice of consensus indices and sample size, or accounting for confounding factors such as experts’ variable degrees of conformity (stubbornness/flexibility) in modifying their opinions. PMID:26270647

  13. Extending the Compositional Range of Nanocasting in the Oxozirconium Cluster-Based Metal–Organic Framework NU-1000—A Comparative Structural Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Wenyang; Wang, Zhao; Malonzo, Camille D.

The process of nanocasting in metal-organic frameworks (MOFs) is a versatile approach to modify these porous materials by introducing supporting scaffolds. The nanocast scaffolds can stabilize metal-oxo clusters in MOFs at high temperatures and modulate their chemical environments. Here we demonstrate a range of nanocasting approaches in the MOF NU-1000, which contains hexanuclear oxozirconium clusters (denoted as Zr6 clusters) that are suitable for modification with other metals. We developed methods for introducing SiO2, TiO2, polymeric, and carbon scaffolds into the NU-1000 structure. The responses of NU-1000 towards different scaffold precursors were studied, including the effects on morphology, precursor distribution, and porosity after nanocasting. Upon removal of organic linkers in the MOF by calcination/pyrolysis at 500 °C or above, the Zr6 clusters remained accessible and maintained their Lewis acidity in SiO2 nanocast samples, whereas additional treatment was necessary for Zr6 clusters to become accessible in carbon nanocast samples. Aggregation of Zr6 clusters was largely prevented with SiO2 or carbon scaffolds even after thermal treatment at 500 °C or above. In the case of titania nanocasting, NU-1000 crystals underwent a pseudomorphic transformation, in which Zr6 clusters reacted with titania to form small aggregates of a Zr/Ti mixed oxide with a local structure resembling that of ZrTi2O6. The ability to maintain high densities of discrete Lewis acidic Zr6 clusters on SiO2 or carbon supports at high temperatures provides a starting point for designing new thermally stable catalysts.

  14. The quality of reporting in cluster randomised crossover trials: proposal for reporting items and an assessment of reporting quality.

    PubMed

    Arnup, Sarah J; Forbes, Andrew B; Kahan, Brennan C; Morgan, Katy E; McKenzie, Joanne E

    2016-12-06

    The cluster randomised crossover (CRXO) design is gaining popularity in trial settings where individual randomisation or parallel group cluster randomisation is not feasible or practical. Our aim is to stimulate discussion on the content of a reporting guideline for CRXO trials and to assess the reporting quality of published CRXO trials. We undertook a systematic review of CRXO trials. Searches of MEDLINE, EMBASE, and CINAHL Plus as well as citation searches of CRXO methodological articles were conducted to December 2014. Reporting quality was assessed against both modified items from 2010 CONSORT and 2012 cluster trials extension and other proposed quality measures. Of the 3425 records identified through database searching, 83 trials met the inclusion criteria. Trials were infrequently identified as "cluster randomis(z)ed crossover" in title (n = 7, 8%) or abstract (n = 21, 25%), and a rationale for the design was infrequently provided (n = 20, 24%). Design parameters such as the number of clusters and number of periods were well reported. Discussion of carryover took place in only 17 trials (20%). Sample size methods were only reported in 58% (n = 48) of trials. A range of approaches were used to report baseline characteristics. The analysis method was not adequately reported in 23% (n = 19) of trials. The observed within-cluster within-period intracluster correlation and within-cluster between-period intracluster correlation for the primary outcome data were not reported in any trial. The potential for selection, performance, and detection bias could be evaluated in 30%, 81%, and 70% of trials, respectively. There is a clear need to improve the quality of reporting in CRXO trials. Given the unique features of a CRXO trial, it is important to develop a CONSORT extension. Consensus amongst trialists on the content of such a guideline is essential.

  15. Occurrence of Radio Minihalos in a Mass-Limited Sample of Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Giacintucci, Simona; Markevitch, Maxim; Cassano, Rossella; Venturi, Tiziana; Clarke, Tracy E.; Brunetti, Gianfranco

    2017-01-01

    We investigate the occurrence of radio minihalos (diffuse radio sources of unknown origin observed in the cores of some galaxy clusters) in a statistical sample of 58 clusters drawn from the Planck Sunyaev-Zeldovich cluster catalog using a mass cut (M(sub 500) greater than 6 x 10(exp 14) solar mass). We supplement our statistical sample with a similarly sized nonstatistical sample mostly consisting of clusters in the ACCEPT X-ray catalog with suitable X-ray and radio data, which includes lower-mass clusters. Where necessary (for nine clusters), we reanalyzed the Very Large Array archival radio data to determine whether a minihalo is present. Our total sample includes all 28 currently known and recently discovered radio minihalos, including six candidates. We classify clusters as cool-core or non-cool-core according to the value of the specific entropy floor in the cluster center, rederived or newly derived from the Chandra X-ray density and temperature profiles where necessary (for 27 clusters). Contrary to the common wisdom that minihalos are rare, we find that almost all cool cores-at least 12 out of 15 (80%)-in our complete sample of massive clusters exhibit minihalos. The supplementary sample shows that the occurrence of minihalos may be lower in lower-mass cool-core clusters. No minihalos are found in non-cool cores or "warm cores." These findings will help test theories of the origin of minihalos and provide information on the physical processes and energetics of the cluster cores.

  16. Stratified sampling design based on data mining.

    PubMed

    Kim, Yeonkook J; Oh, Yoonhwan; Park, Sunghoon; Cho, Sungzoon; Park, Hayoung

    2013-09-01

    To explore classification rules based on data mining methodologies which are to be used in defining strata in stratified sampling of healthcare providers with improved sampling efficiency. We performed k-means clustering to group providers with similar characteristics, then, constructed decision trees on cluster labels to generate stratification rules. We assessed the variance explained by the stratification proposed in this study and by conventional stratification to evaluate the performance of the sampling design. We constructed a study database from health insurance claims data and providers' profile data made available to this study by the Health Insurance Review and Assessment Service of South Korea, and population data from Statistics Korea. From our database, we used the data for single specialty clinics or hospitals in two specialties, general surgery and ophthalmology, for the year 2011 in this study. Data mining resulted in five strata in general surgery with two stratification variables, the number of inpatients per specialist and population density of provider location, and five strata in ophthalmology with two stratification variables, the number of inpatients per specialist and number of beds. The percentages of variance in annual changes in the productivity of specialists explained by the stratification in general surgery and ophthalmology were 22% and 8%, respectively, whereas conventional stratification by the type of provider location and number of beds explained 2% and 0.2% of variance, respectively. This study demonstrated that data mining methods can be used in designing efficient stratified sampling with variables readily available to the insurer and government; it offers an alternative to the existing stratification method that is widely used in healthcare provider surveys in South Korea.
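
    As a rough illustration of the two-step procedure described above (k-means to group providers with similar characteristics, then a decision tree on the cluster labels to obtain explicit stratification rules), here is a sketch on synthetic data; the feature names and values are illustrative stand-ins, not the study's variables.

```python
# Hypothetical sketch of the two-step stratification idea: cluster, then
# distill the cluster labels into interpretable threshold rules.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Toy provider features (illustrative): inpatients per specialist, beds.
X = np.vstack([
    rng.normal([5, 30], [1, 5], size=(100, 2)),
    rng.normal([20, 120], [3, 15], size=(100, 2)),
])

# Step 1: group providers with similar characteristics.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 2: turn the cluster labels into explicit threshold rules (strata).
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, labels)
print(export_text(tree, feature_names=["inpatients_per_specialist", "beds"]))
```

    The printed rules ("beds <= ..." etc.) are exactly the kind of stratification criteria a survey designer can apply to the full provider frame.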

  17. Detection of West Nile virus and tick-borne encephalitis virus in birds in Slovakia, using a universal primer set.

    PubMed

    Csank, Tomáš; Bhide, Katarína; Bencúrová, Elena; Dolinská, Saskia; Drzewnioková, Petra; Major, Peter; Korytár, Ľuboš; Bocková, Eva; Bhide, Mangesh; Pistl, Juraj

    2016-06-01

    West Nile virus (WNV) is a mosquito-borne neurotropic pathogen that presents a major public health concern. Information on WNV prevalence and circulation in Slovakia is insufficient. Oral and cloacal swabs and bird brain samples were tested for flavivirus RNA by RT-PCR using newly designed generic primers. The species designation was confirmed by sequencing. WNV was detected in swab and brain samples, whereas one brain sample was positive for tick-borne encephalitis virus (TBEV). The WNV sequences clustered with lineages 1 and 2. These results confirm the circulation of WNV in birds in Slovakia and emphasize the risk of infection of humans and horses.

  18. The Influence of Educational Systems on the Academic Performance of JSCE Students in Rivers State

    ERIC Educational Resources Information Center

    Orluwene, Goodness W.; Igwe, Benjamin N.

    2015-01-01

    This work is a comparative study of JSCE results between the 6-3-3-4 system (2006 & 2008) and the 9-3-4 (UBE) system (2009 & 2011) in Port Harcourt using a comparative/evaluative survey design. A cluster sampling technique was used to compose a sample of 2,487 drawn from the population of 17,139 candidates in 2006, 2008, 2009 and 2011 in…

  19. Assessment of Validity, Reliability and Difficulty Indices for Teacher-Built Physics Exam Questions in First Year High School

    ERIC Educational Resources Information Center

    Jandaghi, Gholamreza

    2010-01-01

    The purpose of the research is to determine high school teachers' skill rate in designing exam questions in physics subject. The statistical population was all of physics exam shits for two semesters in one school year from which a sample of 364 exam shits was drawn using multistage cluster sampling. Two experts assessed the shits and by using…

  20. The ROSAT Brightest Cluster Sample - I. The compilation of the sample and the cluster log N-log S distribution

    NASA Astrophysics Data System (ADS)

    Ebeling, H.; Edge, A. C.; Bohringer, H.; Allen, S. W.; Crawford, C. S.; Fabian, A. C.; Voges, W.; Huchra, J. P.

    1998-12-01

    We present a 90 per cent flux-complete sample of the 201 X-ray-brightest clusters of galaxies in the northern hemisphere (delta>=0 deg), at high Galactic latitudes (|b|>=20 deg), with measured redshifts z<=0.3 and fluxes higher than 4.4x10^-12 erg cm^-2 s^-1 in the 0.1-2.4 keV band. The sample, called the ROSAT Brightest Cluster Sample (BCS), is selected from ROSAT All-Sky Survey data and is the largest X-ray-selected cluster sample compiled to date. In addition to Abell clusters, which form the bulk of the sample, the BCS also contains the X-ray-brightest Zwicky clusters and other clusters selected from their X-ray properties alone. Effort has been made to ensure the highest possible completeness of the sample and the smallest possible contamination by non-cluster X-ray sources. X-ray fluxes are computed using an algorithm tailored for the detection and characterization of X-ray emission from galaxy clusters. These fluxes are accurate to better than 15 per cent (mean 1sigma error). We find the cumulative logN-logS distribution of clusters to follow a power law kappa S^alpha with alpha=-1.31^+0.06_-0.03 (errors are the 10th and 90th percentiles) down to fluxes of 2x10^-12 erg cm^-2 s^-1, i.e. considerably below the BCS flux limit. Although our best-fitting slope disagrees formally with the canonical value of -1.5 for a Euclidean distribution, the BCS logN-logS distribution is consistent with a non-evolving cluster population if cosmological effects are taken into account. Our sample will allow us to examine large-scale structure in the northern hemisphere, determine the spatial cluster-cluster correlation function, investigate correlations between the X-ray and optical properties of the clusters, establish the X-ray luminosity function for galaxy clusters, and discuss the implications of the results for cluster evolution.

  1. Learning Bayesian Networks from Correlated Data

    NASA Astrophysics Data System (ADS)

    Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola

    2016-05-01

    Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.
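
    The inflation of false positives mentioned above can be demonstrated with a short simulation (my construction, not the authors' method): X and Y are generated independently, but both are clustered, and a naive correlation test that ignores the clustering rejects far more often than the nominal 5%.

```python
# Toy demonstration of Type I error inflation when clustering is ignored.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_clusters, per_cluster, reps = 30, 10, 500
rejections = 0
for _ in range(reps):
    # Independent cluster-level random effects for X and Y.
    ux = np.repeat(rng.normal(0, 1.5, n_clusters), per_cluster)
    uy = np.repeat(rng.normal(0, 1.5, n_clusters), per_cluster)
    x = ux + rng.normal(size=n_clusters * per_cluster)
    y = uy + rng.normal(size=n_clusters * per_cluster)
    # Naive test treating all 300 observations as independent.
    _, p = stats.pearsonr(x, y)
    rejections += p < 0.05
print(rejections / reps)  # well above the nominal 0.05
```

    Modeling the cluster effects explicitly, as the paper's random-effects parameterization does, is what restores the nominal error rate.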

  2. Hierarchical modeling of cluster size in wildlife surveys

    USGS Publications Warehouse

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
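
    The cluster-size bias the abstract describes is easy to reproduce numerically. A minimal sketch, with an assumed per-individual detection probability of 0.3 (so larger clusters are more likely to be seen at all):

```python
# Illustrative simulation (assumptions mine): detection probability rises
# with cluster size, so detected clusters are larger on average than the
# underlying population of clusters.
import numpy as np

rng = np.random.default_rng(2)
sizes = rng.poisson(3, 10_000) + 1        # true cluster sizes >= 1
p_detect = 1 - (1 - 0.3) ** sizes         # each individual seen w.p. 0.3
detected = sizes[rng.random(10_000) < p_detect]

print(sizes.mean(), detected.mean())      # sample mean exceeds population mean
```

    The hierarchical model in the paper corrects exactly this gap by modeling the observation process conditional on the cluster size distribution.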

  3. Do major depressive disorder and dysthymic disorder confer differential risk for suicide?

    PubMed

    Witte, Tracy K; Timmons, Katherine A; Fink, Erin; Smith, April R; Joiner, Thomas E

    2009-05-01

    Although there has been a tremendous amount of research examining the risk conferred for suicide by depression in general, relatively little research examines the risk conferred by specific forms of depressive illness (e.g., dysthymic disorder, single episode versus recurrent major depressive disorder [MDD]). The purpose of the current study was to examine differences in suicidal ideation, clinician-rated suicide risk, suicide attempts, and family history of suicide in a sample of outpatients diagnosed with various forms of depressive illness. To accomplish this aim, we conducted a cluster analysis using the aforementioned suicide-related variables in a sample of 494 outpatients seen between January 2001 and July 2007 at the Florida State University Psychology Clinic. Patients were diagnosed using DSM-IV criteria. Two distinct clusters emerged that were indicative of lower and higher risk for suicide. After controlling for the number of comorbid Axis I and Axis II diagnoses, the only depressive illness that significantly predicted cluster membership was recurrent MDD, which tripled an individual's likelihood of being assigned to the higher risk cluster. The use of a cross-sectional design; the relatively low suicide risk in our sample; the relatively small number of individuals with double depression. Our results demonstrate the importance of both chronicity and severity of depression in terms of predicting increased suicide risk. Among the various forms of depressive illness examined, only recurrent MDD appeared to confer greater risk for suicide.

  4. The Chandra Strong Lens Sample: Revealing Baryonic Physics In Strong Lensing Selected Clusters

    NASA Astrophysics Data System (ADS)

    Bayliss, Matthew

    2017-08-01

    We propose for Chandra imaging of the hot intra-cluster gas in a unique new sample of 29 galaxy clusters selected purely on their strong gravitational lensing signatures. This will be the first program targeting a purely strong lensing selected cluster sample, enabling new comparisons between the ICM properties and scaling relations of strong lensing and mass/ICM selected cluster samples. Chandra imaging, combined with high precision strong lens models, ensures powerful constraints on the distribution and state of matter in the cluster cores. This represents a novel angle from which we can address the role played by baryonic physics -- the infamous "gastrophysics" -- in shaping the cores of massive clusters, and opens up an exciting new galaxy cluster discovery space with Chandra.

  5. The Chandra Strong Lens Sample: Revealing Baryonic Physics In Strong Lensing Selected Clusters

    NASA Astrophysics Data System (ADS)

    Bayliss, Matthew

    2017-09-01

    We propose for Chandra imaging of the hot intra-cluster gas in a unique new sample of 29 galaxy clusters selected purely on their strong gravitational lensing signatures. This will be the first program targeting a purely strong lensing selected cluster sample, enabling new comparisons between the ICM properties and scaling relations of strong lensing and mass/ICM selected cluster samples. Chandra imaging, combined with high precision strong lens models, ensures powerful constraints on the distribution and state of matter in the cluster cores. This represents a novel angle from which we can address the role played by baryonic physics -- the infamous ``gastrophysics''-- in shaping the cores of massive clusters, and opens up an exciting new galaxy cluster discovery space with Chandra.

  6. A fast learning method for large scale and multi-class samples of SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class SVM (Support Vector Machine) classification based on a binary tree is presented, addressing the low training efficiency of SVMs on large-scale multi-class samples. A bottom-up procedure is used to build the binary-tree hierarchy, and, following that hierarchy, a sub-classifier at each node learns from the corresponding samples. During learning, a first clustering of the training samples generates several class clusters. Central points are extracted directly from clusters that contain only one class of samples. For clusters that contain two classes, the numbers of clusters for the positive and negative samples are set according to their degree of mixture, a secondary clustering is performed, and central points are extracted from the resulting sub-clusters. Sub-classifiers are then obtained by learning from the reduced sample set formed by the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, maintains high classification accuracy while greatly reducing the number of samples and effectively improving learning efficiency.
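
    A hedged sketch of the sample-reduction idea on toy two-class data (the binary-tree hierarchy and the mixture-degree heuristic are omitted): cluster each class, keep only the centroids, and train the SVM on the much smaller reduced set.

```python
# Rough sketch (details assumed): replace a large training set with
# per-class cluster centroids so the SVM learns from far fewer points.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2, 1, (500, 2)), rng.normal(2, 1, (500, 2))])
y = np.array([0] * 500 + [1] * 500)

# Cluster each class separately and keep only the centroids.
reduced_X, reduced_y = [], []
for cls in (0, 1):
    km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X[y == cls])
    reduced_X.append(km.cluster_centers_)
    reduced_y += [cls] * 10

clf = SVC(kernel="rbf").fit(np.vstack(reduced_X), reduced_y)
print(clf.score(X, y))  # 20 training points stand in for 1000
```

    On well-separated toy classes the accuracy loss from the reduction is negligible, which is the trade-off the paper's method exploits at scale.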

  7. Automated detection of very low surface brightness galaxies in the Virgo cluster

    NASA Astrophysics Data System (ADS)

    Prole, D. J.; Davies, J. I.; Keenan, O. C.; Davies, L. J. M.

    2018-07-01

    We report the automatic detection of a new sample of very low surface brightness (LSB) galaxies, likely members of the Virgo cluster. We introduce our new software, DeepScan, that has been designed specifically to detect extended LSB features automatically using the DBSCAN algorithm. We demonstrate the technique by applying it over a 5 deg2 portion of the Next Generation Virgo Survey (NGVS) data to reveal 53 LSB galaxies that are candidate cluster members based on their sizes and colours. 30 of these sources are new detections despite the region being searched specifically for LSB galaxies previously. Our final sample contains galaxies with 26.0 ≤ ⟨μe⟩ ≤ 28.5 and 19 ≤ mg ≤ 21, making them some of the faintest known in Virgo. The majority of them have colours consistent with the red sequence, and have a mean stellar mass of 10^(6.3 ± 0.5) M⊙ assuming cluster membership. After using ProFit to fit Sérsic profiles to our detections, none of the new sources has an effective radius larger than 1.5 kpc, so they do not meet the criteria for ultra-diffuse galaxy (UDG) classification and we classify them as ultra-faint dwarfs.
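
    The DBSCAN step at the heart of DeepScan can be illustrated on toy points (this is not the NGVS pipeline, just the clustering primitive it builds on): dense, connected groups of points become detections, while sparse background is labelled noise (-1).

```python
# Minimal illustration of density-based detection with DBSCAN: a compact
# over-density stands out against a sparse uniform background.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(4)
source = rng.normal([10, 10], 0.2, (50, 2))   # a compact LSB "source"
background = rng.uniform(0, 20, (50, 2))      # sparse background points

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(
    np.vstack([source, background])
)
print(set(labels))  # at least one cluster label plus -1 for noise
```

    Tuning `eps` and `min_samples` plays the same role as DeepScan's surface-brightness and size thresholds: they set how faint and how extended a feature must be to count as a detection.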

  8. Unsupervised learning on scientific ocean drilling datasets from the South China Sea

    NASA Astrophysics Data System (ADS)

    Tse, Kevin C.; Chiu, Hon-Chim; Tsang, Man-Yin; Li, Yiliang; Lam, Edmund Y.

    2018-06-01

    Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean floor sediment core samples coming from scientific ocean drilling in the South China Sea. Compared to studies on similar datasets, but using supervised learning methods which are designed to make predictions based on sample training data, unsupervised learning methods require no a priori information and focus only on the input data. In this study, popular unsupervised learning methods including K-means, self-organizing maps, hierarchical clustering and random forest were coupled with different distance metrics to form exploratory data clusters. The resulting data clusters were externally validated with lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected data clusters displayed varying degrees of correspondence with existing classification by lithologic units and geologic time scales. K-means and self-organizing maps were observed to perform better with lithologic units while random forest corresponded best with geologic time scales. This study sets a pioneering example of how unsupervised machine learning methods can be used as an automatic processing tool for the increasingly high volume of scientific ocean drilling data.
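
    The external-validation step (comparing unsupervised clusters against the lithologic units assigned by conventional methods) is commonly scored with the adjusted Rand index; here is a small sketch with synthetic stand-ins for the core data.

```python
# Sketch of external validation: score the correspondence between
# unsupervised clusters and known unit labels with the adjusted Rand index.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(5)
# Two hypothetical "lithologic units" with distinct geophysical signatures.
X = np.vstack([rng.normal(0, 1, (80, 3)), rng.normal(5, 1, (80, 3))])
units = np.array([0] * 80 + [1] * 80)

pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(adjusted_rand_score(units, pred))  # 1.0 means perfect correspondence
```

    An index near 1 indicates the data clusters recover the conventional classification; values near 0 indicate no better than chance agreement.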

  9. Automated detection of very Low Surface Brightness galaxies in the Virgo Cluster

    NASA Astrophysics Data System (ADS)

    Prole, D. J.; Davies, J. I.; Keenan, O. C.; Davies, L. J. M.

    2018-04-01

    We report the automatic detection of a new sample of very low surface brightness (LSB) galaxies, likely members of the Virgo cluster. We introduce our new software, DeepScan, that has been designed specifically to detect extended LSB features automatically using the DBSCAN algorithm. We demonstrate the technique by applying it over a 5 deg2 portion of the Next-Generation Virgo Survey (NGVS) data to reveal 53 low surface brightness galaxies that are candidate cluster members based on their sizes and colours. 30 of these sources are new detections despite the region being searched specifically for LSB galaxies previously. Our final sample contains galaxies with 26.0 ≤ ⟨μe⟩ ≤ 28.5 and 19 ≤ mg ≤ 21, making them some of the faintest known in Virgo. The majority of them have colours consistent with the red sequence, and have a mean stellar mass of 10^(6.3 ± 0.5) M⊙ assuming cluster membership. After using ProFit to fit Sérsic profiles to our detections, none of the new sources has an effective radius larger than 1.5 kpc, so they do not meet the criteria for ultra-diffuse galaxy (UDG) classification and we classify them as ultra-faint dwarfs.

  10. Comparing the performance of cluster random sampling and integrated threshold mapping for targeting trachoma control, using computer simulation.

    PubMed

    Smith, Jennifer L; Sturrock, Hugh J W; Olives, Casey; Solomon, Anthony W; Brooker, Simon J

    2013-01-01

    Implementation of trachoma control strategies requires reliable district-level estimates of trachomatous inflammation-follicular (TF), generally collected using the recommended gold-standard cluster randomized surveys (CRS). Integrated Threshold Mapping (ITM) has been proposed as an integrated and cost-effective means of rapidly surveying trachoma in order to classify districts according to treatment thresholds. ITM differs from CRS in a number of important ways, including the use of a school-based sampling platform for children aged 1-9 and a different age distribution of participants. This study uses computerised sampling simulations to compare the performance of these survey designs and evaluate the impact of varying key parameters. Realistic pseudo gold standard data for 100 districts were generated that maintained the relative risk of disease between important sub-groups and incorporated empirical estimates of disease clustering at the household, village and district level. To simulate the different sampling approaches, 20 clusters were selected from each district, with individuals sampled according to the protocol for ITM and CRS. Results showed that ITM generally under-estimated the true prevalence of TF over a range of epidemiological settings and introduced more district misclassification according to treatment thresholds than did CRS. However, the extent of underestimation and resulting misclassification was found to be dependent on three main factors: (i) the district prevalence of TF; (ii) the relative risk of TF between enrolled and non-enrolled children within clusters; and (iii) the enrollment rate in schools. Although in some contexts the two methodologies may be equivalent, ITM can introduce a bias-dependent shift as prevalence of TF increases, resulting in a greater risk of misclassification around treatment thresholds. 
In addition to strengthening the evidence base around choice of trachoma survey methodologies, this study illustrates the use of a simulated approach in addressing operational research questions for trachoma but also other NTDs.
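
    The core mechanism the simulations probe can be sketched in a few lines. This is a deliberately stripped-down toy (the enrollment rate, relative risk, and sample sizes below are illustrative, not the paper's parameters): when non-enrolled children carry a higher risk of TF, a school-based sample underestimates district prevalence, while a household-based sample does not.

```python
# Toy version of the bias mechanism: school-based (ITM-like) sampling only
# reaches enrolled children, who here have lower TF risk than non-enrolled.
import numpy as np

rng = np.random.default_rng(6)
n_children = 10_000
enrolled = rng.random(n_children) < 0.7        # assumed 70% enrollment
p_tf = np.where(enrolled, 0.10, 0.25)          # assumed relative risk 2.5
tf = rng.random(n_children) < p_tf

true_prev = tf.mean()
school_based = tf[enrolled].mean()             # ITM-like estimate
household = tf[rng.choice(n_children, 1000, replace=False)].mean()  # CRS-like

print(true_prev, school_based, household)
```

    The gap between the school-based estimate and the true prevalence grows with both the enrollment shortfall and the enrolled/non-enrolled relative risk, mirroring factors (ii) and (iii) identified in the abstract.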

  11. The Mass Function of Abell Clusters

    NASA Astrophysics Data System (ADS)

    Chen, J.; Huchra, J. P.; McNamara, B. R.; Mader, J.

    1998-12-01

    The velocity dispersion and mass functions for rich clusters of galaxies provide important constraints on models of the formation of Large-Scale Structure (e.g., Frenk et al. 1990). However, prior estimates of the velocity dispersion or mass function for galaxy clusters have been based on either very small samples of clusters (Bahcall and Cen 1993; Zabludoff et al. 1994) or large but incomplete samples (e.g., the Girardi et al. (1998) determination from a sample of clusters with more than 30 measured galaxy redshifts). In contrast, we approach the problem by constructing a volume-limited sample of Abell clusters. We collected individual galaxy redshifts for our sample from two major galaxy velocity databases, the NASA Extragalactic Database, NED, maintained at IPAC, and ZCAT, maintained at SAO. We assembled a database with velocity information for possible cluster members and then selected cluster members based on both spatial and velocity data. Cluster velocity dispersions and masses were calculated following the procedures of Danese, De Zotti, and di Tullio (1980) and Heisler, Tremaine, and Bahcall (1985), respectively. The final velocity dispersion and mass functions were analyzed in order to constrain cosmological parameters by comparison to the results of N-body simulations. Our data for the cluster sample as a whole and for the individual clusters (spatial maps and velocity histograms) in our sample are available on-line at http://cfa-www.harvard.edu/~huchra/clusters. This website will be updated as more data becomes available in the master redshift compilations, and will be expanded to include more clusters and large groups of galaxies.

  12. Estimating Accuracy of Land-Cover Composition From Two-Stage Clustering Sampling

    EPA Science Inventory

    Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), ...

  13. Confronting models of star formation quenching in galaxy clusters with archival Spitzer data

    NASA Astrophysics Data System (ADS)

    Rudnick, Gregory

    Large scale structures in the universe form hierarchically: small structures merge to form larger ones. Over the same epoch where these structures experience significant growth, the fraction of star forming galaxies within them decreases, and at a faster rate than for field galaxies. It is now widely accepted that there must be physical processes at work in these dense environments to actively quench star formation. However, despite no shortage of candidate mechanisms, sophisticated cosmological simulations still cannot reproduce the star formation rate distributions within dense environments, such as galaxy clusters. Insufficient observational constraints are a primary obstacle to further progress. In particular, the interpretation of observations of nearby clusters relies on untested assumptions about the properties of galaxies before they entered the dense cluster environment at higher redshifts. Clearly, direct constraints on these properties are required. Our group has assembled two data sets designed to address these concerns. The first focuses on an intermediate wide-field cluster sample and the second focuses on a well-matched low-redshift cluster sample. We will use these samples, along with sophisticated models of hierarchical galaxy formation, to meet the following objectives: 1. Directly measure the SFR distribution of the progenitors of present-day cluster galaxies. We will use ground-based spectroscopy to identify cluster members within four virial radii of eight intermediate-redshift clusters. We will couple this with archival Spitzer/MIPS data to measure the SFRs of galaxies out to the cluster outskirts. 2. Measure the SFR distribution of the present-day cluster galaxies using Spitzer and WISE. Robust N-body simulations tell us statistically which galaxies at intermediate redshifts will have entered the cluster virial radius by the current epoch. 
By combining our wide-field coverage at high redshift with our local cluster sample, we will determine the evolution in cluster galaxy SFRs over 6 billion years making minimal assumptions about the infalling galaxy population. 3. Provide a rigorous test of the quenching processes embedded in the theoretical models. We will create observed realizations of the theoretical models by subjecting them to our observational selection. This will enable a fair comparison between the models and the data, which will provide a valuable test of current theoretical implementations of quenching processes. We will also modify the quenching prescriptions in the models to determine the parameters required to reproduce the observations. The proposed research is novel for several reasons. 1) We have wide-field Spitzer/MIPS data that allows us to robustly measure SFRs in our distant cluster galaxies. WISE data on local clusters will provide us with analogous measurements in the nearby Universe. 2) Our significant investment in ancillary spectroscopy allows us to identify infalling galaxies that will eventually join the central regions of the cluster by z=0. 3) Our intermediate redshift cluster sample was chosen to have characteristics expected for the progenitors of a large fraction of the known clusters at z=0. 4) We will take advantage of our own cosmological simulations of structure growth to interpret our data. 5) We have optical photometry over the full infall region, allowing us to control for stellar masses and to distinguish passive from dusty star-forming galaxies. We will learn which, if any, of the quenching prescriptions currently employed in semi-analytic models correctly reproduces the observed characteristics of the galaxies that will become cluster galaxies at z=0. We will pinpoint the cluster-centric radii over which quenching takes place. We will determine the timescale (as a function of stellar mass) over which it must take place. 
This program will cement the legacy of Spitzer and WISE as tools for studying galaxy formation in clusters.

  14. Evaluation of primary immunization coverage of infants under universal immunization programme in an urban area of Bangalore city using cluster sampling and lot quality assurance sampling techniques.

    PubMed

    K, Punith; K, Lalitha; G, Suman; Bs, Pradeep; Kumar K, Jayanth

    2008-07-01

    Is the LQAS technique better than the cluster sampling technique in terms of resources needed to evaluate immunization coverage in an urban area? To assess and compare lot quality assurance sampling against cluster sampling in the evaluation of primary immunization coverage. Population-based cross-sectional study. Areas under Mathikere Urban Health Center. Children aged 12 months to 23 months. 220 in cluster sampling, 76 in lot quality assurance sampling. Percentages and proportions, chi-square test. (1) Using cluster sampling, the percentages of completely immunized, partially immunized and unimmunized children were 84.09%, 14.09% and 1.82%, respectively. With lot quality assurance sampling, they were 92.11%, 6.58% and 1.31%, respectively. (2) Immunization coverage levels as evaluated by the cluster sampling technique were not statistically different from the coverage values obtained by the lot quality assurance sampling technique. Considering the time and resources required, lot quality assurance sampling was found to be the better technique for evaluating primary immunization coverage in an urban area.
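
    The reported percentages can be back-calculated to approximate counts (185/31/4 of 220 for cluster sampling; 70/5/1 of 76 for LQAS) and compared with a chi-square test, which is consistent with the paper's finding of no significant difference. The counts are my reconstruction from the rounded percentages, so treat this as illustrative.

```python
# Chi-square comparison of immunization status by sampling method,
# using counts back-calculated from the reported percentages.
from scipy.stats import chi2_contingency

#                 complete  partial  unimmunized
cluster_counts = [185,      31,      4]    # n = 220
lqas_counts    = [70,       5,       1]    # n = 76

chi2, p, dof, expected = chi2_contingency([cluster_counts, lqas_counts])
print(round(chi2, 2), round(p, 3))  # p > 0.05: the two estimates agree
```

    Note that the expected counts in the unimmunized column are small, so the chi-square approximation is rough here; an exact test would be more defensible with these cell sizes.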

  15. Prevalence and clustering of soil-transmitted helminth infections in a tribal area in southern India.

    PubMed

    Kaliappan, Saravanakumar Puthupalayam; George, Santosh; Francis, Mark Rohit; Kattula, Deepthi; Sarkar, Rajiv; Minz, Shantidani; Mohan, Venkata Raghava; George, Kuryan; Roy, Sheela; Ajjampur, Sitara Swarna Rao; Muliyil, Jayaprakash; Kang, Gagandeep

    2013-12-01

    To estimate the prevalence, spatial patterns and clustering in the distribution of soil-transmitted helminth (STH) infections, and factors associated with hookworm infections in a tribal population in Tamil Nadu, India. Cross-sectional study with one-stage cluster sampling of 22 clusters. Demographic and risk factor data and stool samples for microscopic ova/cysts examination were collected from 1237 participants. Geographical information systems mapping assessed spatial patterns of infection. The overall prevalence of STH was 39% (95% CI 36%–42%), with hookworm 38% (95% CI 35–41%) and Ascaris lumbricoides 1.5% (95% CI 0.8–2.2%). No Trichuris trichiura infection was detected. People involved in farming had higher odds of hookworm infection (1.68, 95% CI 1.31–2.17, P < 0.001). In the multiple logistic regression, adults (2.31, 95% CI 1.80–2.96, P < 0.001), people with pet cats (1.55, 95% CI 1.10–2.18, P = 0.011) and people who did not wash their hands with soap after defecation (1.84, 95% CI 1.27–2.67, P = 0.001) had higher odds of hookworm infection, but gender and poor usage of foot wear did not significantly increase risk. Cluster analysis, based on design effect calculation, did not show any clustering of cases among the study population; however, spatial scan statistic detected a significant cluster for hookworm infections in one village. Multiple approaches including health education, improving the existing sanitary practices and regular preventive chemotherapy are needed to control the burden of STH in similar endemic areas.
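
The "design effect calculation" behind the cluster analysis above follows the standard Kish formula, deff = 1 + (m − 1) × ICC, where m is the average cluster size. A minimal sketch; the ICC of 0.02 is purely illustrative (the study found essentially no clustering), while the cluster size uses the study's 1237 participants over 22 clusters:

```python
def design_effect(mean_cluster_size, icc):
    """Kish design effect: variance inflation factor due to cluster sampling."""
    return 1 + (mean_cluster_size - 1) * icc

# 1237 participants in 22 clusters, as in the study; the ICC here is hypothetical
deff = design_effect(1237 / 22, 0.02)
```

An ICC near zero drives deff toward 1, matching the reported absence of clustering; the effective sample size is n / deff.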

  16. Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion

    NASA Astrophysics Data System (ADS)

    Scott, Charles J. C.; Thom, Alex J. W.

    2017-09-01

    We consider the sampling of the coupled cluster expansion within stochastic coupled cluster theory. Observing the limitations of previous approaches due to the inherently non-linear behavior of a coupled cluster wavefunction representation, we propose new approaches based on an intuitive, well-defined condition for sampling weights and on sampling the expansion in cluster operators of different excitation levels. We term these modifications even and truncated selections, respectively. Utilising both approaches demonstrates dramatically improved calculation stability as well as reduced computational and memory costs. These modifications are particularly effective at higher truncation levels owing to the large number of terms within the cluster expansion that can be neglected, as demonstrated by the reduction of the number of terms to be sampled when truncating at triple excitations by 77% and hextuple excitations by 98%.

  17. Occurrence of Radio Minihalos in a Mass-limited Sample of Galaxy Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Giacintucci, Simona; Clarke, Tracy E.; Markevitch, Maxim

    2017-06-01

    We investigate the occurrence of radio minihalos—diffuse radio sources of unknown origin observed in the cores of some galaxy clusters—in a statistical sample of 58 clusters drawn from the Planck Sunyaev–Zel’dovich cluster catalog using a mass cut (M_500 > 6 × 10^14 M_⊙). We supplement our statistical sample with a similarly sized nonstatistical sample mostly consisting of clusters in the ACCEPT X-ray catalog with suitable X-ray and radio data, which includes lower-mass clusters. Where necessary (for nine clusters), we reanalyzed the Very Large Array archival radio data to determine whether a minihalo is present. Our total sample includes all 28 currently known and recently discovered radio minihalos, including six candidates. We classify clusters as cool-core or non-cool-core according to the value of the specific entropy floor in the cluster center, rederived or newly derived from the Chandra X-ray density and temperature profiles where necessary (for 27 clusters). Contrary to the common wisdom that minihalos are rare, we find that almost all cool cores—at least 12 out of 15 (80%)—in our complete sample of massive clusters exhibit minihalos. The supplementary sample shows that the occurrence of minihalos may be lower in lower-mass cool-core clusters. No minihalos are found in non-cool cores or “warm cores.” These findings will help test theories of the origin of minihalos and provide information on the physical processes and energetics of the cluster cores.

  18. Efficient evaluation of sampling quality of molecular dynamics simulations by clustering of dihedral torsion angles and Sammon mapping.

    PubMed

    Frickenhaus, Stephan; Kannan, Srinivasaraghavan; Zacharias, Martin

    2009-02-01

    A direct conformational clustering and mapping approach for peptide conformations based on backbone dihedral angles has been developed and applied to compare conformational sampling of Met-enkephalin using two molecular dynamics (MD) methods. Efficient clustering in dihedrals has been achieved by evaluating all combinations resulting from independent clustering of each dihedral angle distribution, thus resolving all conformational substates. In contrast, Cartesian clustering was unable to accurately distinguish between all substates. Projection of clusters on dihedral principal component (PCA) subspaces did not result in efficient separation of highly populated clusters. However, representation in a nonlinear metric by Sammon mapping was able to separate well the 48 highest populated clusters in just two dimensions. In addition, this approach also allowed us to visualize the transition frequencies between clusters efficiently. Significantly higher transition frequencies between more distinct conformational substates were found for a recently developed biasing-potential replica exchange MD simulation method allowing faster sampling of possible substates compared to conventional MD simulations. Although the number of theoretically possible clusters grows exponentially with peptide length, in practice, the number of clusters is only limited by the sampling size (typically much smaller), and therefore the method is well suited also for large systems. The approach could be useful to rapidly and accurately evaluate conformational sampling during MD simulations, to compare different sampling strategies and eventually to detect kinetic bottlenecks in folding pathways.
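
Independent clustering of each backbone dihedral, followed by enumeration of the label combinations, can be sketched as below. The rotamer centres (gauche−, trans, gauche+) and the sample angles are hypothetical, and nearest-centre assignment uses circular distance because dihedrals wrap at ±180°:

```python
def circular_distance(a, b):
    """Shortest angular distance between two angles, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def angle_bin(angle, centers):
    """Assign a dihedral angle to its nearest cluster centre on the circle."""
    return min(range(len(centers)), key=lambda i: circular_distance(angle, centers[i]))

# Hypothetical per-dihedral cluster centres; one label per backbone dihedral
centers = [-60, 180, 60]
conformer = tuple(angle_bin(a, centers) for a in (-65, 175, 58))
```

Each distinct label tuple identifies one conformational substate; counting tuples over a trajectory yields the cluster populations used for the Sammon map.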

  19. Optimising cluster survey design for planning schistosomiasis preventive chemotherapy.

    PubMed

    Knowles, Sarah C L; Sturrock, Hugh J W; Turner, Hugo; Whitton, Jane M; Gower, Charlotte M; Jemu, Samuel; Phillips, Anna E; Meite, Aboulaye; Thomas, Brent; Kollie, Karsor; Thomas, Catherine; Rebollo, Maria P; Styles, Ben; Clements, Michelle; Fenwick, Alan; Harrison, Wendy E; Fleming, Fiona M

    2017-05-01

    The cornerstone of current schistosomiasis control programmes is delivery of praziquantel to at-risk populations. Such preventive chemotherapy requires accurate information on the geographic distribution of infection, yet the performance of alternative survey designs for estimating prevalence and converting this into treatment decisions has not been thoroughly evaluated. We used baseline schistosomiasis mapping surveys from three countries (Malawi, Côte d'Ivoire and Liberia) to generate spatially realistic gold standard datasets, against which we tested alternative two-stage cluster survey designs. We assessed how sampling different numbers of schools per district (2-20) and children per school (10-50) influences the accuracy of prevalence estimates and treatment class assignment, and we compared survey cost-efficiency using data from Malawi. Due to the focal nature of schistosomiasis, up to 53% of simulated surveys involving 2-5 schools per district failed to detect schistosomiasis in low endemicity areas (1-10% prevalence). Increasing the number of schools surveyed per district improved treatment class assignment far more than increasing the number of children sampled per school. For Malawi, surveys of 15 schools per district and 20-30 children per school reliably detected endemic schistosomiasis and maximised cost-efficiency. In sensitivity analyses where treatment costs and the country considered were varied, optimal survey size was remarkably consistent, with cost-efficiency maximised at 15-20 schools per district. Among two-stage cluster surveys for schistosomiasis, our simulations indicated that surveying 15-20 schools per district and 20-30 children per school optimised cost-efficiency and minimised the risk of under-treatment, with surveys involving more schools becoming more cost-efficient as treatment costs rose.
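
The simulation logic described above (sample schools within a district, then children within each school, then estimate prevalence) can be sketched as follows. The district below is hypothetical, with focal infection concentrated in 20% of schools; it is not data from the three study countries:

```python
import random

def simulate_survey(school_prevs, n_schools, n_children, rng):
    """Two-stage cluster survey: sample schools, then children within each
    sampled school, and return the estimated district prevalence."""
    sampled = rng.sample(school_prevs, n_schools)
    positives = sum(sum(rng.random() < p for _ in range(n_children)) for p in sampled)
    return positives / (n_schools * n_children)

rng = random.Random(42)
# Hypothetical focal district: 80 uninfected schools, 20 hotspot schools
district = [0.0] * 80 + [0.3] * 20
estimates = [simulate_survey(district, 15, 25, rng) for _ in range(200)]
detection_rate = sum(e > 0 for e in estimates) / len(estimates)
```

Rerunning with 2-5 schools per district shows how often focal infection is missed entirely, mirroring the failure rate the simulations report above.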

  20. Stratified Sampling Design Based on Data Mining

    PubMed Central

    Kim, Yeonkook J.; Oh, Yoonhwan; Park, Sunghoon; Cho, Sungzoon

    2013-01-01

    Objectives To explore classification rules based on data mining methodologies which are to be used in defining strata in stratified sampling of healthcare providers with improved sampling efficiency. Methods We performed k-means clustering to group providers with similar characteristics, then constructed decision trees on cluster labels to generate stratification rules. We assessed the variance explained by the stratification proposed in this study and by conventional stratification to evaluate the performance of the sampling design. We constructed a study database from health insurance claims data and providers' profile data made available to this study by the Health Insurance Review and Assessment Service of South Korea, and population data from Statistics Korea. From our database, we used the data for single specialty clinics or hospitals in two specialties, general surgery and ophthalmology, for the year 2011. Results Data mining resulted in five strata in general surgery with two stratification variables, the number of inpatients per specialist and population density of provider location, and five strata in ophthalmology with two stratification variables, the number of inpatients per specialist and number of beds. The percentages of variance in annual changes in the productivity of specialists explained by the stratification in general surgery and ophthalmology were 22% and 8%, respectively, whereas conventional stratification by the type of provider location and number of beds explained 2% and 0.2% of variance, respectively. Conclusions This study demonstrated that data mining methods can be used in designing efficient stratified sampling with variables readily available to the insurer and government; it offers an alternative to the existing stratification method that is widely used in healthcare provider surveys in South Korea. PMID:24175117
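
The first stage of the pipeline, k-means grouping of providers before stratification rules are read off, can be sketched with a minimal from-scratch k-means on two of the variables the study selected (inpatients per specialist and number of beds). The provider values are invented for illustration:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means: returns a cluster label for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Hypothetical providers: (inpatients per specialist, number of beds)
providers = [(2, 10), (3, 12), (2.5, 11), (20, 100), (22, 95), (21, 110)]
labels = kmeans(providers, 2)
```

A decision tree fitted on these labels would reduce to a single threshold rule here (roughly, inpatients per specialist above 10), and such rules become the stratum definitions.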

  1. Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    PubMed Central

    Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg

    2007-01-01

    Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types: one from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis.
Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
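
The objective function described above combines squared Euclidean distance on numeric features with simple matching on categorical ones, each under its own domain weight. A minimal sketch; the feature values and weights are hypothetical:

```python
def mixed_dissimilarity(a, b, numeric_idx, categorical_idx, w_num=1.0, w_cat=1.0):
    """Weighted dissimilarity: squared Euclidean distance on numeric features
    plus simple matching (count of mismatches) on categorical features."""
    d_num = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    d_cat = sum(a[i] != b[i] for i in categorical_idx)
    return w_num * d_num + w_cat * d_cat

# Hypothetical samples: (expression value, clinical chemistry value, histopathology)
s1 = (0.2, 35.0, "necrosis")
s2 = (0.3, 40.0, "normal")
d = mixed_dissimilarity(s1, s2, numeric_idx=[0, 1], categorical_idx=[2])
```

Tuning w_num and w_cat controls how much each data domain influences cluster assignment, which is the role of the separate weighting terms in the modk-prototypes objective.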

  2. Micro-scale Spatial Clustering of Cholera Risk Factors in Urban Bangladesh

    PubMed Central

    Bi, Qifang; Azman, Andrew S.; Satter, Syed Moinuddin; Khan, Azharul Islam; Ahmed, Dilruba; Riaj, Altaf Ahmed; Gurley, Emily S.; Lessler, Justin

    2016-01-01

    Close interpersonal contact likely drives spatial clustering of cases of cholera and diarrhea, but spatial clustering of risk factors may also drive this pattern. Few studies have focused specifically on how exposures for disease cluster at small spatial scales. Improving our understanding of the micro-scale clustering of risk factors for cholera may help to target interventions and power studies with cluster designs. We selected sets of spatially matched households (matched-sets) near cholera case households between April and October 2013 in a cholera endemic urban neighborhood of Tongi Township in Bangladesh. We collected data on exposures to suspected cholera risk factors at the household and individual level. We used intra-class correlation coefficients (ICCs) to characterize clustering of exposures within matched-sets and households, and assessed if clustering depended on the geographical extent of the matched-sets. Clustering over larger spatial scales was explored by assessing the relationship between matched-sets. We also explored whether different exposures tended to appear together in individuals, households, and matched-sets. Household-level exposures, including drinking municipal supplied water (ICC = 0.97, 95%CI = 0.96, 0.98), type of latrine (ICC = 0.88, 95%CI = 0.71, 1.00), and intermittent access to drinking water (ICC = 0.96, 95%CI = 0.87, 1.00), exhibited strong clustering within matched-sets. As the geographic extent of matched-sets increased, the concordance of exposures within matched-sets decreased. Concordance between matched-sets of exposures related to water supply was elevated at distances of up to approximately 400 meters. Household-level hygiene practices were correlated with infrastructure shown to increase cholera risk. Co-occurrence of different individual level exposures appeared to mostly reflect the differing domestic roles of study participants.
Strong spatial clustering of exposures at a small spatial scale in a cholera endemic population suggests a possible role for highly targeted interventions. Studies with cluster designs in areas with strong spatial clustering of exposures should increase sample size to account for the correlation of these exposures. PMID:26866926
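
The ICCs reported above can be estimated with the standard one-way ANOVA estimator. A sketch on a hypothetical binary exposure measured within matched-sets (the data are invented, not from the Tongi study):

```python
def icc_oneway(groups):
    """One-way ANOVA intraclass correlation for equal-sized groups."""
    k, n = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)       # between-group mean square
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))  # within
    return (msb - msw) / (msb + (n - 1) * msw)

# Hypothetical binary exposure, strongly shared within each matched-set
matched_sets = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 0, 1]]
icc = icc_oneway(matched_sets)
```

Values near 1, as for municipal water supply above, mean the exposure is almost perfectly shared within a matched-set.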

  3. Social network recruitment for Yo Puedo: an innovative sexual health intervention in an underserved urban neighborhood—sample and design implications.

    PubMed

    Minnis, Alexandra M; vanDommelen-Gonzalez, Evan; Luecke, Ellen; Cheng, Helen; Dow, William; Bautista-Arredondo, Sergio; Padian, Nancy S

    2015-02-01

    Most existing evidence-based sexual health interventions focus on individual-level behavior, even though there is substantial evidence that highlights the influential role of social environments in shaping adolescents' behaviors and reproductive health outcomes. We developed Yo Puedo, a combined conditional cash transfer and life skills intervention for youth to promote educational attainment, job training, and reproductive health wellness that we then evaluated for feasibility among 162 youth aged 16-21 years in a predominantly Latino community in San Francisco, CA. The intervention targeted youth's social networks and involved recruitment and randomization of small social network clusters. In this paper we describe the design of the feasibility study and report participants' baseline characteristics. Furthermore, we examined the sample and design implications of recruiting social network clusters as the unit of randomization. Baseline data provide evidence that we successfully enrolled high risk youth using a social network recruitment approach in community and school-based settings. Nearly all participants (95%) were high risk for adverse educational and reproductive health outcomes based on multiple measures of low socioeconomic status (81%) and/or reported high risk behaviors (e.g., gang affiliation, past pregnancy, recent unprotected sex, frequent substance use; 62%). We achieved variability in the study sample through heterogeneity in recruitment of the index participants, whereas the individuals within the small social networks of close friends demonstrated substantial homogeneity across sociodemographic and risk profile characteristics. Social networks recruitment was feasible and yielded a sample of high risk youth willing to enroll in a randomized study to evaluate a novel sexual health intervention.

  4. Duration of Sleep and ADHD Tendency among Adolescents in China

    ERIC Educational Resources Information Center

    Lam, Lawrence T.; Yang, L.

    2008-01-01

    Objective: This study investigates the association between duration of sleep and ADHD tendency among adolescents. Method: This population-based health survey uses a two-stage random cluster sampling design. Participants ages 13 to 17 are recruited from the total population of adolescents attending high school in one city of China. Duration of…

  5. Students' Readiness for E-Learning Application in Higher Education

    ERIC Educational Resources Information Center

    Rasouli, Atousa; Rahbania, Zahra; Attaran, Mohammad

    2016-01-01

    The main goal of this research was to investigate the readiness of art students in applying e-learning. This study adopted a survey research design. From three public Iranian Universities (Alzahra, Tarbiat Modares, and Tehran), 347 students were selected by multistage cluster sampling and via Morgan Table. Their readiness for E-learning…

  6. Missed Opportunities to Keep Children Safe? National Survey of Injury Prevention Activities of Children's Centres

    ERIC Educational Resources Information Center

    Watson, Michael Craig; Mulvaney, Caroline; Timblin, Clare; Stewart, Jane; Coupland, Carol A.; Deave, Toity; Hayes, Mike; Kendrick, Denise

    2016-01-01

    Objective: To ascertain the activities undertaken by children's centres to prevent unintentional injuries in the under-fives and, in particular, the prevention of falls, poisoning and scalds. Design: A questionnaire was posted to managers of 851 children's centres, using stratified cluster sampling. The questionnaire included questions on injury…

  7. 75 FR 44937 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-30

    ... is a block cluster, which consists of one or more contiguous census blocks. The P sample is a sample of housing units and persons obtained independently from the census for a sample of block clusters. The E sample is a sample of census housing units and enumerations in the same block clusters as the...

  8. A sero-survey of rinderpest in nomadic pastoral systems in central and southern Somalia from 2002 to 2003, using a spatially integrated random sampling approach.

    PubMed

    Tempia, S; Salman, M D; Keefe, T; Morley, P; Freier, J E; DeMartini, J C; Wamwayi, H M; Njeumi, F; Soumaré, B; Abdi, A M

    2010-12-01

    A cross-sectional sero-survey, using a two-stage cluster sampling design, was conducted between 2002 and 2003 in ten administrative regions of central and southern Somalia, to estimate the seroprevalence and geographic distribution of rinderpest (RP) in the study area, as well as to identify potential risk factors for the observed seroprevalence distribution. The study was also used to test the feasibility of the spatially integrated investigation technique in nomadic and semi-nomadic pastoral systems. In the absence of a systematic list of livestock holdings, the primary sampling units were selected by generating random map coordinates. A total of 9,216 serum samples were collected from cattle aged 12 to 36 months at 562 sampling sites. Two apparent clusters of RP seroprevalence were detected. Four potential risk factors associated with the observed seroprevalence were identified: the mobility of cattle herds, the cattle population density, the proximity of cattle herds to cattle trade routes and cattle herd size. Risk maps were then generated to assist in designing more targeted surveillance strategies. The observed seroprevalence in these areas declined over time. In subsequent years, similar seroprevalence studies in neighbouring areas of Kenya and Ethiopia also showed a very low seroprevalence of RP or the absence of antibodies against RP. The progressive decline in RP antibody prevalence is consistent with virus extinction. Verification of freedom from RP infection in the Somali ecosystem is currently in progress.
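
Selecting primary sampling units by generating random map coordinates, as described above, amounts to uniform draws within a bounding box (points falling outside the study area would then be discarded, a step omitted here). The box below is hypothetical, not the actual Somali study region:

```python
import random

def random_sampling_sites(n, lat_range, lon_range, seed=0):
    """Generate n random map coordinates to serve as primary sampling units."""
    rng = random.Random(seed)
    return [(rng.uniform(*lat_range), rng.uniform(*lon_range)) for _ in range(n)]

# Hypothetical bounding box; 562 sites, matching the study's site count
sites = random_sampling_sites(562, (1.0, 5.0), (42.0, 48.0))
```

This substitutes for a sampling frame when no systematic list of livestock holdings exists, as in the nomadic systems surveyed.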

  9. Could the clinical interpretability of subgroups detected using clustering methods be improved by using a novel two-stage approach?

    PubMed

    Kent, Peter; Stochkendahl, Mette Jensen; Christensen, Henrik Wulff; Kongsted, Alice

    2015-01-01

    Recognition of homogeneous subgroups of patients can usefully improve prediction of their outcomes and the targeting of treatment. There are a number of research approaches that have been used to recognise homogeneity in such subgroups and to test their implications. One approach is to use statistical clustering techniques, such as Cluster Analysis or Latent Class Analysis, to detect latent relationships between patient characteristics. Influential patient characteristics can come from diverse domains of health, such as pain, activity limitation, physical impairment, social role participation, psychological factors, biomarkers and imaging. However, such 'whole person' research may result in data-driven subgroups that are complex, difficult to interpret and challenging to recognise clinically. This paper describes a novel approach to applying statistical clustering techniques that may improve the clinical interpretability of derived subgroups and reduce sample size requirements. This approach involves clustering in two sequential stages. The first stage involves clustering within health domains and therefore requires creating as many clustering models as there are health domains in the available data. This first stage produces scoring patterns within each domain. The second stage involves clustering using the scoring patterns from each health domain (from the first stage) to identify subgroups across all domains. We illustrate this using chest pain data from the baseline presentation of 580 patients. The new two-stage clustering resulted in two subgroups that approximated the classic textbook descriptions of musculoskeletal chest pain and atypical angina chest pain. The traditional single-stage clustering resulted in five clusters that were also clinically recognisable but displayed less distinct differences. In this paper, a new approach to using clustering techniques to identify clinically useful subgroups of patients is suggested. 
    Research designs, statistical methods and outcome metrics suitable for testing such subgroups are also described. This approach has potential benefits but requires broad testing, in multiple patient samples, to determine its clinical value. The usefulness of the approach is likely to be context-specific, depending on the characteristics of the available data and the research question being asked of it.
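
The second stage operates on the per-domain labels produced in stage one. As a deliberately simplified stand-in for the second clustering model, patients can be grouped by their tuple of stage-1 domain labels; the labels below are invented:

```python
from collections import Counter

def second_stage(domain_labels):
    """Stage 2 (simplified): group patients by their pattern of stage-1 labels.
    domain_labels[d][p] is patient p's cluster label within health domain d."""
    patterns = list(zip(*domain_labels))  # one label tuple per patient
    return patterns, Counter(patterns)

# Hypothetical stage-1 labels for 6 patients across 3 health domains
pain     = [0, 0, 1, 1, 0, 1]
activity = [0, 0, 1, 1, 0, 1]
psych    = [1, 1, 0, 0, 1, 0]
patterns, subgroup_counts = second_stage([pain, activity, psych])
```

In the real method the second stage runs another clustering model over these scoring patterns rather than exact pattern matching, but the two-level structure is the same.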

  10. Do Major Depressive Disorder and Dysthymic Disorder confer differential risk for suicide?

    PubMed Central

    Witte, Tracy K.; Timmons, Katherine A.; Fink, Erin; Smith, April R.; Joiner, Thomas E.

    2009-01-01

    Background Although there has been a tremendous amount of research examining the risk conferred for suicide by depression in general, relatively little research examines the risk conferred by specific forms of depressive illness (e.g., dysthymic disorder, single episode versus recurrent major depressive disorder [MDD]). The purpose of the current study was to examine differences in suicidal ideation, clinician-rated suicide risk, suicide attempts, and family history of suicide in a sample of outpatients diagnosed with various forms of depressive illness. Methods To accomplish this aim, we conducted a cluster analysis using the aforementioned suicide-related variables in a sample of 494 outpatients seen between January 2001 and July 2007 at the Florida State University Psychology Clinic. Patients were diagnosed using DSM-IV criteria. Results Two distinct clusters emerged that were indicative of lower and higher risk for suicide. After controlling for the number of comorbid Axis I and Axis II diagnoses, the only depressive illness that significantly predicted cluster membership was recurrent MDD, which tripled an individual’s likelihood of being assigned to the higher risk cluster. Limitations The use of a cross-sectional design; the relatively low suicide risk in our sample; the relatively small number of individuals with double depression. Conclusions Our results demonstrate the importance of both chronicity and severity of depression in terms of predicting increased suicide risk. Among the various forms of depressive illness examined, only recurrent MDD appeared to confer greater risk for suicide. PMID:18842304

  11. Manipulation of visible-light polarization with dendritic cell-cluster metasurfaces.

    PubMed

    Fang, Zhen-Hua; Chen, Huan; An, Di; Luo, Chun-Rong; Zhao, Xiao-Peng

    2018-06-26

    Cross-polarization conversion plays an important role in visible-light manipulation. A metasurface with an asymmetric structure can be used to achieve polarization conversion of linearly polarized light. Based on this, we design a quasi-periodic dendritic metasurface model composed of asymmetric dendritic cells. The simulation indicates that the asymmetric dendritic structure can vertically rotate the polarization direction of a linearly polarized wave in visible light. Silver dendritic cell-cluster metasurface samples were prepared by bottom-up electrochemical deposition. Experiments proved that they could realize cross-polarization conversion in visible light. Cross-polarized propagating light is deflected into anomalous refraction channels. The dendritic cell-cluster metasurface with an asymmetric quasi-periodic structure is significant for cross-polarization conversion research and has extensive practical application prospects and development potential.

  12. The South Pole Telescope: Unraveling the Mystery of Dark Energy

    NASA Astrophysics Data System (ADS)

    Reichardt, Christian L.; de Haan, Tijmen; Bleem, Lindsey E.

    2016-07-01

    The South Pole Telescope (SPT) is a 10-meter telescope designed to survey the millimeter-wave sky, taking advantage of the exceptional observing conditions at the Amundsen-Scott South Pole Station. The telescope and its ground-breaking 960-element bolometric camera finished surveying 2500 square degrees at 95, 150, and 220 GHz in November 2011. We have discovered hundreds of galaxy clusters in the SPT-SZ survey through the Sunyaev-Zel’dovich (SZ) effect. The formation of galaxy clusters, the largest bound objects in the universe, is highly sensitive to dark energy and the history of structure formation. I will discuss the cosmological constraints from the SPT-SZ galaxy cluster sample as well as future prospects with the soon-to-be-installed SPT-3G camera.

  13. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations are causing substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. 
Alternatively, we discuss how analysis can be repeated from saved sequencing images using the Long Template Protocol to increase accuracy. PMID:25860802

  14. Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review

    PubMed Central

    Morris, Tom; Gray, Laura

    2017-01-01

    Objectives To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRTs) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Setting Any, not limited to healthcare settings. Participants Any participants taking part in an SW-CRT published up to March 2016. Primary and secondary outcome measures The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Results Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR 0.22–0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; actual cluster sizes were between 29% and 480% of those assumed. Conclusions Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration, and methods appropriate to studies with unequal cluster sizes need to be employed. PMID:29146637
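The review's CV metric, and one standard approximation from the cluster-trial literature for how size variability inflates the design effect, can be sketched as follows (the inflation formula and all numbers are illustrative, not taken from this review):

```python
import statistics

def cluster_size_cv(sizes):
    """Coefficient of variation of cluster sizes: SD divided by the mean."""
    return statistics.stdev(sizes) / statistics.mean(sizes)

def design_effect_unequal(mean_size, cv, icc):
    """One common approximation for the design effect with unequal clusters:
    DE = 1 + ((cv**2 + 1) * m - 1) * icc, which reduces to the familiar
    1 + (m - 1) * icc when every cluster has the same size (cv = 0)."""
    return 1 + ((cv ** 2 + 1) * mean_size - 1) * icc

sizes = [120, 85, 240, 60, 150]   # hypothetical cluster sizes
cv = cluster_size_cv(sizes)
de = design_effect_unequal(statistics.mean(sizes), cv, icc=0.05)
```

Under this approximation, a median CV of 0.41 inflates the required sample size noticeably relative to the equal-cluster calculation, which is the review's central concern.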

  15. Unsupervised classification of multivariate geostatistical data: Two algorithms

    NASA Astrophysics Data System (ADS)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, describe a growing number of variables and cover wider and wider areas. It is therefore often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, in which the domain is divided into subdomains that are homogeneous with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on, e.g., Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to moderate sample sizes and a small number of variables. In this work, we propose two algorithms that adapt classical algorithms to multivariate geostatistical data. Both algorithms are model-free and can handle large volumes of multivariate, irregularly spaced data. The first proceeds by agglomerative hierarchical clustering. Spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinate space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performance of both algorithms is assessed on toy examples and a mining dataset.
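The first algorithm's idea, agglomerative merging restricted to spatially adjacent clusters, can be sketched with a k-nearest-neighbour graph in coordinate space. This is an illustrative reconstruction, not the authors' implementation; in particular, the simple attribute-difference linkage is a stand-in for their proximity condition:

```python
import numpy as np

def constrained_single_linkage(coords, values, n_clusters, k=4):
    """Agglomerative clustering constrained by a spatial k-nearest-neighbour
    graph: only clusters linked by an edge in coordinate space may merge."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(coords)
    d_sp = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    # spatial adjacency: each point connected to its k nearest neighbours
    adj = {(min(i, int(j)), max(i, int(j)))
           for i in range(n) for j in np.argsort(d_sp[i])[1:k + 1]}
    labels = np.arange(n)
    while len(set(labels.tolist())) > n_clusters:
        # among spatially adjacent pairs in different clusters,
        # merge the pair whose attribute values are most similar
        i, j = min(((a, b) for a, b in adj if labels[a] != labels[b]),
                   key=lambda e: abs(values[e[0]] - values[e[1]]))
        labels[labels == labels[j]] = labels[i]
    return labels

labels = constrained_single_linkage(
    [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]],  # two spatial groups
    [1.0, 1.2, 0.9, 5.0, 5.1, 4.9],                       # attribute values
    n_clusters=2, k=2)
```

Because merges can only cross graph edges, the resulting classes stay spatially coherent, which is exactly what unconstrained k-means or hierarchical clustering fails to guarantee.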

  16. Backbone conformations and side chain flexibility of two somatostatin mimics investigated by molecular dynamics simulations.

    PubMed

    Interlandi, Gianluca

    2009-05-15

    Molecular dynamics simulations with two designed somatostatin mimics, SOM230 and SMS 201-995, were performed in explicit water for a total aggregated time of 208 ns. Analysis of the runs with SOM230 revealed the presence of two clusters of conformations. Strikingly, the two sampled conformers correspond to the two main X-ray structures in the asymmetric unit of SMS 201-995. Structural comparison between the residues of SOM230 and SMS 201-995 provides an explanation for the high binding affinity of SOM230 to four of five somatostatin receptors. Similarly, cluster analysis of the simulations with SMS 201-995 shows that the backbone of the peptide interconverts between its two main crystallographic conformers. The conformations of SMS 201-995 sampled in the two clusters violated two different sets of NOE distance constraints in agreement with a previous NMR study. Differences in side chain fluctuations between SOM230 and SMS 201-995 observed in the simulations may contribute to the relatively higher binding affinity of SOM230 to most somatostatin receptors.

  17. Dimensional assessment of personality pathology in patients with eating disorders.

    PubMed

    Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L

    1999-02-22

    This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.

  18. A multimembership catalogue for 1876 open clusters using UCAC4 data

    NASA Astrophysics Data System (ADS)

    Sampedro, L.; Dias, W. S.; Alfaro, E. J.; Monteiro, H.; Molino, A.

    2017-10-01

    The main objective of this work is to determine the cluster members of 1876 open clusters, using positions and proper motions of the astrometric fourth United States Naval Observatory (USNO) CCD Astrograph Catalog (UCAC4). For this purpose, we apply three different methods, all based on a Bayesian approach, but with different formulations: a purely parametric method, another completely non-parametric algorithm and a third, recently developed by Sampedro & Alfaro, using both formulations at different steps of the whole process. The first and second statistical moments of the members' phase-space subspace, obtained after applying the three methods, are compared for every cluster. Although, on average, the three methods yield similar results, there are also specific differences between them, as well as for some particular clusters. The comparison with other published catalogues shows good agreement. We have also estimated, for the first time, the mean proper motion for a sample of 18 clusters. The results are organized in a single catalogue formed by two main files, one with the most relevant information for each cluster, partially including that in UCAC4, and the other showing the individual membership probabilities for each star in the cluster area. The final catalogue, with an interface design that enables an easy interaction with the user, is available in electronic format at the Stellar Systems Group (SSG-IAA) web site (http://ssg.iaa.es/en/content/sampedro-cluster-catalog).

  19. The Atacama Cosmology Telescope: Physical Properties and Purity of a Galaxy Cluster Sample Selected via the Sunyaev-Zel'dovich Effect

    NASA Technical Reports Server (NTRS)

    Menanteau, Felipe; Gonzalez, Jorge; Juin, Jean-Baptiste; Marriage, Tobias; Reese, Erik D.; Acquaviva, Viviana; Aguirre, Paula; Appel, John Willam; Baker, Andrew J.; Barrientos, L. Felipe; et al.

    2010-01-01

    We present optical and X-ray properties for the first confirmed galaxy cluster sample selected by the Sunyaev-Zel'dovich Effect from 148 GHz maps over 455 square degrees of sky made with the Atacama Cosmology Telescope. These maps, coupled with multi-band imaging on 4-meter-class optical telescopes, have yielded a sample of 23 galaxy clusters with redshifts between 0.118 and 1.066. Of these 23 clusters, 10 are newly discovered. The selection of this sample is approximately mass limited and essentially independent of redshift. We provide optical positions, images, redshifts and X-ray fluxes and luminosities for the full sample, and X-ray temperatures for an important subset. The mass limit of the full sample is around 8.0 x 10(exp 14) solar masses, with a number distribution that peaks around a redshift of 0.4. For the 10 highest-significance SZE-selected cluster candidates, all of which are optically confirmed, the mass threshold is 1 x 10(exp 15) solar masses and the redshift range is 0.167 to 1.066. Archival observations from Chandra, XMM-Newton, and ROSAT provide X-ray luminosities and temperatures that are broadly consistent with this mass threshold. Our optical follow-up procedure also allowed us to assess the purity of the ACT cluster sample. Eighty (one hundred) percent of the 148 GHz candidates with signal-to-noise ratios greater than 5.1 (5.7) are confirmed as massive clusters. The reported sample represents one of the largest SZE-selected samples of massive clusters over all redshifts within a cosmologically significant survey volume, which will enable cosmological studies as well as future studies on the evolution, morphology, and stellar populations in the most massive clusters in the Universe.

  20. Intraherd correlation coefficients and design effects for bovine viral diarrhoea, infectious bovine rhinotracheitis, leptospirosis and neosporosis in cow-calf system herds in North-eastern Mexico.

    PubMed

    Segura-Correa, J C; Domínguez-Díaz, D; Avalos-Ramírez, R; Argaez-Sosa, J

    2010-09-01

    Knowledge of the intraherd correlation coefficient (ICC) and design effect (D) for infectious diseases is of interest for sample size calculation and for providing correct standard errors of prevalence estimates in cluster or two-stage sampling surveys. Information on 813 animals from 48 non-vaccinated cow-calf herds from North-eastern Mexico was used. The ICCs for bovine viral diarrhoea (BVD), infectious bovine rhinotracheitis (IBR), leptospirosis and neosporosis were calculated using a Bayesian approach adjusting for the sensitivity and specificity of the diagnostic tests. The ICC and D values for BVD, IBR, leptospirosis and neosporosis were 0.31 and 5.91, 0.18 and 3.88, 0.22 and 4.53, and 0.11 and 2.68, respectively. Because the ICC values were greater than 0 and the D values greater than 1, larger sample sizes are required to obtain the same precision in prevalence estimates as under a simple random sampling design. Reporting ICC and D values is of great help in planning and designing two-stage sampling studies.
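For intuition, the classical one-way ANOVA estimator of the ICC and the usual design effect D = 1 + (m - 1) * ICC can be sketched as below; note the paper itself used a Bayesian estimator adjusting for diagnostic test sensitivity and specificity, which this sketch omits:

```python
import numpy as np

def anova_icc(groups):
    """One-way ANOVA (method-of-moments) estimator of the intraclass correlation."""
    k = len(groups)
    n_i = np.array([len(g) for g in groups], dtype=float)
    N = n_i.sum()
    grand = np.concatenate([np.asarray(g, float) for g in groups]).mean()
    means = np.array([np.mean(g) for g in groups])
    msb = np.sum(n_i * (means - grand) ** 2) / (k - 1)
    msw = sum(np.sum((np.asarray(g, float) - m) ** 2)
              for g, m in zip(groups, means)) / (N - k)
    n0 = (N - np.sum(n_i ** 2) / N) / (k - 1)   # imbalance-adjusted mean group size
    return (msb - msw) / (msb + (n0 - 1) * msw)

def design_effect(icc, m):
    """Variance inflation for cluster sampling with mean cluster size m."""
    return 1 + (m - 1) * icc
```

As a consistency check, with the study's average herd size 813/48 ≈ 16.9, `design_effect(0.31, 16.9)` gives roughly 5.9, close to the D of 5.91 reported for BVD.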

  1. College Students' Academic Motivation: Differences by Gender, Class, and Source of Payment

    ERIC Educational Resources Information Center

    Brouse, Corey H.; Basch, Charles E.; LeBlanc, Michael; McKnight, Kelly R.; Lei, Ting

    2010-01-01

    The purpose of this paper is to describe college students' (n = 856) gender, year in school and source of tuition funding in relation to their academic motivation. The design was cross-sectional and used cluster sampling. The Academic Motivation Scale was used to measure students' intrinsic and extrinsic motivations as well as amotivation. Three…

  2. Family Carers' Experiences Using Support Services in Europe: Empirical Evidence from the EUROFAMCARE Study

    ERIC Educational Resources Information Center

    Lamura, Giovanni; Mnich, Eva; Nolan, Mike; Wojszel, Beata; Krevers, Barbro; Mestheneos, Liz; Dohner, Hanneli

    2008-01-01

    Purpose: This article explores the experiences of family carers of older people in using support services in six European countries: Germany, Greece, Italy, Poland, Sweden, and the UK. Design and Methods: Following a common protocol, data were collected from national samples of approximately 1,000 family carers per country and clustered into…

  3. Effects of the "Positive Action" Program on Indicators of Positive Youth Development among Urban Youth

    ERIC Educational Resources Information Center

    Lewis, Kendra M.; Vuchinich, Samuel; Ji, Peter; DuBois, David L.; Acock, Alan; Bavarian, Niloofar; Day, Joseph; Silverthorn, Naida; Flay, Brian R.

    2016-01-01

    This study evaluated effects of "Positive Action," a school-based social-emotional and character development intervention, on indicators of positive youth development (PYD) among a sample of low-income, ethnic minority youth attending 14 urban schools. The study used a matched-pair, cluster-randomized controlled design at the school…

  4. Using Theory of Planned Behavior to Predict Healthy Eating among Danish Adolescents

    ERIC Educational Resources Information Center

    Gronhoj, Alice; Bech-Larsen, Tino; Chan, Kara; Tsang, Lennon

    2013-01-01

    Purpose: The purpose of the study was to apply the theory of planned behavior to predict Danish adolescents' behavioral intention for healthy eating. Design/methodology/approach: A cluster sample survey of 410 students aged 11 to 16 years studying in Grade 6 to Grade 10 was conducted in Denmark. Findings: Perceived behavioral control followed by…

  5. Patterns and Impact of Comorbidity and Multimorbidity among Community-Resident American Indian Elders

    ERIC Educational Resources Information Center

    John, Robert; Kerby, Dave S.; Hennessy, Catherine Hagan

    2003-01-01

    Purpose: The purpose of this study is to suggest a new approach to identifying patterns of comorbidity and multimorbidity. Design and Methods: A random sample of 1,039 rural community-resident American Indian elders aged 60 years and older was surveyed. Comorbidity was investigated with four standard approaches, and with cluster analysis. Results:…

  6. VizieR Online Data Catalog: Spectroscopy of luminous compact blue galaxies (Crawford+, 2016)

    NASA Astrophysics Data System (ADS)

    Crawford, S. M.; Wirth, G. D.; Bershady, M. A.; Randriamampandry, S. M.

    2017-10-01

    Deep imaging data in UBRIz and two narrow bands were obtained with the Mini-Mosaic camera on the WIYN 3.5 m telescope for all five clusters between 1999 October and 2004 June. We obtained spectroscopic observations for a sample of cluster star-forming galaxies with DEIMOS (Faber et al. 2003) on the Keck II Telescope during 2005 October and 2007 April. The narrow-band filters were specifically designed to detect [OII] λ3727 at the redshift of each cluster. All of the clusters have been the target of extensive observations with the HST, primarily using either WFPC2 or the Advanced Camera for Surveys (ACS). For all measurements, we have attempted to select data taken in a filter closest to the rest-frame B band. We have employed ACS imaging data whenever possible and substituted WFPC2 images only when required. For clusters observed in the far-infrared regime by the Spitzer Space Telescope, we extracted MIPS 24μm flux densities, S24, from images obtained through the Enhanced Imaging Products archive. (2 data files).

  7. Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nurgaliev, D.; McDonald, M.; Benson, B. A.

    We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters at 0.35 < z < 0.9 selected in the X-ray with the ROSAT PSPC 400 deg(2) survey, and a sample of 90 clusters at 0.25 < z < 1.2 selected via the Sunyaev–Zel'dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry (A_phot). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of z ~ 0.3 to z ~ 1, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.

  8. Testing for X-Ray–SZ Differences and Redshift Evolution in the X-Ray Morphology of Galaxy Clusters

    DOE PAGES

    Nurgaliev, D.; McDonald, M.; Benson, B. A.; ...

    2017-05-16

    We present a quantitative study of the X-ray morphology of galaxy clusters, as a function of their detection method and redshift. We analyze two separate samples of galaxy clusters: a sample of 36 clusters at 0.35 < z < 0.9 selected in the X-ray with the ROSAT PSPC 400 deg(2) survey, and a sample of 90 clusters at 0.25 < z < 1.2 selected via the Sunyaev–Zel'dovich (SZ) effect with the South Pole Telescope. Clusters from both samples have similar-quality Chandra observations, which allow us to quantify their X-ray morphologies via two distinct methods: centroid shifts (w) and photon asymmetry (A_phot). The latter technique provides nearly unbiased morphology estimates for clusters spanning a broad range of redshift and data quality. We further compare the X-ray morphologies of X-ray- and SZ-selected clusters with those of simulated clusters. We do not find a statistically significant difference in the measured X-ray morphology of X-ray and SZ-selected clusters over the redshift range probed by these samples, suggesting that the two are probing similar populations of clusters. We find that the X-ray morphologies of simulated clusters are statistically indistinguishable from those of X-ray- or SZ-selected clusters, implying that the most important physics for dictating the large-scale gas morphology (outside of the core) is well-approximated in these simulations. Finally, we find no statistically significant redshift evolution in the X-ray morphology (both for observed and simulated clusters), over the range of z ~ 0.3 to z ~ 1, seemingly in contradiction with the redshift-dependent halo merger rate predicted by simulations.

  9. A PRIOR EVALUATION OF TWO-STAGE CLUSTER SAMPLING FOR ACCURACY ASSESSMENT OF LARGE-AREA LAND-COVER MAPS

    EPA Science Inventory

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...
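The two-stage selection itself (first draw clusters, then elements within them) is straightforward to sketch; the frame of map segments and pixels below is hypothetical:

```python
import random

def two_stage_sample(frame, n_clusters, n_per_cluster, rng=None):
    """Stage 1: draw clusters (primary sampling units) without replacement;
    stage 2: draw elements without replacement within each selected cluster."""
    rng = rng or random.Random()
    selected = rng.sample(sorted(frame), n_clusters)
    return [(c, e)
            for c in selected
            for e in rng.sample(frame[c], min(n_per_cluster, len(frame[c])))]

# hypothetical frame: 5 map segments (clusters) with 10 reference pixels each
frame = {seg: list(range(10)) for seg in range(5)}
sample = two_stage_sample(frame, n_clusters=2, n_per_cluster=3)
```

Constraining reference pixels to a few segments cuts collection cost, but, as the abstract notes, spatially correlated classification error within a segment reduces the effective sample size.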

  10. Novel Signal Noise Reduction Method through Cluster Analysis, Applied to Photoplethysmography.

    PubMed

    Waugh, William; Allen, John; Wightman, James; Sims, Andrew J; Beale, Thomas A W

    2018-01-01

    Physiological signals can often become contaminated by noise from a variety of origins. In this paper, an algorithm is described for the reduction of sporadic noise from a continuous periodic signal. The design can be used where a sample of a periodic signal is required, for example, when an average pulse is needed for pulse wave analysis and characterization. The algorithm is based on cluster analysis for selecting similar repetitions or pulses from a periodic signal. This method selects individual pulses without noise, returns a clean pulse signal, and terminates when a sufficiently clean and representative signal is received. The algorithm is designed to be sufficiently compact to be implemented on a microcontroller embedded within a medical device. It has been validated through the removal of noise from an exemplar photoplethysmography (PPG) signal, showing increasing benefit as the noise contamination of the signal increases. The algorithm design is generalised to be applicable for a wide range of physiological (physical) signals.
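One way to realize "cluster analysis for selecting similar pulses" is to compare normalized pulses by correlation and keep the largest mutually similar group. This is a sketch of the idea, not the paper's algorithm; the tolerance and all names are illustrative:

```python
import math
import numpy as np

def clean_average_pulse(pulses, tol=0.9):
    """Group pulses by pairwise correlation, keep the largest set of
    mutually similar ones, and return their average plus member indices."""
    P = np.asarray(pulses, dtype=float)
    Z = P - P.mean(axis=1, keepdims=True)          # remove baseline offset
    Z /= np.linalg.norm(Z, axis=1, keepdims=True)  # unit length -> dot = correlation
    corr = Z @ Z.T
    counts = (corr >= tol).sum(axis=1)             # how many pulses each one matches
    seed = int(np.argmax(counts))                  # best-connected pulse seeds the cluster
    members = np.where(corr[seed] >= tol)[0]
    return P[members].mean(axis=0), members

# three clean (rescaled/offset) copies of a pulse plus one dissimilar segment
base = [math.sin(2 * math.pi * i / 49) for i in range(50)]
odd = [math.cos(6 * math.pi * i / 49) for i in range(50)]
avg, members = clean_average_pulse(
    [base, [1.1 * v + 0.05 for v in base], [0.9 * v for v in base], odd])
```

The noisy segment falls outside the correlated group, so the averaged pulse is built only from clean repetitions, mirroring the paper's goal of a representative average pulse.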

  11. Optimising cluster survey design for planning schistosomiasis preventive chemotherapy

    PubMed Central

    Sturrock, Hugh J. W.; Turner, Hugo; Whitton, Jane M.; Gower, Charlotte M.; Jemu, Samuel; Phillips, Anna E.; Meite, Aboulaye; Thomas, Brent; Kollie, Karsor; Thomas, Catherine; Rebollo, Maria P.; Styles, Ben; Clements, Michelle; Fenwick, Alan; Harrison, Wendy E.; Fleming, Fiona M.

    2017-01-01

    Background The cornerstone of current schistosomiasis control programmes is delivery of praziquantel to at-risk populations. Such preventive chemotherapy requires accurate information on the geographic distribution of infection, yet the performance of alternative survey designs for estimating prevalence and converting this into treatment decisions has not been thoroughly evaluated. Methodology/Principal findings We used baseline schistosomiasis mapping surveys from three countries (Malawi, Côte d'Ivoire and Liberia) to generate spatially realistic gold standard datasets, against which we tested alternative two-stage cluster survey designs. We assessed how sampling different numbers of schools per district (2–20) and children per school (10–50) influences the accuracy of prevalence estimates and treatment class assignment, and we compared survey cost-efficiency using data from Malawi. Due to the focal nature of schistosomiasis, up to 53% of simulated surveys involving 2–5 schools per district failed to detect schistosomiasis in low endemicity areas (1–10% prevalence). Increasing the number of schools surveyed per district improved treatment class assignment far more than increasing the number of children sampled per school. For Malawi, surveys of 15 schools per district and 20–30 children per school reliably detected endemic schistosomiasis and maximised cost-efficiency. In sensitivity analyses where treatment costs and the country considered were varied, optimal survey size was remarkably consistent, with cost-efficiency maximised at 15–20 schools per district. Conclusions/Significance Among two-stage cluster surveys for schistosomiasis, our simulations indicated that surveying 15–20 schools per district and 20–30 children per school optimised cost-efficiency and minimised the risk of under-treatment, with surveys involving more schools becoming more cost-efficient as treatment costs rose. PMID:28552961
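The simulation logic described (sample schools, then children, and check whether focal low-prevalence districts are missed) can be sketched as follows; the district composition and all numbers are hypothetical, not the paper's data:

```python
import random

def simulate_survey(school_prevs, n_schools, n_children, rng):
    """Two-stage survey: sample schools, then children per school;
    return the estimated district prevalence."""
    positives = 0
    for p in rng.sample(school_prevs, n_schools):
        positives += sum(rng.random() < p for _ in range(n_children))
    return positives / (n_schools * n_children)

rng = random.Random(1)
# hypothetical focal district: 16 uninfected schools and 4 hotspot schools
district = [0.0] * 16 + [0.3] * 4
small = [simulate_survey(district, 3, 30, rng) for _ in range(500)]
large = [simulate_survey(district, 15, 30, rng) for _ in range(500)]
miss_small = sum(est == 0 for est in small) / 500  # surveys that miss infection entirely
miss_large = sum(est == 0 for est in large) / 500
```

With only 3 schools sampled, a large share of surveys draw no hotspot school at all and report zero prevalence, which illustrates why adding schools helps far more than adding children per school.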

  12. The Tehran Eye Study: research design and eye examination protocol

    PubMed Central

    Hashemi, Hassan; Fotouhi, Akbar; Mohammad, Kazem

    2003-01-01

    Background Visual impairment has a profound impact on society. The majority of visually impaired people live in developing countries, and since most disorders leading to visual impairment are preventable or curable, their control is a priority in these countries. Considering the complicated epidemiology of visual impairment and the wide variety of factors involved, region-specific intervention strategies are required for every community. Therefore, providing appropriate data is one of the first steps in these communities, as it is in Iran. The objectives of this study are to describe the prevalence and causes of visual impairment in the population of Tehran city; the prevalence of refractive errors, lens opacity, ocular hypertension, and color blindness in this population; and also the familial aggregation of refractive errors, lens opacity, ocular hypertension, and color blindness within the study sample. Methods Design Through a population-based, cross-sectional study, a total of 5300 Tehran citizens will be selected from 160 clusters using a stratified cluster random sampling strategy. The eligible people will be enumerated through a door-to-door household survey in the selected clusters and will be invited. All participants will be transferred to a clinic for measurements of uncorrected, best corrected and presenting visual acuity; manifest, subjective and cycloplegic refraction; color vision test; Goldmann applanation tonometry; examination of the external eye, anterior segment, media, and fundus; and an interview about demographic characteristics and history of eye diseases, eye trauma, diabetes mellitus, high blood pressure, and ophthalmologic care. The study design and eye examination protocol are described. Conclusion We expect that findings from the TES will show the status of visual problems and their causes in the community. This study can highlight the people who should be targeted by visual impairment prevention programs. PMID:12859794
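In stratified cluster designs like this, the first-stage allocation of clusters to strata is commonly made proportional to stratum population. A sketch using largest-remainder rounding; the strata and figures are invented for illustration and are not the TES allocation:

```python
import math

def allocate_clusters(stratum_pops, total_clusters):
    """Allocate survey clusters to strata proportionally to population,
    using largest-remainder rounding so the allocations sum exactly."""
    total_pop = sum(stratum_pops.values())
    raw = {s: total_clusters * p / total_pop for s, p in stratum_pops.items()}
    alloc = {s: math.floor(r) for s, r in raw.items()}
    leftover = total_clusters - sum(alloc.values())
    # hand remaining clusters to the strata with the largest fractional parts
    for s in sorted(raw, key=lambda s: raw[s] - alloc[s], reverse=True)[:leftover]:
        alloc[s] += 1
    return alloc
```

Within each stratum, clusters are then drawn at random and households enumerated door to door, as the protocol describes.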

  13. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples. Actually, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotic normal distribution of the considered statistic. Resting upon this fact, the standard score of this edge count is computed, and the partition quality is represented by the worst cluster, corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments presented in the paper demonstrate the ability of the approach to detect the true number of clusters.
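The Friedman and Rafsky cross-edge count named in the abstract can be sketched directly: build the Euclidean minimum spanning tree of the pooled samples and count edges joining points from different samples. A minimal NumPy sketch using Prim's algorithm (function names are illustrative):

```python
import numpy as np

def mst_edges(points):
    """Euclidean minimum spanning tree via Prim's algorithm (simple O(n^3) sketch)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        # cheapest edge leaving the current tree
        i, j = min(((a, b) for a in in_tree for b in range(n) if b not in in_tree),
                   key=lambda e: d[e[0], e[1]])
        edges.append((i, j))
        in_tree.add(j)
    return edges

def cross_count(sample_a, sample_b):
    """Friedman-Rafsky statistic: number of MST edges joining the two samples."""
    pts = np.vstack([np.asarray(sample_a, float), np.asarray(sample_b, float)])
    na = len(sample_a)
    return sum((i < na) != (j < na) for i, j in mst_edges(pts))
```

Well-mingled samples yield many cross edges; well-separated samples yield few, which is the signal the stability approach standardizes and compares across candidate cluster counts.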

  14. 2-Way k-Means as a Model for Microbiome Samples.

    PubMed

    Jackson, Weston J; Agarwal, Ipsita; Pe'er, Itsik

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project.
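The 2-way assignment idea can be sketched as a generalized k-means whose candidate prototypes include 50/50 mixtures of centroid pairs, so an in-between sample can match a pair rather than a single cluster. This is an illustrative reconstruction, not the authors' estimator; the centroid update here is a simple heuristic:

```python
import itertools
import numpy as np

def two_way_kmeans(X, k, iters=50, seed=0):
    """k-means variant in which a point may be assigned either to one
    centroid or to the midpoint (50/50 mixture) of a pair of centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    # candidate assignments: singletons plus all centroid pairs
    combos = [(i,) for i in range(k)] + list(itertools.combinations(range(k), 2))
    for _ in range(iters):
        protos = np.array([C[list(c)].mean(axis=0) for c in combos])
        assign = np.argmin(((X[:, None] - protos[None]) ** 2).sum(-1), axis=1)
        for i in range(k):
            mask = np.array([i in combos[a] for a in assign])
            if mask.any():
                C[i] = X[mask].mean(axis=0)  # refit from points involving centroid i
    return C, [combos[a] for a in assign]
```

Samples assigned to a pair `(i, j)` are exactly the "in between" compositions that ordinary k-means forces into one cluster or the other.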

  15. 2-Way k-Means as a Model for Microbiome Samples

    PubMed Central

    2017-01-01

    Motivation. Microbiome sequencing allows defining clusters of samples with shared composition. However, this paradigm poorly accounts for samples whose composition is a mixture of cluster-characterizing ones and which therefore lie in between them in the cluster space. This paper addresses unsupervised learning of 2-way clusters. It defines a mixture model that allows 2-way cluster assignment and describes a variant of generalized k-means for learning such a model. We demonstrate applicability to microbial 16S rDNA sequencing data from the Human Vaginal Microbiome Project. PMID:29177026

  16. Evaluation of the procedure 1A component of the 1980 US/Canada wheat and barley exploratory experiment

    NASA Technical Reports Server (NTRS)

    Chapman, G. M. (Principal Investigator); Carnes, J. G.

    1981-01-01

    Several techniques that use clusters generated by a new clustering algorithm, CLASSY, are proposed as alternatives to random sampling to obtain greater precision in crop proportion estimation: (1) Proportional Allocation/Relative Count Estimator (PA/RCE), which allocates dots to clusters proportionally to cluster size and uses a relative-count cluster-level estimate; (2) Proportional Allocation/Bayes Estimator (PA/BE), which allocates dots to clusters proportionally and uses a Bayesian cluster-level estimate; and (3) Bayes Sequential Allocation/Bayes Estimator (BSA/BE), which allocates dots to clusters sequentially and uses a Bayesian cluster-level estimate. Clustering is an effective method for making proportion estimates. It is estimated that, to obtain the same precision with random sampling as obtained by proportional sampling of 50 dots with an unbiased estimator, samples of 85 or 166 would need to be taken if dot sets with AI labels (integrated procedure) or ground truth labels, respectively, were input. Dot reallocation provides dot sets that are unbiased. It is recommended that these proportion estimation techniques be maintained, particularly the PA/BE, because it provides the greatest precision.
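One plain reading of the PA/RCE estimator is: weight each cluster's labeled crop fraction by that cluster's share of the total pixels. A hedged sketch (the function name and the figures are illustrative):

```python
def pa_rce(cluster_sizes, labeled_counts, labeled_positive):
    """Proportional-allocation relative-count estimate: weight each cluster's
    labeled crop fraction by the cluster's share of the total pixels."""
    total = sum(cluster_sizes)
    return sum((size / total) * (pos / n)
               for size, n, pos in zip(cluster_sizes, labeled_counts, labeled_positive))
```

Because dots are allocated to clusters in proportion to cluster size, each within-cluster fraction is estimated with effort matched to its weight in the final estimate, which is the source of the precision gain over simple random sampling.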

  17. X-Ray Morphological Analysis of the Planck ESZ Clusters

    NASA Astrophysics Data System (ADS)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Ettori, Stefano; Andrade-Santos, Felipe; Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W.; Randall, Scott; Kraft, Ralph

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are well suited to studying how clusters form and grow and to testing physical models, may complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev-Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev-Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best for distinguishing between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found no dependence of the cluster dynamical state on mass. By comparing our results with those obtained for the REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.
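Of the morphological parameters mentioned, centroid shift has a common operational definition: the scatter of the offset between the X-ray peak and the flux-weighted centroid measured in a series of growing apertures. A simplified sketch (aperture choices and normalization vary between papers; this is not the authors' pipeline):

```python
import numpy as np

def centroid_shift(image, peak, apertures):
    """w: standard deviation of the offset between the X-ray peak and the
    flux-weighted centroid over a series of circular apertures, normalised
    by the largest aperture radius."""
    ny, nx = image.shape
    yy, xx = np.mgrid[:ny, :nx]
    r = np.hypot(xx - peak[0], yy - peak[1])
    offsets = []
    for R in apertures:
        m = r <= R
        w = image[m]
        cx = (xx[m] * w).sum() / w.sum()
        cy = (yy[m] * w).sum() / w.sum()
        offsets.append(np.hypot(cx - peak[0], cy - peak[1]))
    return np.std(offsets) / max(apertures)

# toy images: a symmetric blob (centroid sits on the peak) and an offset blob
yy, xx = np.mgrid[:21, :21]
symmetric = np.exp(-((xx - 10.0) ** 2 + (yy - 10.0) ** 2) / 20.0)
shifted = np.exp(-((xx - 12.0) ** 2 + (yy - 10.0) ** 2) / 20.0)
```

A relaxed, symmetric cluster gives w near zero; a disturbed system, whose centroid wanders as the aperture grows, gives a larger value.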

  18. X-Ray Morphological Analysis of the Planck ESZ Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine

    2017-09-01

    X-ray observations show that galaxy clusters span a very large range of morphologies. The most disturbed systems, which are well suited to studying how clusters form and grow and to testing physical models, may complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, best distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow the most relaxed or most disturbed objects to be selected from a sample. We found no dependence of the cluster dynamical state on mass. By comparing our results with those obtained for the REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  19. Spectroscopic Confirmation of Two Massive Red-sequence-selected Galaxy Clusters at Z Approximately Equal to 1.2 in the Sparcs-North Cluster Survey

    NASA Technical Reports Server (NTRS)

    Muzzin, Adam; Wilson, Gillian; Yee, H.K.C.; Hoekstra, Henk; Gilbank, David; Surace, Jason; Lacy, Mark; Blindert, Kris; Majumdar, Subhabrata; Demarco, Ricardo; et al.

    2008-01-01

    The Spitzer Adaptation of the Red-sequence Cluster Survey (SpARCS) is a deep z-band imaging survey covering the Spitzer SWIRE Legacy fields, designed to create the first large homogeneously selected sample of massive clusters at z > 1 using an infrared adaptation of the cluster red-sequence method. We present an overview of the northern component of the survey, which has been observed with CFHT/MegaCam and covers 28.3 deg². The southern component of the survey was observed with CTIO/MOSAICII, covers 13.6 deg², and is summarized in a companion paper by Wilson et al. (2008). We also present spectroscopic confirmation of two rich cluster candidates at z ≈ 1.2. Based on Nod-and-Shuffle spectroscopy from GMOS-N on Gemini, there are 17 and 28 confirmed cluster members in SpARCS J163435+402151 and SpARCS J163852+403843, which have spectroscopic redshifts of 1.1798 and 1.1963, respectively. The clusters have velocity dispersions of 490 +/- 140 km/s and 650 +/- 160 km/s, respectively, which imply masses (M_200) of (1.0 +/- 0.9) x 10^14 and (2.4 +/- 1.8) x 10^14 solar masses. Confirmation of these candidates as bona fide massive clusters demonstrates that two-filter imaging is an effective, yet observationally efficient, method for selecting clusters at z > 1.

  20. Sampling procedures for inventory of commercial volume tree species in Amazon Forest.

    PubMed

    Netto, Sylvio P; Pelissari, Allan L; Cysneiros, Vinicius C; Bonazza, Marcelo; Sanquetta, Carlos R

    2017-01-01

    The spatial distribution of tropical tree species can affect the consistency of the estimators in commercial forest inventories; therefore, appropriate sampling procedures are required to survey species with different spatial patterns in the Amazon Forest. The present study evaluates conventional sampling procedures and introduces adaptive cluster sampling for volumetric inventories of Amazonian tree species, under the hypotheses that density, spatial distribution, and zero-plots affect the consistency of the estimators, and that adaptive cluster sampling yields more accurate volume estimates. We use data from a census carried out in Jamari National Forest, Brazil, where trees with diameters of 40 cm or more were measured in 1,355 plots. Species with different spatial patterns were selected and sampled with simple random sampling, systematic sampling, linear cluster sampling, and adaptive cluster sampling, after which the accuracy of the volume estimates and the presence of zero-plots were evaluated. The sampling procedures were affected by the low density of trees and the large number of zero-plots, whereas adaptive clusters concentrated the sampling effort in plots containing trees and thus aggregated more representative samples for estimating commercial volume.
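A minimal sketch of the adaptive cluster sampling idea described above: starting from a simple random sample of plots, any plot meeting the condition (here, containing at least one commercial-volume tree) triggers sampling of its four neighbours, and the expansion repeats until no newly added plot qualifies. The grid, tree counts, and initial sample size are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy plot grid: ~5% of plots contain 1-4 commercial-volume trees, rest are zero-plots.
grid = (rng.random((20, 20)) < 0.05).astype(int) * rng.integers(1, 5, (20, 20))

def adaptive_cluster_sample(grid, n_initial, rng):
    nrow, ncol = grid.shape
    flat = rng.choice(nrow * ncol, size=n_initial, replace=False)  # initial SRS of plots
    frontier = [(i // ncol, i % ncol) for i in flat]
    sampled = set()
    while frontier:
        i, j = frontier.pop()
        if (i, j) in sampled or not (0 <= i < nrow and 0 <= j < ncol):
            continue
        sampled.add((i, j))
        if grid[i, j] > 0:  # condition met: adaptively add the 4-neighbourhood
            frontier += [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return sampled

plots = adaptive_cluster_sample(grid, n_initial=15, rng=rng)
print(len(plots), sum(int(grid[p]) for p in plots))
```

The final sample is the initial plots plus the networks grown around any that met the condition, which is how the procedure concentrates effort where the rare species actually occurs.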

  1. Adaptive Cluster Sampling for Forest Inventories

    Treesearch

    Francis A. Roesch

    1993-01-01

    Adaptive cluster sampling is shown to be a viable alternative for sampling forests when rare characteristics of interest occur on clustered trees. The ideas of recent work in Thompson (1990) have been extended to the case in which the initial sample is selected with unequal probabilities. An example is given in which the...

  2. The correlation between temperature and humidity with the population density of Aedes aegypti as dengue fever’s vector

    NASA Astrophysics Data System (ADS)

    Sintorini, M. M.

    2018-01-01

    Weather changes in South East Asia have triggered an increase in dengue fever in Indonesia, and Jakarta has been declared a dengue-endemic region. This research aims to characterize the dynamics of dengue fever incidence in relation to temperature, humidity, and the population density of Aedes aegypti, using an ecological study design. Samples were collected from April 2015 to March 2016 from houses located in the suburbs of Pasar Minggu, Ciracas, Sunter Agung, Palmerah, and Bendungan Hilir. Sampling followed a cluster sampling design, with 153 samples per suburb. The research shows a correlation of temperature (p < 0.001) and humidity (p < 0.001) with the population density of Aedes aegypti, the dengue fever vector. Therefore, an early warning system based on environmental factors should be developed to anticipate the spread of dengue fever.

  3. Sample size adjustments for varying cluster sizes in cluster randomized trials with binary outcomes analyzed with second-order PQL mixed logistic regression.

    PubMed

    Candel, Math J J M; Van Breukelen, Gerard J P

    2010-06-30

    Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
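The paper's adjustment is specific to second-order PQL, but the size of the efficiency loss from varying cluster sizes can be gauged with a commonly used general approximation for the design effect, DEFF = 1 + ((1 + CV²)·m̄ − 1)·ρ, where m̄ is the mean cluster size, CV the coefficient of variation of cluster sizes, and ρ the intracluster correlation. This is a hedged stand-in, not the paper's formula; the parameter values below are invented.

```python
# Design effect for a cluster design with mean cluster size mbar,
# coefficient of variation cv of the cluster sizes, and intracluster
# correlation rho. cv=0 recovers the familiar equal-size formula
# 1 + (mbar - 1) * rho.
def design_effect(mbar, rho, cv=0.0):
    return 1 + ((1 + cv ** 2) * mbar - 1) * rho

mbar, rho, cv = 20, 0.05, 0.6            # illustrative planning values
deff_equal = design_effect(mbar, rho)
deff_unequal = design_effect(mbar, rho, cv)
extra_clusters = deff_unequal / deff_equal - 1  # relative increase in clusters needed
print(f"{100 * extra_clusters:.1f}% more clusters")
```

The ratio of the two design effects approximates how many extra clusters compensate for the size variation; for moderate CV and ICC values it lands in the low tens of per cent, consistent in spirit with the paper's observation that around 14 per cent more clusters often suffices.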

  4. Searching for the 3.5 keV Line in the Stacked Suzaku Observations of Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Bulbul, Esra; Markevitch, Maxim; Foster, Adam; Miller, Eric; Bautz, Mark; Lowenstein, Mike; Randall, Scott W.; Smith, Randall K.

    2016-01-01

    We perform a detailed study of the stacked Suzaku observations of 47 galaxy clusters, spanning a redshift range of 0.01-0.45, to search for the unidentified 3.5 keV line. This sample provides an independent test for the previously detected line. We detect a 2σ-significant spectral feature at 3.5 keV in the spectrum of the full sample. When the sample is divided into two subsamples (cool-core and non-cool-core clusters), the cool-core subsample shows no statistically significant positive residuals at the line energy. A very weak (≈2σ confidence) spectral feature at 3.5 keV is permitted by the data from the non-cool-core cluster sample. The upper limit on the neutrino decay mixing angle of sin²(2θ) = 6.1 × 10⁻¹¹ from the full Suzaku sample is consistent with the previous detections in the stacked XMM-Newton sample of galaxy clusters (which had a higher statistical sensitivity to faint lines), M31, and the Galactic center, at a 90% confidence level. However, the constraint from the present sample, which does not include the Perseus cluster, is in tension with the line flux previously reported in the core of the Perseus cluster with XMM-Newton and Suzaku.

  5. Social network recruitment for Yo Puedo - an innovative sexual health intervention in an underserved urban neighborhood: sample and design implications

    PubMed Central

    Minnis, Alexandra M.; vanDommelen-Gonzalez, Evan; Luecke, Ellen; Cheng, Helen; Dow, William; Bautista-Arredondo, Sergio; Padian, Nancy S.

    2016-01-01

    Most existing evidence-based sexual health interventions focus on individual-level behavior, even though there is substantial evidence that highlights the influential role of social environments in shaping adolescents’ behaviors and reproductive health outcomes. We developed Yo Puedo, a combined conditional cash transfer (CCT) and life skills intervention for youth to promote educational attainment, job training, and reproductive health wellness, which we then evaluated for feasibility among 162 youth aged 16–21 years in a predominantly Latino community in San Francisco, CA. The intervention targeted youth’s social networks and involved recruitment and randomization of small social network clusters. In this paper we describe the design of the feasibility study and report participants’ baseline characteristics. Furthermore, we examined the sample and design implications of recruiting social network clusters as the unit of randomization. Baseline data provide evidence that we successfully enrolled high risk youth using a social network recruitment approach in community and school-based settings. Nearly all participants (95%) were at high risk for adverse educational and reproductive health outcomes based on multiple measures of low socioeconomic status (81%) and/or reported high risk behaviors (e.g., gang affiliation, past pregnancy, recent unprotected sex, frequent substance use) (62%). We achieved variability in the study sample through heterogeneity in recruitment of the index participants, whereas the individuals within the small social networks of close friends demonstrated substantial homogeneity across sociodemographic and risk profile characteristics. Social network recruitment was feasible and yielded a sample of high risk youth willing to enroll in a randomized study to evaluate a novel sexual health intervention. PMID:25358834

  6. CA II TRIPLET SPECTROSCOPY OF SMALL MAGELLANIC CLOUD RED GIANTS. III. ABUNDANCES AND VELOCITIES FOR A SAMPLE OF 14 CLUSTERS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parisi, M. C.; Clariá, J. J.; Marcionni, N.

    2015-05-15

    We obtained spectra of red giants in 15 Small Magellanic Cloud (SMC) clusters in the region of the Ca II lines with FORS2 on the Very Large Telescope. We determined the mean metallicity and radial velocity, with mean errors of 0.05 dex and 2.6 km s⁻¹, respectively, from a mean of 6.5 members per cluster. One cluster (B113) was too young for a reliable metallicity determination and was excluded from the sample. We combined the sample studied here with 15 clusters previously studied by us using the same technique, and with 7 clusters whose metallicities, determined by other authors, are on a scale similar to ours. This compilation of 36 clusters is the largest SMC cluster sample currently available with accurate and homogeneously determined metallicities. We found a high probability that the metallicity distribution is bimodal, with potential peaks at −1.1 and −0.8 dex. Our data show no strong evidence of a metallicity gradient in the SMC clusters, somewhat at odds with recent evidence from Ca II triplet spectra of a large sample of field stars. This may reveal possible differences in the chemical history of clusters and field stars. Our clusters show a significant dispersion of metallicities at every age, which could reflect the lack of a unique age-metallicity relation in this galaxy. None of the chemical evolution models currently available in the literature satisfactorily represents the global chemical enrichment processes of SMC clusters.

  7. The Hubble Space Telescope Medium Deep Survey Cluster Sample: Methodology and Data

    NASA Astrophysics Data System (ADS)

    Ostrander, E. J.; Nichol, R. C.; Ratnatunga, K. U.; Griffiths, R. E.

    1998-12-01

    We present a new, objectively selected, sample of galaxy overdensities detected in the Hubble Space Telescope Medium Deep Survey (MDS). These clusters/groups were found using an automated procedure that searched for statistically significant galaxy overdensities. The contrast of the clusters against the field galaxy population is increased when morphological data are used to search around bulge-dominated galaxies. In total, we present 92 overdensities above a probability threshold of 99.5%. We show, via extensive Monte Carlo simulations, that at least 60% of these overdensities are likely to be real clusters and groups and not random line-of-sight superpositions of galaxies. For each overdensity in the MDS cluster sample, we provide a richness and the average bulge-to-total ratio of galaxies within the system. This MDS cluster sample potentially contains some of the most distant clusters/groups ever detected, with about 25% of the overdensities having estimated redshifts z > ~0.9. We have made this sample publicly available to facilitate spectroscopic confirmation of these clusters and to support more detailed studies of cluster and galaxy evolution. We also report the serendipitous discovery of a new cluster close on the sky to the rich optical cluster Cl 0016+16 at z = 0.546. This new overdensity, HST 001831+16208, may be coincident with both an X-ray source and a radio source. HST 001831+16208 is the third cluster/group discovered near Cl 0016+16 and appears to strengthen the claims of Connolly et al. of superclustering at high redshift.

  8. Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

    NASA Astrophysics Data System (ADS)

    Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

    2018-05-01

    We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples: a lower redshift sample of ~30 clusters, ranging from z ~ 0.2-0.6, observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes, and a z ~ 1 sample of ~10 clusters observed with 8-m class telescopes. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σz ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ~5 per cent of clusters show some (minor) contamination from secondary systems with the same red sequence intruding into the measurement aperture of the original cluster. At z ~ 1, the rate rises to ~20 per cent. Approximately 10 per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ~ 0.05.

  9. Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review.

    PubMed

    Kristunas, Caroline; Morris, Tom; Gray, Laura

    2017-11-15

    To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRTs) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Eligible trials were any SW-CRTs published up to March 2016, in any setting (not limited to healthcare) and with any type of participant. The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes, and whether the methods of sample size calculation and analysis accounted for any variability in cluster sizes. Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR 0.22-0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the actual cluster sizes ranged between 29% and 480% of those assumed. Cluster sizes often vary in SW-CRTs, and reporting of SW-CRTs remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration, and methods appropriate to studies with unequal cluster sizes need to be employed. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
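The review's primary outcome, the coefficient of variation of cluster sizes, is simply the standard deviation of the cluster sizes divided by their mean; a sketch with invented cluster sizes:

```python
import statistics

# Hypothetical cluster sizes from one SW-CRT (invented for illustration).
cluster_sizes = [120, 85, 240, 60, 150, 95, 310, 70]

mean = statistics.mean(cluster_sizes)
cv = statistics.stdev(cluster_sizes) / mean  # sample SD; population SD is also seen in practice
print(round(cv, 2))
```

A CV near the review's median of 0.41 already indicates substantial size imbalance; the toy data above are more variable still.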

  10. Application of lot quality assurance sampling for leprosy elimination monitoring--examination of some critical factors.

    PubMed

    Gupte, M D; Murthy, B N; Mahmood, K; Meeralakshmi, S; Nagaraju, B; Prabhakaran, R

    2004-04-01

    The concept of elimination of an infectious disease differs from eradication and, in a way, from control as well. In disease elimination programmes, the desired reduced level of prevalence is set as the target to be achieved within a practical time frame. Elimination can be considered at national or regional levels. Prevalence levels depend on the occurrence of new cases and thus can fluctuate. There are no ready pragmatic methods to monitor the progress of leprosy elimination programmes, so we explored newer methods to meet these demands. With the lowering of prevalence of leprosy to the desired level of 1 case per 10000 population at the global level, programme administrators' concern will shift to smaller areas, e.g. national and sub-national levels. For monitoring this situation, we earlier observed that lot quality assurance sampling (LQAS), a quality-control tool from industry, was useful in initially high-endemic areas. However, critical factors such as the geographical distribution of cases and the adoption of a cluster sampling design instead of simple random sampling deserve attention before LQAS can generally be recommended. The present exercise aimed to validate the applicability of LQAS, with these modifications, for monitoring leprosy elimination in Tamil Nadu state, India, which was highly endemic for leprosy. A representative sample of 64000 people drawn from eight districts of Tamil Nadu, with a maximum allowable number of 25 cases, was considered, using LQAS methodology to test whether leprosy prevalence was at or below 7 per 10000 population. The expected number of cases for each district was obtained assuming a Poisson distribution. Goodness of fit between the observed and expected cases was tested with a χ² test. The enhancing factor (design effect) for the sample size was obtained by computing the intraclass correlation.
The survey actually covered a population of 62157 individuals, of whom 56469 (90.8%) were examined. Ninety-six cases were detected, far exceeding the critical value of 25. The number of cases for each district and in the entire surveyed area both followed a Poisson distribution. The intraclass correlation coefficients were close to zero and the design effect was close to one. Based on the LQAS exercises, leprosy prevalence in the state of Tamil Nadu, India, was above 7 per 10000. The LQAS method using clusters was validated for monitoring leprosy elimination in high-endemic areas. The use of cluster sampling makes this method further useful as a rapid assessment procedure. The method needs to be tested for its applicability in moderate- and low-endemic areas, where the sample size may need to be increased. It is also possible to consider LQAS as a monitoring tool for elimination programmes targeting other disease conditions.
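The decision rule underlying such an LQAS survey can be sketched with the binomial distribution (reasonable here, since the study found a design effect close to one): examine n people and classify the area as above the elimination threshold if more than d cases are found. The figures below reuse the survey's own numbers (n = 64000, critical value d = 25, threshold 7 per 10000, elimination level 1 per 10000); the interpretation of these as the two operating points of the rule is our illustration, not the paper's calculation.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(d + 1))

n, d = 64000, 25                      # sample size and critical value from the survey
p_low, p_high = 1 / 10000, 7 / 10000  # elimination level vs. unacceptable prevalence

alpha = 1 - binom_cdf(d, n, p_low)    # P(classify "high" | prevalence at elimination level)
beta = binom_cdf(d, n, p_high)        # P(classify "low"  | prevalence at 7 per 10000)
print(round(alpha, 4), round(beta, 4))
```

With 96 observed cases against a critical value of 25, the classification "above 7 per 10000" follows directly from this rule, and both error probabilities are small at this sample size.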

  11. Differentiating Botulinum Neurotoxin-Producing Clostridia with a Simple, Multiplex PCR Assay.

    PubMed

    Williamson, Charles H D; Vazquez, Adam J; Hill, Karen; Smith, Theresa J; Nottingham, Roxanne; Stone, Nathan E; Sobek, Colin J; Cocking, Jill H; Fernández, Rafael A; Caballero, Patricia A; Leiser, Owen P; Keim, Paul; Sahl, Jason W

    2017-09-15

    Diverse members of the genus Clostridium produce botulinum neurotoxins (BoNTs), which cause a flaccid paralysis known as botulism. While multiple species of clostridia produce BoNTs, the majority of human botulism cases have been attributed to Clostridium botulinum groups I and II. Recent comparative genomic studies have demonstrated the genomic diversity within these BoNT-producing species. This report introduces a multiplex PCR assay for differentiating members of C. botulinum group I, C. sporogenes, and two major subgroups within C. botulinum group II. Coding region sequences unique to each of the four species/subgroups were identified by in silico analyses of thousands of genome assemblies, and PCR primers were designed to amplify each marker. The resulting multiplex PCR assay correctly assigned 41 tested isolates to the appropriate species or subgroup. A separate PCR assay to determine the presence of the ntnh gene (a gene associated with the botulinum neurotoxin gene cluster) was developed and validated. The ntnh gene PCR assay provides information about the presence or absence of the botulinum neurotoxin gene cluster and the type of gene cluster present (ha-positive [ha+] or orfX+). The increased availability of whole-genome sequence data and comparative genomic tools enabled the design of these assays, which provide valuable information for characterizing BoNT-producing clostridia. The PCR assays are rapid, inexpensive tests that can be applied to a variety of sample types to assign isolates to species/subgroups and to detect clostridia with botulinum neurotoxin gene (bont) clusters. IMPORTANCE Diverse clostridia produce the botulinum neurotoxin, one of the most potent known neurotoxins. In this study, a multiplex PCR assay was developed to differentiate the clostridia most commonly isolated in connection with human botulism cases: C. botulinum group I, C. sporogenes, and two major subgroups within C. botulinum group II.
Since BoNT-producing and nontoxigenic isolates can be found in each species, a PCR assay was also developed to determine the presence of the ntnh gene, a universally present component of bont gene clusters, and to indicate the type (ha+ or orfX+) of bont gene cluster present in a sample. The PCR assays provide simple, rapid, and inexpensive tools for screening uncharacterized isolates from clinical or environmental samples. The information provided by these assays can inform epidemiological studies, aid in identifying mixtures of isolates and unknown isolates in culture collections, and confirm the presence of bacteria of interest. Copyright © 2017 Williamson et al.

  12. X-ray and optical substructures of the DAFT/FADA survey clusters

    NASA Astrophysics Data System (ADS)

    Guennou, L.; Durret, F.; Adami, C.; Lima Neto, G. B.

    2013-04-01

    We have undertaken the DAFT/FADA survey with the double aim of setting constraints on dark energy based on weak lensing tomography and of obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range 0.4-0.9 for which there were HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. Out of these, a spatial analysis was possible for 30 clusters, but only 23 had deep enough X-ray data for a really robust analysis. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. Altogether, the X-ray sample of 23 clusters and the optical sample of 26 clusters have 14 clusters in common. We present preliminary results on the coupled X-ray and dynamical analyses of these 14 clusters.

  13. Consumers' Kansei Needs Clustering Method for Product Emotional Design Based on Numerical Design Structure Matrix and Genetic Algorithms.

    PubMed

    Yang, Yan-Pu; Chen, Deng-Kai; Gu, Rong; Gu, Yu-Feng; Yu, Sui-Huai

    2016-01-01

    Consumers' Kansei needs reflect their perception of a product and typically comprise a large number of adjectives. Reducing the dimensional complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers. Accordingly, this study employs a numerical design structure matrix (NDSM), built by parameterizing a conventional DSM, and integrates genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights between each pair of Kansei adjectives as cell values when constructing the NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. The process of the proposed method is presented, and its details are illustrated using an example of an electronic scooter. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design.

  14. Consumers' Kansei Needs Clustering Method for Product Emotional Design Based on Numerical Design Structure Matrix and Genetic Algorithms

    PubMed Central

    Chen, Deng-kai; Gu, Rong; Gu, Yu-feng; Yu, Sui-huai

    2016-01-01

    Consumers' Kansei needs reflect their perception of a product and typically comprise a large number of adjectives. Reducing the dimensional complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers. Accordingly, this study employs a numerical design structure matrix (NDSM), built by parameterizing a conventional DSM, and integrates genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights between each pair of Kansei adjectives as cell values when constructing the NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. The process of the proposed method is presented, and its details are illustrated using an example of an electronic scooter. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design. PMID:27630709

  15. A modified cluster-sampling method for post-disaster rapid assessment of needs.

    PubMed Central

    Malilay, J.; Flanders, W. D.; Brogan, D.

    1996-01-01

    The cluster-sampling method can be used to conduct rapid assessment of health and other needs in communities affected by natural disasters. It is modelled on WHO's Expanded Programme on Immunization method of estimating immunization coverage, but has been modified to provide (1) estimates of the population remaining in an area, and (2) estimates of the number of people in the post-disaster area with specific needs. This approach differs from that used previously in other disasters, where rapid needs assessments estimated only the proportion of the population with specific needs. We propose a modified n × k survey design to estimate the remaining population, the severity of damage, the proportion and number of people with specific needs, the number of damaged, destroyed, and remaining housing units, and the changes in these estimates over time as part of the survey. PMID:8823962

  16. Travel Time Estimation Using Freeway Point Detector Data Based on Evolving Fuzzy Neural Inference System.

    PubMed

    Tang, Jinjun; Zou, Yajie; Ash, John; Zhang, Shen; Liu, Fang; Wang, Yinhai

    2016-01-01

    Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria including mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE) are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including: multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP).
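The first learning step described above (K-means partitioning followed by Gaussian membership functions around the cluster centres) can be sketched as follows; the data, the choice k = 2, and the membership width σ are illustrative, and the recursive-least-squares step for the rule consequents is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two invented blobs standing in for traffic-flow feature vectors.
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

# Plain Lloyd's K-means (k = 2); for this toy demo we seed one centre
# in each generated blob so the result is deterministic.
centers = X[[0, 50]].copy()
for _ in range(50):
    labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([X[labels == j].mean(0) for j in range(2)])

def membership(x, centers, sigma=1.0):
    """Gaussian membership degree of sample x to each cluster centre."""
    d2 = ((np.asarray(x) - centers) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# A sample near the first blob gets a high degree for its cluster and a
# near-zero degree for the other, which is what gates the fuzzy rules.
mu = membership([0.1, 0.0], centers)
print(mu.round(3))
```

In the evolving system of the paper, new samples would additionally update the centres and membership widths online; here the fit is one-shot for brevity.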

  17. Travel Time Estimation Using Freeway Point Detector Data Based on Evolving Fuzzy Neural Inference System

    PubMed Central

    Tang, Jinjun; Zou, Yajie; Ash, John; Zhang, Shen; Liu, Fang; Wang, Yinhai

    2016-01-01

    Travel time is an important measurement used to evaluate the extent of congestion within road networks. This paper presents a new method to estimate the travel time based on an evolving fuzzy neural inference system. The input variables in the system are traffic flow data (volume, occupancy, and speed) collected from loop detectors located at points both upstream and downstream of a given link, and the output variable is the link travel time. A first order Takagi-Sugeno fuzzy rule set is used to complete the inference. For training the evolving fuzzy neural network (EFNN), two learning processes are proposed: (1) a K-means method is employed to partition input samples into different clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the membership degree of samples to the cluster centers. As the number of input samples increases, the cluster centers are modified and membership functions are also updated; (2) a weighted recursive least squares estimator is used to optimize the parameters of the linear functions in the Takagi-Sugeno type fuzzy rules. Testing datasets consisting of actual and simulated data are used to test the proposed method. Three common criteria including mean absolute error (MAE), root mean square error (RMSE), and mean absolute relative error (MARE) are utilized to evaluate the estimation performance. Estimation results demonstrate the accuracy and effectiveness of the EFNN method through comparison with existing methods including: multiple linear regression (MLR), instantaneous model (IM), linear model (LM), neural network (NN), and cumulative plots (CP). PMID:26829639
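Step (1) of the EFNN training described above can be sketched as follows. This is not the authors' implementation: the speed data are invented, and the sigma heuristic (half the cluster range) is our assumption.

```python
# Sketch of EFNN training step (1): partition 1-D samples with a tiny
# k-means, then attach a Gaussian membership function to each cluster
# centre to measure the membership degree of a sample.
import math
import random

def kmeans_1d(xs, k, iters=50, seed=0):
    rng = random.Random(seed)
    centres = rng.sample(xs, k)  # distinct initial centres
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centres[i]))
            groups[nearest].append(x)
        centres = [sum(g) / len(g) if g else centres[i]
                   for i, g in enumerate(groups)]
    return centres, groups

def gaussian_membership(x, centre, sigma):
    # Degree of membership of sample x in the cluster centred at `centre`
    return math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))

# Hypothetical link speeds (km/h): a congested and a free-flow regime
speeds = [22.0, 25.0, 27.0, 88.0, 90.0, 93.0]
centres, groups = kmeans_1d(speeds, k=2)
sigmas = [max((max(g) - min(g)) / 2.0, 1.0) for g in groups]
memberships = [gaussian_membership(25.0, c, s)
               for c, s in zip(centres, sigmas)]
```

A sample at 25 km/h receives a membership degree near 1 for the congested cluster and near 0 for the free-flow cluster; in the evolving scheme the centres and memberships would then be updated as new samples arrive.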

  18. Automated Comparative Metabolite Profiling of Large LC-ESIMS Data Sets in an ACD/MS Workbook Suite Add-in, and Data Clustering on a New Open-Source Web Platform FreeClust.

    PubMed

    Božičević, Alen; Dobrzyński, Maciej; De Bie, Hans; Gafner, Frank; Garo, Eliane; Hamburger, Matthias

    2017-12-05

    The technological development of LC-MS instrumentation has led to significant improvements in performance and sensitivity, enabling high-throughput analysis of complex samples, such as plant extracts. Most software suites allow preprocessing of LC-MS chromatograms to obtain comprehensive information on single constituents. However, more advanced processing needs, such as the systematic and unbiased comparative metabolite profiling of large numbers of complex LC-MS chromatograms, remain a challenge. Currently, users have to rely on different tools to perform such data analyses. We developed a two-step protocol comprising a comparative metabolite profiling tool integrated in ACD/MS Workbook Suite, and a web platform developed in the R language designed for clustering and visualization of chromatographic data. Initially, all relevant chromatographic and spectroscopic data (retention time, molecular ions with the respective ion abundance, and sample names) are automatically extracted and assembled in an Excel spreadsheet. The file is then loaded into an online web application that includes various statistical algorithms and provides the user with tools to compare and visualize the results in intuitive 2D heatmaps. We applied this workflow to LC-ESIMS profiles obtained from 69 honey samples. Within a few hours of calculation on a standard PC, the honey samples were preprocessed and organized in clusters based on their metabolite profile similarities, thereby highlighting the common metabolite patterns and distributions among samples. Implementation in the ACD/Laboratories software package enables subsequent integration of other analytical data and of in silico prediction tools for modern drug discovery.

  19. Recommendations and administration of the HPV vaccine to 11- to 12-year-old girls and boys: a statewide survey of Georgia vaccines for children provider practices.

    PubMed

    Luque, John S; Tarasenko, Yelena N; Dixon, Betty T; Vogel, Robert L; Tedders, Stuart H

    2014-10-01

This study explores the prevalence and provider- and practice-related correlates of physician recommendation and administration of the quadrivalent human papillomavirus (HPV) vaccine, Gardasil, to 11- to 12-year-old girls and the intention to recommend the HPV vaccine to 11- to 12-year-old boys in Georgia. The study also describes physician knowledge about and barriers to HPV vaccination. This cross-sectional study was conducted from December 2010 to February 2011. The study sample was drawn using the Georgia Vaccines for Children (VFC) provider list as a sampling frame and probability 1-stage cluster sampling with counties as clusters. The final analytic sample was restricted to 206 provider locations. Weighted percentages and corresponding statistics were calculated accounting for selection probabilities, nonresponse, and the cluster sample design. Among Georgia VFC providers attending to 11- to 12-year-old girls, 46% had always recommended that their patients get the HPV vaccination and 41% had vaccinated their female patients. Among Georgia VFC providers attending to 11- to 12-year-old boys, 20% would always recommend that their male patients get vaccinated. Physicians most frequently endorsed costs of stocking the vaccine (73%), upfront costs (69%), vaccination (68%), and insurance reimbursements (63%) as barriers to their HPV vaccination practices. Despite the Advisory Committee on Immunization Practices' recommendations on HPV vaccination, the prevalence of recommending and administering the HPV vaccine to female and male patients, aged 11 to 12 years, by VFC providers is an ongoing challenge in Georgia.
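The "weighted percentages accounting for selection probabilities" mentioned above follow the standard design-weighting idea, sketched here with invented weights and responses (not the study's data):

```python
# Hypothetical sketch of a design-weighted percentage: each provider's
# response is weighted (e.g. by the inverse of its selection probability),
# and the weighted share of positive responses is reported. Weights and
# responses below are invented for illustration.
def weighted_percentage(responses, weights):
    """Percentage of 1-responses, weighted by survey design weights."""
    num = sum(w for r, w in zip(responses, weights) if r == 1)
    return 100.0 * num / sum(weights)

recommends = [1, 0, 1, 1, 0]            # 1 = always recommends the vaccine
design_wts = [2.0, 1.0, 1.0, 3.0, 3.0]  # inverse selection probabilities
pct = weighted_percentage(recommends, design_wts)  # 60.0
```

Note the weighted figure (60%) differs from the unweighted share (3/5 = 60% here only by coincidence of these invented numbers); in general the two diverge whenever weights correlate with the response.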

  20. Exploring the IMF of star clusters: a joint SLUG and LEGUS effort

    NASA Astrophysics Data System (ADS)

    Ashworth, G.; Fumagalli, M.; Krumholz, M. R.; Adamo, A.; Calzetti, D.; Chandar, R.; Cignoni, M.; Dale, D.; Elmegreen, B. G.; Gallagher, J. S., III; Gouliermis, D. A.; Grasha, K.; Grebel, E. K.; Johnson, K. E.; Lee, J.; Tosi, M.; Wofford, A.

    2017-08-01

    We present the implementation of a Bayesian formalism within the Stochastically Lighting Up Galaxies (slug) stellar population synthesis code, which is designed to investigate variations in the initial mass function (IMF) of star clusters. By comparing observed cluster photometry to large libraries of clusters simulated with a continuously varying IMF, our formalism yields the posterior probability distribution function (PDF) of the cluster mass, age and extinction, jointly with the parameters describing the IMF. We apply this formalism to a sample of star clusters from the nearby galaxy NGC 628, for which broad-band photometry in five filters is available as part of the Legacy ExtraGalactic UV Survey (LEGUS). After allowing the upper-end slope of the IMF (α3) to vary, we recover PDFs for the mass, age and extinction that are broadly consistent with what is found when assuming an invariant Kroupa IMF. However, the posterior PDF for α3 is very broad due to a strong degeneracy with the cluster mass, and it is found to be sensitive to the choice of priors, particularly on the cluster mass. We find only a modest improvement in the constraining power of α3 when adding Hα photometry from the companion Hα-LEGUS survey. Conversely, Hα photometry significantly improves the age determination, reducing the frequency of multi-modal PDFs. With the aid of mock clusters, we quantify the degeneracy between physical parameters, showing how constraints on the cluster mass that are independent of photometry can be used to pin down the IMF properties of star clusters.

  1. Physiogenomic analysis of the Puerto Rican population.

    PubMed

    Ruaño, Gualberto; Duconge, Jorge; Windemuth, Andreas; Cadilla, Carmen L; Kocherla, Mohan; Villagra, David; Renta, Jessica; Holford, Theodore; Santiago-Borrero, Pedro J

    2009-04-01

    Admixture in the population of the island of Puerto Rico is of general interest with regards to pharmacogenetics to develop comprehensive strategies for personalized healthcare in Latin Americans. This research was aimed at determining the frequencies of SNPs in key physiological, pharmacological and biochemical genes to infer population structure and ancestry in the Puerto Rican population. A noninterventional, cross-sectional, retrospective study design was implemented following a controlled, stratified-by-region, random sampling protocol. The sample was based on birthrates in each region of the island of Puerto Rico, according to the 2004 National Birth Registry. Genomic DNA samples from 100 newborns were obtained from the Puerto Rico Newborn Screening Program in dried-blood spot cards. Genotyping using a physiogenomic array was performed for 332 SNPs from 196 cardiometabolic and neuroendocrine genes. Population structure was examined using a Bayesian clustering approach as well as by allelic dissimilarity as a measure of allele sharing. The Puerto Rican sample was found to be broadly heterogeneous. We observed three main clusters in the population, which we hypothesize to reflect the historical admixture in the Puerto Rican population from Amerindian, African and European ancestors. We present evidence for this interpretation by comparing allele frequencies for the three clusters with those for the same SNPs available from the International HapMap project for Asian, African and European populations. Our results demonstrate that population analysis can be performed with a physiogenomic array of cardiometabolic and neuroendocrine genes to facilitate the translation of genome diversity into personalized medicine.
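One common way to quantify the "allelic dissimilarity as a measure of allele sharing" mentioned above is an allele-sharing distance over biallelic SNPs. The sketch below uses a standard 0/1/2 minor-allele coding and invented genotypes; it is not necessarily the exact metric used in the study:

```python
# Sketch of an allele-sharing distance between two individuals over
# biallelic SNPs. Genotypes coded 0/1/2 = number of minor alleles per SNP.
# One common definition, not necessarily the study's exact metric.
def allele_sharing_distance(g1, g2):
    assert len(g1) == len(g2)
    # Per-SNP distance |a - b| is 0, 1, or 2; normalise to [0, 1]
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2.0 * len(g1))

ind_a = [0, 1, 2, 1]  # hypothetical 4-SNP genotypes
ind_b = [0, 2, 0, 1]
d = allele_sharing_distance(ind_a, ind_b)  # (0+1+2+0)/8 = 0.375
```

A matrix of such pairwise distances is what clustering procedures like the Bayesian approach cited above can then operate on or be compared against.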

  2. The utility of rural and underserved designations in geospatial assessments of distance traveled to healthcare services: implications for public health research and practice.

    PubMed

    Smith, Matthew Lee; Dickerson, Justin B; Wendel, Monica L; Ahn, Sangnam; Pulczinski, Jairus C; Drake, Kelly N; Ory, Marcia G

    2013-01-01

    Health disparities research in rural populations is based on several common taxonomies identified by geography and population density. However, little is known about the implications of different rurality definitions on public health outcomes. To help illuminate the meaning of different rural designations often used in research, service delivery, or policy reports, this study will (1) review the different definitions of rurality and their purposes; (2) identify the overlap of various rural designations in an eight-county Brazos Valley region in Central Texas; (3) describe participant characteristic profiles based on distances traveled to obtain healthcare services; and (4) examine common profile characteristics associated with each designation. Data were analyzed from a random sample from 1,958 Texas adults participating in a community assessment. K-means cluster analysis was used to identify natural groupings of individuals based on distance traveled to obtain three healthcare services: medical care, dental care, and prescription medication pick-up. Significant variation in cluster representation and resident characteristics was observed by rural designation. Given widely used taxonomies for designating areas as rural (or provider shortage) in health-related research, this study highlights differences that could influence research results and subsequent program and policy development based on rural designation.

  3. Weak lensing magnification of SpARCS galaxy clusters

    NASA Astrophysics Data System (ADS)

    Tudorica, A.; Hildebrandt, H.; Tewes, M.; Hoekstra, H.; Morrison, C. B.; Muzzin, A.; Wilson, G.; Yee, H. K. C.; Lidman, C.; Hicks, A.; Nantais, J.; Erben, T.; van der Burg, R. F. J.; Demarco, R.

    2017-12-01

    Context. Measuring and calibrating relations between cluster observables is critical for resource-limited studies. The mass-richness relation of clusters offers an observationally inexpensive way of estimating masses. Its calibration is essential for cluster and cosmological studies, especially for high-redshift clusters. Weak gravitational lensing magnification is a promising method, complementary to shear studies, that can be applied at higher redshifts. Aims: We aim to employ the weak lensing magnification method to calibrate the mass-richness relation up to a redshift of 1.4. We used the Spitzer Adaptation of the Red-Sequence Cluster Survey (SpARCS) galaxy cluster candidates (0.2 < z < 1.4) and optical data from the Canada France Hawaii Telescope (CFHT) to test whether magnification can be effectively used to constrain the mass of high-redshift clusters. Methods: Lyman-break galaxies (LBGs) selected using the u-band dropout technique and their colours were used as a background sample of sources. LBG positions were cross-correlated with the centres of the sample of SpARCS clusters to estimate the magnification signal, which was optimally weighted using an externally calibrated LBG luminosity function. The signal was measured for cluster sub-samples, binned in both redshift and richness. Results: We measured the cross-correlation between the positions of galaxy cluster candidates and LBGs and detected a weak lensing magnification signal for all bins at a detection significance of 2.6-5.5σ. In particular, the significance of the measurement for clusters with z > 1.0 is 4.1σ; for the entire cluster sample we obtained an average M200 of (1.28 +0.23/-0.21) × 10^14 M⊙. Conclusions: Our measurements demonstrated the feasibility of using weak lensing magnification as a viable tool for determining the average halo masses for samples of high-redshift galaxy clusters. The results also established the success of using galaxy over-densities to select massive clusters at z > 1. Additional studies are necessary for further modelling of the various systematic effects we discussed.

  4. High Prevalence of Intermediate Leptospira spp. DNA in Febrile Humans from Urban and Rural Ecuador.

    PubMed

    Chiriboga, Jorge; Barragan, Verónica; Arroyo, Gabriela; Sosa, Andrea; Birdsell, Dawn N; España, Karool; Mora, Ana; Espín, Emilia; Mejía, María Eugenia; Morales, Melba; Pinargote, Carmina; Gonzalez, Manuel; Hartskeerl, Rudy; Keim, Paul; Bretas, Gustavo; Eisenberg, Joseph N S; Trueba, Gabriel

    2015-12-01

    Leptospira spp., which comprise 3 clusters (pathogenic, saprophytic, and intermediate) that vary in pathogenicity, infect >1 million persons worldwide each year. The disease burden of the intermediate leptospires is unclear. To increase knowledge of this cluster, we used new molecular approaches to characterize Leptospira spp. in 464 samples from febrile patients in rural, semiurban, and urban communities in Ecuador; in 20 samples from nonfebrile persons in the rural community; and in 206 samples from animals in the semiurban community. We observed a higher percentage of leptospiral DNA-positive samples from febrile persons in rural (64%) versus urban (21%) and semiurban (25%) communities; no leptospires were detected in nonfebrile persons. The percentage of intermediate cluster strains in humans (96%) was higher than that of pathogenic cluster strains (4%); strains in animal samples belonged to intermediate (49%) and pathogenic (51%) clusters. Intermediate cluster strains may be causing a substantial amount of fever in coastal Ecuador.

  5. High Prevalence of Intermediate Leptospira spp. DNA in Febrile Humans from Urban and Rural Ecuador

    PubMed Central

    Chiriboga, Jorge; Barragan, Verónica; Arroyo, Gabriela; Sosa, Andrea; Birdsell, Dawn N.; España, Karool; Mora, Ana; Espín, Emilia; Mejía, María Eugenia; Morales, Melba; Pinargote, Carmina; Gonzalez, Manuel; Hartskeerl, Rudy; Keim, Paul; Bretas, Gustavo; Eisenberg, Joseph N.S.

    2015-01-01

    Leptospira spp., which comprise 3 clusters (pathogenic, saprophytic, and intermediate) that vary in pathogenicity, infect >1 million persons worldwide each year. The disease burden of the intermediate leptospires is unclear. To increase knowledge of this cluster, we used new molecular approaches to characterize Leptospira spp. in 464 samples from febrile patients in rural, semiurban, and urban communities in Ecuador; in 20 samples from nonfebrile persons in the rural community; and in 206 samples from animals in the semiurban community. We observed a higher percentage of leptospiral DNA–positive samples from febrile persons in rural (64%) versus urban (21%) and semiurban (25%) communities; no leptospires were detected in nonfebrile persons. The percentage of intermediate cluster strains in humans (96%) was higher than that of pathogenic cluster strains (4%); strains in animal samples belonged to intermediate (49%) and pathogenic (51%) clusters. Intermediate cluster strains may be causing a substantial amount of fever in coastal Ecuador. PMID:26583534

  6. Active learning for semi-supervised clustering based on locally linear propagation reconstruction.

    PubMed

    Chang, Chin-Chun; Lin, Po-Yi

    2015-03-01

    The success of semi-supervised clustering relies on the effectiveness of side information. To obtain effective side information, this paper proposes a new active learner that learns pairwise constraints known as must-link and cannot-link constraints. Three novel techniques are developed for learning effective pairwise constraints. The first technique identifies samples less important to cluster structures. It makes use of a kernel version of locally linear embedding for manifold learning. Samples that are neither important to locally linear propagation reconstructions of other samples nor on flat patches in the learned manifold are regarded as unimportant. The second is a novel criterion for query selection. This criterion considers not only the importance of a sample to expanding the space coverage of the learned samples but also the expected number of queries needed to learn the sample. To facilitate semi-supervised clustering, the third technique yields inferred must-links for passing information about flat patches in the learned manifold to semi-supervised clustering algorithms. Experimental results have shown that the learned pairwise constraints can capture the underlying cluster structures and have proven the feasibility of the proposed approach.

  7. Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David

    2016-01-01

    Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
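For intuition about the finite mixture model underlying the partitioning step, the sketch below fits a two-component Gaussian mixture to one-dimensional data with EM. The paper instead estimates a Bayesian model with Hamiltonian Monte Carlo (not reproduced here), and the data below are invented:

```python
# Illustrative EM fit of a two-component finite (Gaussian) mixture to 1-D
# data. The cited work uses Bayesian estimation with Hamiltonian Monte
# Carlo; this sketch only shows the two-cluster mixture idea, and the
# "concentration" data are hypothetical.
import math

def em_two_gaussians(xs, iters=100):
    n = len(xs)
    srt = sorted(xs)
    mu = [srt[n // 4], srt[3 * n // 4]]   # crude quartile initialisation
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in xs:
            w = [pi[k] * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate mixture weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, xs)) / nk, 1e-6)
    return mu, var, pi

# Two hypothetical element-concentration clusters
xs = [0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.1, 5.2]
mu, var, pi = em_two_gaussians(xs)
```

The fitted responsibilities give each field sample a soft assignment to one of the two clusters; recursing on each cluster's samples yields the hierarchy described in the abstract.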

  8. "A Richness Study of 14 Distant X-Ray Clusters from the 160 Square Degree Survey"

    NASA Technical Reports Server (NTRS)

    Jones, Christine; West, Donald (Technical Monitor)

    2001-01-01

    We have measured the surface density of galaxies toward 14 X-ray-selected cluster candidates at redshifts z_i ≥ 0.46, and we show that they are associated with rich galaxy concentrations. These clusters, having X-ray luminosities of L_X(0.5-2 keV) ≈ (0.5-2.6) × 10^44 erg/s, are among the most distant and luminous in our 160 deg^2 ROSAT Position Sensitive Proportional Counter cluster survey. We find that the clusters range between Abell richness classes 0 and 2 and have a most probable richness class of 1. We compare the richness distribution of our distant clusters to those for three samples of nearby clusters with similar X-ray luminosities. We find that the nearby and distant samples have similar richness distributions, which shows that clusters have apparently not evolved substantially in richness since redshift z=0.5. There is, however, a marginal tendency for the distant clusters to be slightly poorer than nearby clusters, although deeper multicolor data for a large sample would be required to confirm this trend. We compare the distribution of distant X-ray clusters in the L_X-richness plane to the distribution of optically selected clusters from the Palomar Distant Cluster Survey. The optically selected clusters appear overly rich for their X-ray luminosities when compared to X-ray-selected clusters. Apparently, X-ray and optical surveys do not necessarily sample identical mass concentrations at large redshifts. This may indicate the existence of a population of optically rich clusters with anomalously low X-ray emission. More likely, however, it reflects the tendency for optical surveys to select unvirialized mass concentrations, as might be expected when peering along large-scale filaments.

  9. Applying the Hájek Approach in Formula-Based Variance Estimation. Research Report. ETS RR-17-24

    ERIC Educational Resources Information Center

    Qian, Jiahe

    2017-01-01

    The variance formula derived for a two-stage sampling design without replacement employs the joint inclusion probabilities in the first-stage selection of clusters. One of the difficulties encountered in data analysis is the lack of information about such joint inclusion probabilities. One way to solve this issue is by applying Hájek's…
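The truncated abstract refers to Hájek's approach to the missing joint inclusion probabilities. One common form of Hájek's approximation, computable from the first-order probabilities alone, is sketched below; the report's exact variant may differ, and the probabilities used are invented:

```python
# One common form of Hajek's approximation to the joint inclusion
# probabilities pi_ij of a without-replacement, unequal-probability design,
# using only the first-order probabilities:
#   pi_ij ≈ pi_i * pi_j * (1 - (1 - pi_i) * (1 - pi_j) / d),
#   d = sum_k pi_k * (1 - pi_k).
# A sketch of the general formula, possibly differing from the report.
def hajek_joint_inclusion(p):
    """p[i] is the first-order inclusion probability of cluster i."""
    d = sum(pi * (1.0 - pi) for pi in p)
    n = len(p)
    return {(i, j): p[i] * p[j] * (1.0 - (1.0 - p[i]) * (1.0 - p[j]) / d)
            for i in range(n) for j in range(n) if i < j}

# Four first-stage clusters, each with inclusion probability 0.5 (invented)
pi_joint = hajek_joint_inclusion([0.5, 0.5, 0.5, 0.5])
```

With these equal probabilities every approximated pi_ij is 0.1875, slightly above the exact without-replacement value for this toy design, which is the usual character of the approximation: good for large samples, inexact for tiny ones.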

  10. US forests are showing increased rates of decline in response to a changing climate

    Treesearch

    Warren B. Cohen; Zhiqiang Yang; David M. Bell; Stephen V. Stehman

    2015-01-01

    How vulnerable are US forests to a changing climate? We answer this question using Landsat time series data and a unique interpretation approach, TimeSync, a plot-based Landsat visualization and data collection tool. Original analyses were based on a stratified two-stage cluster sample design that included interpretation of 3858 forested plots. From these data, we...

  11. The Hispanic Americans Baseline Alcohol Survey (HABLAS): Acculturation, Birthplace and Alcohol-Related Social Problems across Hispanic National Groups

    ERIC Educational Resources Information Center

    Caetano, Raul; Vaeth, Patrice A. C.; Rodriguez, Lori A.

    2012-01-01

    The purpose of this study was to examine the association between acculturation, birthplace, and alcohol-related social problems across Hispanic national groups. A total of 5,224 Hispanic adults (18+ years) were interviewed using a multistage cluster sample design in Miami, New York, Philadelphia, Houston, and Los Angeles. Multivariate analysis…

  12. Individual and Familial Correlates of Career Salience among Upwardly Mobile College Women. Final Report.

    ERIC Educational Resources Information Center

    Guttmacher, Mary Johnson

    A case study was conducted using a sample of 271 women selected from a state college by a stratified random cluster technique that approximates proportional representation of women in all four classes and all college majors. The data source was an extensive questionnaire designed to measure the attitudes and behavior of interest. The major…

  13. Relationship between Geography-Tourism and Tourism's Effects According to High School Students

    ERIC Educational Resources Information Center

    Koca, Nusret; Yildirim, Ramazan

    2018-01-01

    This research was designed in the screening model to determine the opinions of high school students on tourism effects and geography-tourism relations. The data were gathered from 760 students who were educated in high schools in the central district of Kütahya, identified by cluster sampling method. The data were collected with the help of a…

  14. School and Emotional Well-Being: A Transcultural Analysis on Youth in Southern Spain

    ERIC Educational Resources Information Center

    Soriano, Encarnación; Cala, Verónica C. C.

    2018-01-01

    Purpose: The purpose of this paper is to assess and compare school well-being (SW) and emotional well-being (EW) among Romanian, Moroccan and Spanish youth, to determine the degree of relation between EW and scholar well-being. Design/methodology/approach: The paper employed cross-sectional research with cluster sampling in two primary schools and…

  15. Ecological Research Division Theoretical Ecology Program. [Contains abstracts]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1990-10-01

    This report presents the goals of the Theoretical Ecology Program and abstracts of research in progress. Abstracts cover both theoretical research that began as part of the terrestrial ecology core program and new projects funded by the theoretical program begun in 1988. Projects have been clustered into four major categories: Ecosystem dynamics; landscape/scaling dynamics; population dynamics; and experiment/sample design.

  16. High-Risk Sexual Behavior among Students of a Minority-Serving University in a Community with a High HIV/AIDS Prevalence

    ERIC Educational Resources Information Center

    Trepka, Mary Jo; Kim, Sunny; Pekovic, Vukosava; Zamor, Peggy; Velez, Elvira; Gabaroni, Mariela V.

    2008-01-01

    Objective: The authors used a stratified cluster sampling design to inform campus sexually transmitted diseases prevention programs. Participants and Methods: They conducted a cross-sectional study of students (N = 1,130) at a large, urban, minority-serving university in South Florida using the 2004 National College Health Assessment Survey…

  17. Effects of a Worksite Tobacco Control Intervention in India: The Mumbai Worksite Tobacco Control Study, a Cluster Randomized Trial

    PubMed Central

    Sorensen, Glorian; Pednekar, Mangesh; Cordeira, Laura Shulman; Pawar, Pratibha; Nagler, Eve; Stoddard, Anne M.; Kim, Hae-Young; Gupta, Prakash C.

    2016-01-01

    Objectives We assessed a worksite intervention designed to promote tobacco control among manufacturing workers in Greater Mumbai, India. Methods We used a cluster-randomized design to test an integrated health promotion/health protection intervention, which addressed changes at the management and worker levels. Between July 2012 and July 2013, we recruited 20 worksites on a rolling basis and randomly assigned them to intervention or delayed-intervention control conditions. The follow-up survey was conducted between December 2013 and November 2014. Results The difference in 30-day quit rates between intervention and control conditions was statistically significant for production workers (OR=2.25, P=0.03), although not for the overall sample (OR=1.70; P=0.12). The intervention resulted in a doubling of the 6-month cessation rates among workers in the intervention worksites compared to those in the control, for production workers (OR=2.29; P=0.07) and for the overall sample (OR=1.81; P=0.13), but the difference did not reach statistical significance. Conclusions These findings demonstrate the potential impact of a tobacco control intervention that combined tobacco control and health protection programming within Indian manufacturing worksites. PMID:26883793
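As a reminder of how the reported odds ratios (e.g. OR=2.25) are computed from a 2×2 quit-by-condition table, here is a minimal sketch; the counts below are invented and are not the study's data:

```python
# Odds ratio from a hypothetical 2x2 table of quit (yes/no) by condition.
# The counts are invented for illustration, NOT the Mumbai study's data.
def odds_ratio(quit_tx, noquit_tx, quit_ctrl, noquit_ctrl):
    """(odds of quitting under intervention) / (odds under control)."""
    return (quit_tx / noquit_tx) / (quit_ctrl / noquit_ctrl)

# e.g. 30 of 200 intervention workers quit vs 15 of 200 controls
or_hat = odds_ratio(30, 170, 15, 185)
```

In a cluster-randomized trial like this one, the P-values attached to such ORs must additionally account for within-worksite correlation (e.g. via mixed models), which this arithmetic sketch does not attempt.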

  18. The development and validity of the Salford Gait Tool: an observation-based clinical gait assessment tool.

    PubMed

    Toro, Brigitte; Nester, Christopher J; Farren, Pauline C

    2007-03-01

    Objective: To develop the construct, content, and criterion validity of the Salford Gait Tool (SF-GT) and to evaluate agreement between gait observations using the SF-GT and kinematic gait data. Design: Tool development and comparative evaluation. Setting: University in the United Kingdom. Participants: For designing construct and content validity, convenience samples of 10 children with hemiplegic, diplegic, and quadriplegic cerebral palsy (CP), 152 physical therapy students, and 4 physical therapists were recruited. For developing criterion validity, kinematic gait data of 13 gait clusters containing 56 children with hemiplegic, diplegic, and quadriplegic CP and 11 neurologically intact children were used. For clinical evaluation, a convenience sample of 23 pediatric physical therapists participated. Interventions: We developed a sagittal-plane observational gait assessment tool through a series of design, test, and redesign iterations. The tool's grading system was calibrated using kinematic gait data of 13 gait clusters and was evaluated by comparing the agreement of gait observations using the SF-GT with kinematic gait data. Main Outcome Measure: Criterion standard kinematic gait data. Results: There was 58% mean agreement based on grading categories and 80% mean agreement based on degree estimations evaluated with the least significant difference method. Conclusions: The new SF-GT has good concurrent criterion validity.

  19. Measuring Clinical Decision Support Influence on Evidence-Based Nursing Practice.

    PubMed

    Cortez, Susan; Dietrich, Mary S; Wells, Nancy

    2016-07-01

    Objectives: To measure the effect of clinical decision support (CDS) on oncology nurse evidence-based practice (EBP). Design: Longitudinal cluster-randomized design. Setting: Four distinctly separate oncology clinics associated with an academic medical center. Sample: The study sample comprised randomly selected data elements from the nursing documentation software. The data elements were patient-reported symptoms and the associated nurse interventions. The total sample observations were 600, derived from a baseline, posteducation, and postintervention sample of 200 each (100 in the intervention group and 100 in the control group for each sample). Methods: The cluster design was used to support randomization of the study intervention at the clinic level rather than the individual participant level to reduce possible diffusion of the study intervention. An elongated data collection cycle (11 weeks) controlled for temporary increases in nurse EBP related to the education or CDS intervention. Main Research Variables: The dependent variable was the nurse evidence-based documentation rate, calculated from the nurse-documented interventions. The independent variable was the CDS added to the nursing documentation software. Findings: The average EBP rate at baseline for the control and intervention groups was 27%. After education, the average EBP rate increased to 37%, and then decreased to 26% in the postintervention sample. Mixed-model linear statistical analysis revealed no significant interaction of group by sample. The CDS intervention did not result in an increase in nurse EBP. Conclusions: EBP education increased nurse EBP documentation rates significantly but only temporarily. Nurses may have used evidence in practice but may not have documented their interventions. Implications for Nursing: More research is needed to understand the complex relationship between CDS, nursing practice, and nursing EBP intervention documentation. CDS may have a different effect on nurse EBP, physician EBP, and other medical professional EBP.

  20. Community detection using Kernel Spectral Clustering with memory

    NASA Astrophysics Data System (ADS)

    Langone, Rocco; Suykens, Johan A. K.

    2013-02-01

    This work addresses the problem of community detection in dynamic scenarios, which arises, for instance, in the segmentation of moving objects, the clustering of telephone traffic data, and time-series microarray data. A desirable feature of a clustering model that must capture the evolution of communities over time is temporal smoothness between clusters in successive time-steps. In this way the model is able to track the long-term trend and at the same time smooth out short-term variation due to noise. We use Kernel Spectral Clustering with Memory effect (MKSC), which allows cluster memberships of new nodes to be predicted via out-of-sample extension and has a proper model selection scheme. It is based on a constrained optimization formulation typical of Least Squares Support Vector Machines (LS-SVM), where the objective function is designed to explicitly incorporate temporal smoothness as a valid prior knowledge; this allows the model to cluster the current data well while remaining consistent with the recent history. Here we propose a generalization of the MKSC model with an arbitrary memory, not only one time-step in the past. The experiments conducted on toy problems confirm our expectations: the more memory we add to the model, the smoother over time are the clustering results. We also compare with the Evolutionary Spectral Clustering (ESC) algorithm, a state-of-the-art method, and obtain comparable or better results.
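    The temporal-smoothness idea can be illustrated with a hedged sketch: blend the current affinity matrix with the previous snapshot before spectral partitioning, so memberships change gradually across time-steps. This is a simplified stand-in for the idea, not the LS-SVM-based MKSC formulation; the blend weight `nu` and the two-cluster restriction are assumptions made for brevity.

```python
import numpy as np

def smoothed_spectral_partition(A_t, A_prev, nu=0.5):
    """Two-way spectral partition of the current affinity matrix A_t with a
    memory term: the matrix actually clustered is a convex blend of the
    current and previous snapshots, so cluster memberships evolve smoothly
    over time. Simplified illustration only, not the MKSC model itself."""
    A = (1.0 - nu) * A_t + nu * A_prev      # nu controls how much memory is kept
    d = A.sum(axis=1)
    L = np.diag(d) - A                      # unnormalised graph Laplacian
    _, vecs = np.linalg.eigh(L)             # eigenvectors, ascending eigenvalues
    fiedler = vecs[:, 1]                    # eigenvector of 2nd-smallest eigenvalue
    return (fiedler > 0).astype(int)        # its sign pattern gives the bipartition
```

    With `nu = 0`, this reduces to plain spectral bipartitioning of the current snapshot; larger `nu` ties the result more strongly to the recent history, which is the "memory" effect the abstract describes.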

  1. The XXL survey XV: evidence for dry merger driven BCG growth in XXL-100-GC X-ray clusters

    NASA Astrophysics Data System (ADS)

    Lavoie, S.; Willis, J. P.; Démoclès, J.; Eckert, D.; Gastaldello, F.; Smith, G. P.; Lidman, C.; Adami, C.; Pacaud, F.; Pierre, M.; Clerc, N.; Giles, P.; Lieu, M.; Chiappetti, L.; Altieri, B.; Ardila, F.; Baldry, I.; Bongiorno, A.; Desai, S.; Elyiv, A.; Faccioli, L.; Gardner, B.; Garilli, B.; Groote, M. W.; Guennou, L.; Guzzo, L.; Hopkins, A. M.; Liske, J.; McGee, S.; Melnyk, O.; Owers, M. S.; Poggianti, B.; Ponman, T. J.; Scodeggio, M.; Spitler, L.; Tuffs, R. J.

    2016-11-01

    The growth of brightest cluster galaxies (BCGs) is closely related to the properties of their host cluster. We present evidence for dry mergers as the dominant source of BCG mass growth at z ≲ 1 in the XXL 100 brightest cluster sample. We use the global red sequence, Hα emission, and mean star formation history to show that BCGs in the sample possess star formation levels comparable to field ellipticals of similar stellar mass and redshift. XXL 100 brightest clusters are less massive on average than those in other X-ray selected samples such as LoCuSS or HIFLUGCS. Few clusters in the sample display high central gas concentration, making BCG growth via star formation fed by the accretion of cool gas inefficient. Using measures of the relaxation state of their host clusters, we show that BCGs grow as relaxation proceeds. We find that the BCG stellar mass corresponds to a relatively constant fraction (about 1 per cent) of the total cluster mass in relaxed systems. We also show that, following a cluster-scale merger event, the BCG stellar mass lags behind the value expected from the Mcluster-MBCG relation but subsequently accretes stellar mass via dry mergers as the BCG and cluster evolve towards a relaxed state.

  2. Active Learning Using Hint Information.

    PubMed

    Li, Chun-Liang; Ferng, Chun-Sung; Lin, Hsuan-Tien

    2015-08-01

    The abundance of real-world data and limited labeling budgets call for active learning, an important learning paradigm for reducing human labeling effort. Many recently developed active learning algorithms consider both uncertainty and representativeness when making querying decisions. However, exploiting representativeness and uncertainty concurrently usually requires tackling sophisticated and challenging learning tasks, such as clustering. In this letter, we propose a new active learning framework, called hinted sampling, which takes both uncertainty and representativeness into account in a simpler way. We design a novel active learning algorithm within the hinted sampling framework with an extended support vector machine. Experimental results validate that the novel active learning algorithm can achieve better and more stable performance than state-of-the-art algorithms. We also show that the hinted sampling framework can improve another active learning algorithm, one designed from the transductive support vector machine.
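    The trade-off the abstract describes can be sketched with a generic query rule that scores each unlabeled point by uncertainty (small classifier margin) discounted by representativeness (local density). This is a hedged illustration of the general uncertainty-plus-representativeness idea, not the hinted sampling algorithm itself; the `margins`, `densities`, and weight `lam` inputs are illustrative assumptions.

```python
def query_index(margins, densities, lam=0.5):
    """Pick the next point to label: prefer points with small |margin|
    (uncertain) and high local density (representative). A generic rule,
    not the paper's hinted-sampling procedure."""
    scores = [abs(m) - lam * d for m, d in zip(margins, densities)]
    return min(range(len(scores)), key=scores.__getitem__)
```

    In a full active learning loop, the margins would come from the current classifier and the densities from the unlabeled pool; the chosen index is sent to the human annotator.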

  3. The Atacama Cosmology Telescope: Cosmology from Galaxy Clusters Detected Via the Sunyaev-Zel'dovich Effect

    NASA Technical Reports Server (NTRS)

    Sehgal, Neelima; Trac, Hy; Acquaviva, Viviana; Ade, Peter A. R.; Aguirre, Paula; Amiri, Mandana; Appel, John W.; Barrientos, L. Felipe; Battistelli, Elia S.; Bond, J. Richard; et al.

    2010-01-01

    We present constraints on cosmological parameters based on a sample of Sunyaev-Zel'dovich-selected galaxy clusters detected in a millimeter-wave survey by the Atacama Cosmology Telescope. The cluster sample used in this analysis consists of 9 optically-confirmed high-mass clusters comprising the high-significance end of the total cluster sample identified in 455 square degrees of sky surveyed during 2008 at 148 GHz. We focus on the most massive systems to reduce the degeneracy between unknown cluster astrophysics and cosmology derived from SZ surveys. We describe the scaling relation between cluster mass and SZ signal with a 4-parameter fit. Marginalizing over the values of the parameters in this fit with conservative priors gives σ8 = 0.851 +/- 0.115 and w = -1.14 +/- 0.35 for a spatially-flat wCDM cosmological model with WMAP 7-year priors on cosmological parameters. This gives a modest improvement in statistical uncertainty over WMAP 7-year constraints alone. Fixing the scaling relation between cluster mass and SZ signal to a fiducial relation obtained from numerical simulations and calibrated by X-ray observations, we find σ8 = 0.821 +/- 0.044 and w = -1.05 +/- 0.20. These results are consistent with constraints from WMAP 7 plus baryon acoustic oscillations plus type Ia supernovae, which give σ8 = 0.802 +/- 0.038 and w = -0.98 +/- 0.053. A stacking analysis of the clusters in this sample compared to clusters simulated assuming the fiducial model also shows good agreement. These results suggest that, given the sample of clusters used here, both the astrophysics of massive clusters and the cosmological parameters derived from them are broadly consistent with current models.

  4. Uranium hydrogeochemical and stream sediment reconnaissance of the Arminto NTMS quadrangle, Wyoming, including concentrations of forty-three additional elements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morgan, T.L.

    1979-11-01

    During the summers of 1976 and 1977, 570 water and 1249 sediment samples were collected from 1517 locations within the 18,000-km² area of the Arminto NTMS quadrangle of central Wyoming. Water samples were collected from wells, springs, streams, and artificial ponds; sediment samples were collected from wet and dry streams, springs, and wet and dry ponds. All water samples were analyzed for 13 elements, including uranium, and each sediment sample was analyzed for 43 elements, including uranium and thorium. Uranium concentrations in water samples range from below the detection limit to 84.60 parts per billion (ppb) with a mean of 4.32 ppb. All water sample types except pond water samples were considered as a single population in interpreting the data. Pond water samples were excluded due to possible concentration of uranium by evaporation. Most of the water samples containing greater than 20 ppb uranium grouped into six clusters that indicate possible areas of interest for further investigation. One cluster is associated with the Pumpkin Buttes District, and two others are near the Kaycee and Mayoworth areas of uranium mineralization. The largest cluster is located on the west side of the Powder River Basin. One cluster is located in the central Big Horn Basin and another is in the Wind River Basin; both are in areas underlain by favorable host units. Uranium concentrations in sediment samples range from 0.08 parts per million (ppm) to 115.50 ppm with a mean of 3.50 ppm. Two clusters of sediment samples over 7 ppm were delineated. The first, containing the two highest-concentration samples, corresponds with the Copper Mountain District. Many of the high uranium concentrations in samples in this cluster may be due to contamination from mining or prospecting activity upstream from the sample sites. The second cluster encompasses a wide area in the Wind River Basin along the southern boundary of the quadrangle.

  5. Planck/SDSS Cluster Mass and Gas Scaling Relations for a Volume-Complete redMaPPer Sample

    NASA Astrophysics Data System (ADS)

    Jimeno, Pablo; Diego, Jose M.; Broadhurst, Tom; De Martino, I.; Lazkoz, Ruth

    2018-04-01

    Using Planck satellite data, we construct Sunyaev-Zel'dovich (SZ) gas pressure profiles for a large, volume-complete sample of optically selected clusters. We have defined a sample of over 8,000 redMaPPer clusters from the Sloan Digital Sky Survey (SDSS), within the volume-complete redshift region 0.100 < z < 0.325, for which we construct SZ effect maps by stacking Planck data over the full range of richness. Dividing the sample into richness bins, we simultaneously solve for the mean cluster mass in each bin together with the corresponding radial pressure profile parameters, employing an MCMC analysis. These profiles are well detected over a much wider range of cluster mass and radius than in previous work, showing a clear trend towards larger break radius with increasing cluster mass. Our SZ-based masses fall ˜16% below the mass-richness relations from weak lensing, analogous to the "hydrostatic bias" associated with X-ray derived masses. Finally, we derive a tight Y500-M500 relation over a wide range of cluster mass, with a power law slope equal to 1.70 ± 0.07, that agrees well with the independent slope obtained by the Planck team with an SZ-selected cluster sample, but extends to lower masses with higher precision.

  6. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    PubMed

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia who currently met fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R), and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia meeting similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P <0.0001) among the four clusters. The four clusters divided the sample into severity levels: Cluster 1 reflects the lowest average levels across all symptoms, and Cluster 4 reflects the highest. Clusters 2 and 3 capture moderate symptom levels and differed mainly in profiles of anxiety and depression, with Cluster 2 having lower levels of depression and anxiety than Cluster 3, despite higher levels of pain. The results of the cluster analysis of the external sample (n = 478) were very similar to those of the original cluster analysis, except for a slight difference in sleep problems, even though patients in the validation sample were significantly younger (P <0.0001) and had more severe symptoms (higher FIQ-R total scores; P = 0.0004). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.
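    The hierarchical agglomerative step used in this kind of subgrouping can be sketched in a few lines. The sketch below assumes toy 2-D points standing in for the study's multi-dimensional symptom-profile vectors, and uses average linkage on Euclidean distance; the actual study's linkage and distance choices are not stated here.

```python
import math

def agglomerative(points, k):
    """Average-linkage agglomerative clustering: start with singletons and
    repeatedly merge the two clusters with the smallest mean pairwise
    distance until k clusters remain. Minimal sketch, not a tuned tool."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):  # mean pairwise Euclidean distance between clusters
        return sum(math.dist(points[a], points[b])
                   for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
        del clusters[j]
    return clusters
```

    In practice the number of clusters k (here fixed in advance, like the study's four-cluster solution) is chosen by inspecting the dendrogram or a fit criterion.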

  7. Prevalence of the Chloroflexi-Related SAR202 Bacterioplankton Cluster throughout the Mesopelagic Zone and Deep Ocean†

    PubMed Central

    Morris, R. M.; Rappé, M. S.; Urbach, E.; Connon, S. A.; Giovannoni, S. J.

    2004-01-01

    Since their initial discovery in samples from the north Atlantic Ocean, 16S rRNA genes related to the environmental gene clone cluster known as SAR202 have been recovered from pelagic freshwater, marine sediment, soil, and deep subsurface terrestrial environments. Together, these clones form a major, monophyletic subgroup of the phylum Chloroflexi. While members of this diverse group are consistently identified in the marine environment, there are currently no cultured representatives, and very little is known about their distribution or abundance in the world's oceans. In this study, published and newly identified SAR202-related 16S rRNA gene sequences were used to further resolve the phylogeny of this cluster and to design taxon-specific oligonucleotide probes for fluorescence in situ hybridization. Direct cell counts from the Bermuda Atlantic time series study site in the north Atlantic Ocean, the Hawaii ocean time series site in the central Pacific Ocean, and along the Newport hydroline in eastern Pacific coastal waters showed that SAR202 cluster cells were most abundant below the deep chlorophyll maximum and that they persisted to 3,600 m in the Atlantic Ocean and to 4,000 m in the Pacific Ocean, the deepest samples used in this study. On average, members of the SAR202 group accounted for 10.2% (±5.7%) of all DNA-containing bacterioplankton between 500 and 4,000 m. PMID:15128540

  8. Uncertainties in the cluster-cluster correlation function

    NASA Astrophysics Data System (ADS)

    Ling, E. N.; Frenk, C. S.; Barrow, J. D.

    1986-12-01

    The bootstrap resampling technique is applied to estimate sampling errors and significance levels of the two-point correlation functions determined for a subset of the CfA redshift survey of galaxies and a redshift sample of 104 Abell clusters. The angular correlation function for a sample of 1664 Abell clusters is also calculated. The standard errors in xi(r) for the Abell data are found to be considerably larger than quoted 'Poisson errors'. The best estimate for the ratio of the correlation length of Abell clusters (richness class R greater than or equal to 1, distance class D less than or equal to 4) to that of CfA galaxies is 4.2 +1.4/-1.0 (68th-percentile error). The enhancement of cluster clustering over galaxy clustering is statistically significant in the presence of resampling errors. The uncertainties found do not include the effects of possible systematic biases in the galaxy and cluster catalogs and could be regarded as lower bounds on the true uncertainty range.
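    The bootstrap resampling step itself is generic and easy to sketch. Here it is applied to the mean of a toy sample rather than to a correlation function, purely for illustration of how resampling with replacement turns a single data set into an error bar.

```python
import random
import statistics

def bootstrap_se(sample, stat=statistics.mean, n_boot=2000, seed=1):
    """Estimate the sampling error of a statistic by bootstrap resampling:
    draw n_boot resamples of the same size with replacement, recompute the
    statistic on each, and take the spread across resamples."""
    rng = random.Random(seed)
    n = len(sample)
    reps = [stat([sample[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]
    return statistics.stdev(reps)
```

    For a correlation function, `stat` would instead recompute xi(r) from the resampled catalog; the spread across resamples is the bootstrap error the abstract contrasts with the (smaller) Poisson errors.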

  9. Sampling procedures for throughfall monitoring: A simulation study

    NASA Astrophysics Data System (ADS)

    Zimmermann, Beate; Zimmermann, Alexander; Lark, Richard Murray; Elsenbeer, Helmut

    2010-01-01

    What is the most appropriate sampling scheme to estimate event-based average throughfall? A satisfactory answer to this seemingly simple question has yet to be found, a failure which we attribute to previous efforts' dependence on empirical studies. Here we try to answer this question by simulating stochastic throughfall fields based on parameters for statistical models of large monitoring data sets. We subsequently sampled these fields with different sampling designs and variable sample supports. We evaluated the performance of a particular sampling scheme with respect to the uncertainty of possible estimated means of throughfall volumes. Even for a relative error limit of 20%, an impractically large number of small, funnel-type collectors would be required to estimate mean throughfall, particularly for small events. While stratification of the target area is not superior to simple random sampling, cluster random sampling involves the risk of being less efficient. A larger sample support, e.g., the use of trough-type collectors, considerably reduces the necessary sample sizes and eliminates the sensitivity of the mean to outliers. Since the gain in time associated with the manual handling of troughs versus funnels depends on the local precipitation regime, the employment of automatically recording clusters of long troughs emerges as the most promising sampling scheme. Even so, a relative error of less than 5% appears out of reach for throughfall under heterogeneous canopies. We therefore suspect a considerable uncertainty of input parameters for interception models derived from measured throughfall, in particular, for those requiring data of small throughfall events.
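    The simulation logic described above can be sketched with a hedged Monte-Carlo toy: repeatedly sample a synthetic, spatially smooth throughfall field under two designs and compare the spread of the estimated mean. The 1-D sinusoidal field, the run length of 5, and the cost-free comparison are illustrative assumptions, not the paper's stochastic field models.

```python
import math
import random
import statistics

def estimator_spread(design, field, n, reps=500, seed=0):
    """Monte-Carlo evaluation of a sampling scheme: repeatedly draw n
    collector positions from a synthetic field and return the standard
    deviation of the estimated mean across repetitions."""
    rng = random.Random(seed)
    size, means = len(field), []
    for _ in range(reps):
        if design == "srs":                      # simple random sampling
            idx = rng.sample(range(size), n)
        else:                                    # cluster sampling: contiguous runs of 5
            starts = rng.sample(range(size - 4), n // 5)
            idx = [s + o for s in starts for o in range(5)]
        means.append(statistics.mean(field[i] for i in idx))
    return statistics.stdev(means)

# A spatially smooth (autocorrelated) field: nearby collectors are largely
# redundant, so clustered collectors estimate the mean less precisely than
# the same number of randomly placed ones -- the risk the abstract notes.
field = [5.0 + 4.0 * math.sin(i / 15.0) for i in range(300)]
```

    Running `estimator_spread("cluster", field, 20)` against `estimator_spread("srs", field, 20)` reproduces the qualitative finding that cluster random sampling risks being less efficient under spatial autocorrelation; the paper's recommendation of trough-type collectors corresponds to enlarging the sample support, which this toy does not model.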

  10. Dynamics of cD Clusters of Galaxies. 4; Conclusion of a Survey of 25 Abell Clusters

    NASA Technical Reports Server (NTRS)

    Oegerle, William R.; Hill, John M.; Fisher, Richard R. (Technical Monitor)

    2001-01-01

    We present the final results of a spectroscopic study of a sample of cD galaxy clusters. The goal of this program has been to study the dynamics of the clusters, with emphasis on determining the nature and frequency of cD galaxies with peculiar velocities. Redshifts measured with the MX Spectrometer have been combined with those obtained from the literature to obtain typically 50 - 150 observed velocities in each of 25 galaxy clusters containing a central cD galaxy. We present a dynamical analysis of the final 11 clusters to be observed in this sample. All 25 clusters are analyzed in a uniform manner to test for the presence of substructure, and to determine peculiar velocities and their statistical significance for the central cD galaxy. These peculiar velocities were used to determine whether or not the central cD galaxy is at rest in the cluster potential well. We find that 30 - 50% of the clusters in our sample possess significant subclustering (depending on the cluster radius used in the analysis), which is in agreement with other studies of non-cD clusters. Hence, the dynamical state of cD clusters is no different from that of other present-day clusters. After careful study, four of the clusters appear to have a cD galaxy with a significant peculiar velocity. Dressler-Shectman tests indicate that three of these four clusters have statistically significant substructure within 1.5/h(sub 75) Mpc of the cluster center. The dispersion of the cD peculiar velocities is 164 +41/-34 km/s around the mean cluster velocity. This represents a significant detection of peculiar cD velocities, but at a level which is far below the mean velocity dispersion for this sample of clusters. The picture that emerges is one in which cD galaxies are nearly at rest with respect to the cluster potential well, but have small residual velocities due to subcluster mergers.
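    The Dressler-Shectman substructure test mentioned above can be sketched from its standard published definition. This is a generic implementation of the delta statistic with `n_local` nearest sky neighbours, not the authors' code; significance is normally assessed by recomputing the statistic on many velocity-shuffled realisations.

```python
import math
import statistics

def ds_statistic(positions, velocities, n_local=5):
    """Dressler-Shectman substructure statistic (standard formulation):
    for each galaxy, compare the mean velocity and dispersion of the galaxy
    plus its n_local nearest sky neighbours with the global values; large
    summed deviations indicate kinematic substructure."""
    n = len(velocities)
    v_mean = statistics.mean(velocities)
    sigma = statistics.stdev(velocities)
    delta_sum = 0.0
    for i in range(n):
        order = sorted(range(n), key=lambda j: math.dist(positions[i], positions[j]))
        group = [velocities[j] for j in order[: n_local + 1]]   # self + neighbours
        v_loc = statistics.mean(group)
        s_loc = statistics.stdev(group)
        delta_sq = ((n_local + 1) / sigma ** 2) * ((v_loc - v_mean) ** 2
                                                   + (s_loc - sigma) ** 2)
        delta_sum += math.sqrt(delta_sq)
    return delta_sum  # calibrate against velocity-shuffled realisations
```

    A cluster with a kinematically distinct spatial clump yields a larger statistic than the same velocities scattered randomly across the sky positions, which is the signal the tests cited in the abstract look for.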

  11. Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

    PubMed Central

    Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

    2016-01-01

    Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969

  12. Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

    PubMed

    Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

    2015-10-01

    Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
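    The core operation in both studies above, clustering 0/1 code vectors from coded interviews, can be sketched with a two-cluster k-modes pass (k-means adapted to binary data: Hamming distance to a 0/1 centroid, centroids updated by majority vote). This is a hedged, deterministic toy, not the papers' analysis pipeline; real use would compare hierarchical, k-means, and latent class solutions as the studies do.

```python
def two_modes(rows):
    """Partition binary code vectors (one row per interview, one 0/1 entry
    per code) into two groups. Minimal k-modes sketch with deterministic
    initialisation: the first row and the row farthest from it (Hamming)."""
    def ham(a, b):
        return sum(x != y for x, y in zip(a, b))

    cents = [rows[0], max(rows, key=lambda r: ham(r, rows[0]))]
    for _ in range(20):
        groups = [[], []]
        for r in rows:
            groups[0 if ham(r, cents[0]) <= ham(r, cents[1]) else 1].append(r)
        cents = [
            tuple(int(2 * sum(col) >= len(g)) for col in zip(*g)) if g else cents[j]
            for j, g in enumerate(groups)           # majority vote per code
        ]
    return groups
```

    The simulation finding cited above, that accuracy holds up with samples as small as 50, is encouraging for exactly this kind of small-sample binary clustering, though the choice among hierarchical, k-means, and latent class methods remains a modelling decision.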

  13. X-Ray Temperatures, Luminosities, and Masses from XMM-Newton Follow-up of the First Shear-selected Galaxy Cluster Sample

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deshpande, Amruta J.; Hughes, John P.; Wittman, David, E-mail: amrejd@physics.rutgers.edu, E-mail: jph@physics.rutgers.edu, E-mail: dwittman@physics.ucdavis.edu

    We continue the study of the first sample of shear-selected clusters from the initial 8.6 square degrees of the Deep Lens Survey (DLS); a sample with well-defined selection criteria corresponding to the highest ranked shear peaks in the survey area. We aim to characterize the weak lensing selection by examining the sample’s X-ray properties. There are multiple X-ray clusters associated with nearly all the shear peaks: 14 X-ray clusters corresponding to seven DLS shear peaks. An additional three X-ray clusters cannot be definitively associated with shear peaks, mainly due to large positional offsets between the X-ray centroid and the shear peak. Here we report on the XMM-Newton properties of the 17 X-ray clusters. The X-ray clusters display a wide range of luminosities and temperatures; the L {sub X} − T {sub X} relation we determine for the shear-associated X-ray clusters is consistent with X-ray cluster samples selected without regard to dynamical state, while it is inconsistent with self-similarity. For a subset of the sample, we measure X-ray masses using temperature as a proxy, and compare to weak lensing masses determined by the DLS team. The resulting mass comparison is consistent with equality. The X-ray and weak lensing masses show considerable intrinsic scatter (∼48%), which is consistent with X-ray selected samples when their X-ray and weak lensing masses are independently determined.

  14. 75 FR 16424 - Proposed Information Collection; Comment Request; Census Coverage Measurement Final Housing Unit...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-01

    ... unit is a block cluster, which consists of one or more geographically contiguous census blocks. As in... a number of distinct processes, ranging from forming block clusters, selecting the block clusters... sample of block clusters, while the E Sample is the census of housing units and enumerations in the same...

  15. Correlates of comorbid depression, anxiety and helplessness with obsessive-compulsive disorder in Chinese adolescents.

    PubMed

    Sun, Jing; Li, Zhanjiang; Buys, Nicholas; Storch, Eric A

    2015-03-15

    Youth with obsessive-compulsive disorder (OCD) are at risk of experiencing comorbid psychiatric conditions, such as depression and anxiety. Studies of Chinese adolescents with OCD are limited. The aim of this study was to investigate the association of depression, anxiety, and helplessness with the occurrence of OCD in Chinese adolescents. This study consisted of two stages. The first stage used a cross-sectional design involving a stratified clustered non-clinical sample of 3174 secondary school students. A clinical interview procedure was then employed to diagnose OCD in students who had a Leyton 'yes' score of 15 or above. The second phase used a case-control study design to examine the relationship of OCD to depression, anxiety, and helplessness in a matched sample of 288 adolescents with clinically diagnosed OCD and 246 students without OCD. Helplessness, depression, and anxiety scores were directly associated with the probability of OCD caseness. Canonical correlation analysis indicated that OCD correlated significantly with depression, anxiety, and helplessness. Cluster analysis further indicated that OCD severity is also associated with the severity of depression and anxiety and the level of helplessness. These findings suggest that depression, anxiety, and helplessness are important correlates of OCD in Chinese adolescents. Future studies using longitudinal and prospective designs are required to confirm these relationships as causal. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. An agglomerative hierarchical clustering approach to visualisation in Bayesian clustering problems

    PubMed Central

    Dawson, Kevin J.; Belkhir, Khalid

    2009-01-01

    Clustering problems (including the clustering of individuals into outcrossing populations, hybrid generations, full-sib families, and selfing lines) have recently received much attention in population genetics. In these clustering problems, the parameter of interest is a partition of the set of sampled individuals: the sample partition. In a fully Bayesian approach to clustering problems of this type, our knowledge about the sample partition is represented by a probability distribution on the space of possible sample partitions. Since the number of possible partitions grows very rapidly with the sample size, we cannot visualise this probability distribution in its entirety unless the sample is very small. As a solution to this visualisation problem, we recommend using an agglomerative hierarchical clustering algorithm, which we call the exact linkage algorithm. This algorithm is a special case of the maximin clustering algorithm that we introduced previously. The exact linkage algorithm is now implemented in our software package Partition View. The exact linkage algorithm takes the posterior co-assignment probabilities as input, and yields as output a rooted binary tree or, more generally, a forest of such trees. Each node of this forest defines a set of individuals, and the node height is the posterior co-assignment probability of this set. This provides a useful visual representation of the uncertainty associated with the assignment of individuals to categories. It is also a useful starting point for a more detailed exploration of the posterior distribution in terms of the co-assignment probabilities. PMID:19337306
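    The idea of building a tree from posterior co-assignment probabilities can be sketched as follows. The merge rule below (join the pair of clusters whose minimum pairwise co-assignment probability is largest, and record that probability as the node height) is a maximin-style stand-in written from the abstract's description; it is not necessarily identical to the exact linkage algorithm as implemented in Partition View.

```python
def coassignment_tree(P):
    """Greedy merge history from a posterior co-assignment probability
    matrix P (P[a][b] = posterior probability that individuals a and b are
    co-assigned). Each merge records the joined cluster and its height."""
    clusters = [[i] for i in range(len(P))]
    merges = []
    while len(clusters) > 1:
        best, bi, bj = -1.0, 0, 1
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # maximin-style linkage: worst-case pairwise co-assignment
                h = min(P[a][b] for a in clusters[i] for b in clusters[j])
                if h > best:
                    best, bi, bj = h, i, j
        merges.append((sorted(clusters[bi] + clusters[bj]), best))
        clusters[bi] = clusters[bi] + clusters[bj]
        del clusters[bj]
    return merges
```

    Reading the merge list bottom-up gives the rooted tree the abstract describes: tight groups (high co-assignment probability) join first, and the low heights of later merges visualise the assignment uncertainty.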

  17. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches.

    PubMed

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2014-07-10

    In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters is usually not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.
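    For intuition, the classic effect-only result for the optimal number of subjects per cluster under a simple cost model (a fixed cost per cluster plus a cost per subject) can be sketched. This is the textbook formula, offered as background; it is not the bivariate cost-effectiveness model derived in the paper, which additionally involves cost ICCs, cost-effect correlations, and the variance ratio.

```python
import math

def optimal_cluster_size(cost_cluster, cost_subject, icc):
    """Textbook optimal subjects per cluster for a cluster randomized trial:
        n* = sqrt((c_cluster / c_subject) * (1 - icc) / icc)
    Larger cluster overheads or smaller ICCs favour bigger clusters."""
    return math.sqrt((cost_cluster / cost_subject) * (1 - icc) / icc)

def design_effect(n, icc):
    """Variance inflation of a cluster sample of size-n clusters relative to
    simple random sampling with the same total number of subjects."""
    return 1 + (n - 1) * icc
```

    The local-optimality problem the abstract raises is visible here too: n* depends on the ICC, which is rarely known at the planning stage, and this is what motivates the maximin alternative.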

  18. VizieR Online Data Catalog: LAMOST survey of star clusters in M31. II. (Chen+, 2016)

    NASA Astrophysics Data System (ADS)

    Chen, B.; Liu, X.; Xiang, M.; Yuan, H.; Huang, Y.; Shi, J.; Fan, Z.; Huo, Z.; Wang, C.; Ren, J.; Tian, Z.; Zhang, H.; Liu, G.; Cao, Z.; Zhang, Y.; Hou, Y.; Wang, Y.

    2016-09-01

    We select a sample of 306 massive star clusters observed with the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) in the vicinity fields of M31 and M33. Massive clusters in our sample are all selected from the catalog presented in Paper I (Chen et al. 2015, Cat. J/other/RAA/15.1392), including five newly discovered clusters selected with the SDSS photometry, three newly confirmed, and 298 previously known clusters from the Revised Bologna Catalogue (RBC; Galleti et al. 2012, Cat. V/143; http://www.bo.astro.it/M31/). Since then another two objects, B341 and B207, have also been observed with LAMOST, and they are included in the current analysis. The current sample does not include objects listed in Paper I that were selected from Johnson et al. 2012 (Cat. J/ApJ/752/95), since most of them are young but not very massive. All objects were observed with LAMOST between 2011 September and 2014 June. Table 1 lists the name, position, and radial velocity of all sample clusters analyzed in the current work. The LAMOST spectra cover the wavelength range 3700-9000Å at a resolving power of R~1800. Details about the observations and data reduction can be found in Paper I. The median signal-to-noise ratios (S/N) per pixel at 4750 and 7450Å of the spectra of all clusters in the current sample are, respectively, 14 and 37. Essentially all spectra have S/N(4750Å)>5 except for the spectra of 18 clusters; the latter have S/N(7450Å)>10. Peacock et al. 2010 (Cat. J/MNRAS/402/803) retrieved images of M31 star clusters and candidates from the SDSS archive and extracted ugriz aperture photometric magnitudes for those objects using SExtractor. They present a catalog containing homogeneous ugriz photometry of 572 star clusters and 373 candidates. Among them, 299 clusters are in our sample. (2 data files).

  19. CHEERS: The chemical evolution RGS sample

    NASA Astrophysics Data System (ADS)

    de Plaa, J.; Kaastra, J. S.; Werner, N.; Pinto, C.; Kosec, P.; Zhang, Y.-Y.; Mernier, F.; Lovisari, L.; Akamatsu, H.; Schellenberger, G.; Hofmann, F.; Reiprich, T. H.; Finoguenov, A.; Ahoranta, J.; Sanders, J. S.; Fabian, A. C.; Pols, O.; Simionescu, A.; Vink, J.; Böhringer, H.

    2017-11-01

    Context. The chemical yields of supernovae and the metal enrichment of the intra-cluster medium (ICM) are not well understood. The hot gas in clusters of galaxies has been enriched with metals originating from billions of supernovae and provides a fair sample of large-scale metal enrichment in the Universe. High-resolution X-ray spectra of clusters of galaxies provide a unique way of measuring abundances in the hot ICM. The abundance measurements can provide constraints on the supernova explosion mechanism and the initial-mass function of the stellar population. This paper introduces the CHEmical Enrichment RGS Sample (CHEERS), which is a sample of 44 bright local giant ellipticals, groups, and clusters of galaxies observed with XMM-Newton. Aims: The CHEERS project aims to provide the most accurate set of cluster abundances measured in X-rays using this sample. This paper focuses specifically on the abundance measurements of O and Fe using the reflection grating spectrometer (RGS) on board XMM-Newton. We aim to thoroughly discuss the cluster-to-cluster abundance variations and the robustness of the measurements. Methods: We have selected the CHEERS sample such that the oxygen abundance in each cluster is detected at a level of at least 5σ in the RGS. The dispersive nature of the RGS limits the sample to clusters with sharp surface brightness peaks. The deep exposures and the size of the sample allow us to quantify the intrinsic scatter and the systematic uncertainties in the abundances using spectral modeling techniques. Results: We report the oxygen and iron abundances as measured with RGS in the core regions of all 44 clusters in the sample. We do not find a significant trend of O/Fe as a function of cluster temperature, but we do find an intrinsic scatter in the O and Fe abundances from cluster to cluster. 
The level of systematic uncertainties in the O/Fe ratio is estimated to be around 20-30%, while the systematic uncertainties in the absolute O and Fe abundances can be as high as 50% in extreme cases. Thanks to the high statistics of the observations, we were able to identify and correct a systematic bias in the oxygen abundance determination that was due to an inaccuracy in the spectral model. Conclusions: The lack of dependence of O/Fe on temperature suggests that the enrichment of the ICM does not depend on cluster mass and that most of the enrichment likely took place before the ICM was formed. We find that the observed scatter in the O/Fe ratio is due to a combination of intrinsic scatter in the source and systematic uncertainties in the spectral fitting, which we are unable to separate. The astrophysical source of intrinsic scatter could be due to differences in active galactic nucleus activity and ongoing star formation in the brightest cluster galaxy. The systematic scatter is due to uncertainties in the spatial line broadening, absorption column, multi-temperature structure, and the thermal plasma models.

  20. Cluster Masses Derived from X-ray and Sunyaev-Zeldovich Effect Measurements

    NASA Technical Reports Server (NTRS)

    Laroque, S.; Joy, Marshall; Bonamente, M.; Carlstrom, J.; Dawson, K.

    2003-01-01

    We infer the gas mass and total gravitational mass of 11 clusters using two different methods: analysis of X-ray data from the Chandra X-ray Observatory and analysis of centimeter-wave Sunyaev-Zel'dovich Effect (SZE) data from the BIMA and OVRO interferometers. This flux-limited sample of clusters from the BCS cluster catalogue was chosen so as to be well above the surface brightness limit of the ROSAT All Sky Survey; this is therefore an orientation-unbiased sample. The gas mass fraction, f_g, is calculated for each cluster using both X-ray and SZE data, and the results are compared at a fiducial radius of r_500. Comparison of the X-ray and SZE results for this orientation-unbiased sample allows us to constrain cluster systematics, such as clumping of the intracluster medium. We derive an upper limit on Omega_M, assuming that the mass composition of clusters within r_500 reflects the universal mass composition: Omega_M h_100 ≤ Omega_B / f_g. We also demonstrate how the mean f_g derived from the sample can be used to estimate the masses of clusters discovered by upcoming deep SZE surveys.
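The closing constraint can be sketched in a few lines. This is a minimal illustration of the baryon-fraction argument (not the paper's pipeline): if clusters are fair samples of the universal composition, f_g ≤ Omega_B / Omega_M, so a measured f_g bounds Omega_M from above. The numerical values are assumptions for illustration, and the h_100 scaling is omitted for simplicity.

```python
def omega_m_upper_limit(omega_b: float, f_gas: float) -> float:
    """Upper limit on the matter density parameter from a measured
    cluster gas mass fraction: Omega_M <= Omega_B / f_g."""
    if not 0.0 < f_gas <= 1.0:
        raise ValueError("gas mass fraction must lie in (0, 1]")
    return omega_b / f_gas

# Illustrative numbers only: Omega_B h^2 ~ 0.02 with h ~ 0.7 gives
# Omega_B ~ 0.04; a typical measured gas fraction might be f_g ~ 0.12.
limit = omega_m_upper_limit(omega_b=0.04, f_gas=0.12)
```

With these assumed inputs the bound is Omega_M ≲ 0.33, in line with the kind of constraint such comparisons yield.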

  1. Physical properties of star clusters in the outer LMC as observed by the DES

    DOE PAGES

    Pieres, A.; Santiago, B.; Balbinot, E.; ...

    2016-05-26

    The Large Magellanic Cloud (LMC) harbors a rich and diverse system of star clusters, whose ages, chemical abundances, and positions provide information about the LMC history of star formation. We use Science Verification imaging data from the Dark Energy Survey to increase the census of known star clusters in the outer LMC and to derive physical parameters for a large sample of such objects using a spatially and photometrically homogeneous data set. Our sample contains 255 visually identified cluster candidates, of which 109 were not listed in any previous catalog. We quantify the crowding effect for the stellar sample produced by the DES Data Management pipeline and conclude that the stellar completeness is < 10% inside typical LMC cluster cores. We therefore develop a pipeline to sample and measure stellar magnitudes and positions around the cluster candidates using DAOPHOT. We also implement a maximum-likelihood method to fit individual density profiles and colour-magnitude diagrams. For 117 (from a total of 255) of the cluster candidates (28 uncatalogued clusters), we obtain reliable ages, metallicities, distance moduli and structural parameters, confirming their nature as physical systems. The distribution of cluster metallicities shows a radial dependence, with no clusters more metal-rich than [Fe/H] ~ -0.7 beyond 8 kpc from the LMC center. Furthermore, the age distribution has two peaks at ≃ 1.2 Gyr and ≃ 2.7 Gyr.
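The maximum-likelihood profile-fitting step can be illustrated with a toy model. The sketch below assumes a simple exponential surface-density profile, Sigma(r) ∝ exp(-r/r0), for which the radius pdf is p(r) = r exp(-r/r0) / r0² and the MLE of the scale radius has the closed form r0 = mean(r)/2; the paper's actual profile models and DAOPHOT photometry are more involved.

```python
import math
import random

def fit_scale_radius(radii):
    """Closed-form MLE of the exponential scale radius r0,
    maximizing sum(log p(r_i)) for p(r) = r * exp(-r/r0) / r0**2."""
    if not radii:
        raise ValueError("need at least one star")
    return sum(radii) / (2 * len(radii))

def log_likelihood(radii, r0):
    """Log-likelihood of the assumed profile at scale radius r0."""
    return sum(math.log(r) - r / r0 - 2 * math.log(r0) for r in radii)

# Synthetic cluster: p(r) ∝ r exp(-r/r0) is a Gamma(shape=2, scale=r0) law.
rng = random.Random(42)
true_r0 = 1.5
radii = [rng.gammavariate(2.0, true_r0) for _ in range(5000)]

r0_hat = fit_scale_radius(radii)  # should recover ~1.5
```

The closed form makes the MLE trivially testable; a realistic pipeline would instead maximize the likelihood numerically over several structural parameters plus a background term.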

  2. Physical properties of star clusters in the outer LMC as observed by the DES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pieres, A.; Santiago, B.; Balbinot, E.

    The Large Magellanic Cloud (LMC) harbors a rich and diverse system of star clusters, whose ages, chemical abundances, and positions provide information about the LMC history of star formation. We use Science Verification imaging data from the Dark Energy Survey to increase the census of known star clusters in the outer LMC and to derive physical parameters for a large sample of such objects using a spatially and photometrically homogeneous data set. Our sample contains 255 visually identified cluster candidates, of which 109 were not listed in any previous catalog. We quantify the crowding effect for the stellar sample produced by the DES Data Management pipeline and conclude that the stellar completeness is < 10% inside typical LMC cluster cores. We therefore develop a pipeline to sample and measure stellar magnitudes and positions around the cluster candidates using DAOPHOT. We also implement a maximum-likelihood method to fit individual density profiles and colour-magnitude diagrams. For 117 (from a total of 255) of the cluster candidates (28 uncatalogued clusters), we obtain reliable ages, metallicities, distance moduli and structural parameters, confirming their nature as physical systems. The distribution of cluster metallicities shows a radial dependence, with no clusters more metal-rich than [Fe/H] ~ -0.7 beyond 8 kpc from the LMC center. Furthermore, the age distribution has two peaks at ≃ 1.2 Gyr and ≃ 2.7 Gyr.

  3. Blastodinium spp. infect copepods in the ultra-oligotrophic marine waters of the Mediterranean Sea

    NASA Astrophysics Data System (ADS)

    Alves-de-Souza, C.; Cornet, C.; Nowaczyk, A.; Gasparini, S.; Skovgaard, A.; Guillou, L.

    2011-08-01

    Blastodinium are chloroplast-containing dinoflagellates which infect a wide range of copepods. They develop inside the gut of their host, where they produce successive generations of sporocytes that are eventually expelled through the anus of the copepod. Here, we report on copepod infections in the oligotrophic to ultra-oligotrophic waters of the Mediterranean Sea sampled during the BOUM cruise. Based on a DNA-stain screening of gut contents, 16 % of copepods in samples from the Eastern Mediterranean were possibly infected, with up to 51 % of Corycaeidae, 33 % of Calanoida, but less than 2 % of Oithonidae and Oncaeidae. Parasites were classified into distinct morphotypes, with some tentatively assigned to species B. mangini, B. contortum, and B. cf. spinulosum. Based upon the SSU rDNA gene sequence analyses of 15 individuals, the genus Blastodinium was found to be polyphyletic, containing at least three independent clusters. The first cluster grouped all sequences retrieved from parasites of Corycaeidae and Oncaeidae during this study, and included sequences of Blastodinium mangini (the "mangini" cluster). Sequences from cells infecting Calanoida belonged to two different clusters, one including B. contortum (the "contortum" cluster), and the other uniting all B. spinulosum-like morphotypes (the "spinulosum" cluster). Cluster-specific oligonucleotidic probes were designed and tested by fluorescence in situ hybridization (FISH) in order to assess the distribution of dinospores, the Blastodinium dispersal and infecting stage. Probe-positive cells were all small thecate dinoflagellates, with lengths ranging from 7 to 18 μm. Maximal abundances of Blastodinium dinospores were detected at the Deep Chlorophyll Maximum (DCM) or slightly below. This was in contrast to distributions of autotrophic pico- and nanoplankton, microplanktonic dinoflagellates, and nauplii, which showed maximal concentrations above the DCM. 
The distinct distribution of dinospores and nauplii argues against infection during the naupliar stage. Dinospores, described as autotrophic in the literature, may escape the severe nutrient limitation of ultra-oligotrophic ecosystems by living inside copepods.

  4. Blastodinium spp. infect copepods in the ultra-oligotrophic marine waters of the Mediterranean Sea

    NASA Astrophysics Data System (ADS)

    Alves-de-Souza, C.; Cornet, C.; Nowaczyk, A.; Gasparini, S.; Skovgaard, A.; Guillou, L.

    2011-03-01

    Blastodinium are chloroplast-containing dinoflagellates which infect a wide range of copepods. They develop inside the gut of their host, where they produce successive generations of sporocytes that are eventually expelled through the anus of the copepod. Here, we report on copepod infections in the oligotrophic to ultra-oligotrophic waters of the Mediterranean Sea sampled during the BOUM cruise. Based on a DNA-stain screening of gut contents, 16% of copepods were possibly infected in samples from the Eastern Mediterranean, with up to 51% of Corycaeidae, 33% of Calanoida, but less than 2% of Oithonidae and Oncaeidae. Parasites were classified into distinct morphotypes, with some tentatively assigned to species B. mangini, B. contortum, and B. cf. spinulosum. Based upon the SSU rDNA gene sequence analyses of 15 individuals, the genus Blastodinium was found to be polyphyletic, containing at least three independent clusters. The first cluster grouped all sequences retrieved from parasites of Corycaeidae and Oncaeidae during this study, and included sequences of Blastodinium mangini (the "mangini" cluster). Sequences from cells infecting Calanoida belonged to two different clusters, one including B. contortum (the "contortum" cluster), and the other uniting all B. spinulosum-like morphotypes (the "spinulosum" cluster). Cluster-specific oligonucleotidic probes were designed and tested by FISH in order to assess the distribution of dinospores, the Blastodinium dispersal and infecting stage. Probe-positive cells were all small thecate dinoflagellates, with lengths ranging from 7 to 18 μm. Maximal abundances of Blastodinium dinospores were detected at the Deep Chlorophyll Maximum (DCM) or slightly below. This was in contrast to distributions of autotrophic pico- and nanoplankton, microplanktonic dinoflagellates, and nauplii which showed maximal concentrations above the DCM. 
The distinct distributions of dinospores and nauplii argue against infection during the naupliar stage. Blastodinium, described as autotrophic in the literature, may escape the severe nutrient limitation of ultra-oligotrophic ecosystems by living inside copepods.

  5. X-ray morphological study of galaxy cluster catalogues

    NASA Astrophysics Data System (ADS)

    Democles, Jessica; Pierre, Marguerite; Arnaud, Monique

    2016-07-01

    Context: The intra-cluster medium distribution, as probed by X-ray morphology-based analysis, gives a good indication of the system's dynamical state. In the race for the determination of precise scaling relations and understanding their scatter, the dynamical state offers valuable information. Method: We develop the analysis of the centroid shift so that it can be applied to characterize galaxy cluster surveys such as the XXL survey or high-redshift cluster samples. We use it together with the surface brightness concentration parameter and the offset between the X-ray peak and the brightest cluster galaxy in the context of the XXL bright cluster sample (Pacaud et al. 2015) and a set of high-redshift massive clusters detected by Planck and SPT and observed by both the XMM-Newton and Chandra observatories. Results: Using the wide redshift coverage of the XXL sample, we see no trend in the dynamical state of the systems with redshift.

  6. Planck/SDSS cluster mass and gas scaling relations for a volume-complete redMaPPer sample

    NASA Astrophysics Data System (ADS)

    Jimeno, Pablo; Diego, Jose M.; Broadhurst, Tom; De Martino, I.; Lazkoz, Ruth

    2018-07-01

    Using Planck satellite data, we construct Sunyaev-Zel'dovich (SZ) gas pressure profiles for a large, volume-complete sample of optically selected clusters. We have defined a sample of over 8000 redMaPPer clusters from the Sloan Digital Sky Survey, within the volume-complete redshift region 0.100

  7. Toward An Understanding of Cluster Evolution: A Deep X-Ray Selected Cluster Catalog from ROSAT

    NASA Technical Reports Server (NTRS)

    Jones, Christine; Oliversen, Ronald (Technical Monitor)

    2002-01-01

    In the past year, we have focussed on studying individual clusters found in this sample with Chandra, as well as using Chandra to measure the luminosity-temperature relation for a sample of distant clusters identified through the ROSAT study, and finally we are continuing our study of fossil groups. For the luminosity-temperature study, we compared a sample of nearby clusters with a sample of distant clusters and, for the first time, measured a significant change in the relation as a function of redshift (Vikhlinin et al. in final preparation for submission to Cape). We also used our ROSAT analysis to select individual clusters and propose them for Chandra observations. We are now analyzing the Chandra observations of the distant cluster A520, which appears to have undergone a recent merger. Finally, we have completed the analysis of the fossil groups identified in ROSAT observations. In the past few months, we have derived X-ray fluxes and luminosities as well as X-ray extents for an initial sample of 89 objects. Based on the X-ray extents and the lack of bright galaxies, we have identified 16 fossil groups. We are comparing their X-ray and optical properties with those of optically rich groups. A paper is being readied for submission (Jones, Forman, and Vikhlinin in preparation).

  8. Ecological tolerances of Miocene larger benthic foraminifera from Indonesia

    NASA Astrophysics Data System (ADS)

    Novak, Vibor; Renema, Willem

    2018-01-01

    To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis which includes environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples; clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 with dominance of BW samples, and cluster 5 showing a mixed composition with both MCS and BW samples. The results of cluster analysis were afterwards subjected to indicator species analysis resulting in the interpretation that generated three groups among LBF taxa: typical assemblage indicators, regularly occurring taxa and rare taxa. By interpreting the results of detrended correspondence analysis, cluster analysis and indicator species analysis, along with environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.
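The clustering-of-samples step can be sketched as follows. The toy abundance table, the number of taxa, and the choice of Bray-Curtis distance with average (UPGMA) linkage are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Rows are samples, columns are relative abundances of (invented) LBF taxa.
abundances = np.array([
    [0.70, 0.20, 0.10],   # MCS-like samples: taxon A dominant
    [0.65, 0.25, 0.10],
    [0.10, 0.15, 0.75],   # BW-like samples: taxon C dominant
    [0.05, 0.20, 0.75],
])

dist = pdist(abundances, metric="braycurtis")       # condensed distance matrix
tree = linkage(dist, method="average")              # UPGMA dendrogram
labels = fcluster(tree, t=2, criterion="maxclust")  # cut the tree into 2 clusters
```

Cutting the dendrogram recovers the MCS-like and BW-like groupings; in the study this step is followed by indicator species analysis on the resulting clusters.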

  9. X-Ray Temperatures, Luminosities, and Masses from XMM-Newton Follow-up of the First Shear-selected Galaxy Cluster Sample

    NASA Astrophysics Data System (ADS)

    Deshpande, Amruta J.; Hughes, John P.; Wittman, David

    2017-04-01

    We continue the study of the first sample of shear-selected clusters from the initial 8.6 square degrees of the Deep Lens Survey (DLS): a sample with well-defined selection criteria corresponding to the highest ranked shear peaks in the survey area. We aim to characterize the weak lensing selection by examining the sample’s X-ray properties. There are multiple X-ray clusters associated with nearly all the shear peaks: 14 X-ray clusters corresponding to seven DLS shear peaks. An additional three X-ray clusters cannot be definitively associated with shear peaks, mainly due to large positional offsets between the X-ray centroid and the shear peak. Here we report on the XMM-Newton properties of the 17 X-ray clusters. The X-ray clusters display a wide range of luminosities and temperatures; the L_X-T_X relation we determine for the shear-associated X-ray clusters is consistent with X-ray cluster samples selected without regard to dynamical state, while it is inconsistent with self-similarity. For a subset of the sample, we measure X-ray masses using temperature as a proxy, and compare to weak lensing masses determined by the DLS team. The resulting mass comparison is consistent with equality. The X-ray and weak lensing masses show considerable intrinsic scatter (~48%), which is consistent with X-ray selected samples when their X-ray and weak lensing masses are independently determined. Some of the data presented herein were obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California, and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation.

  10. Muslim communities learning about second-hand smoke (MCLASS): study protocol for a pilot cluster randomised controlled trial

    PubMed Central

    2013-01-01

    Background In the UK, 40% of Bangladeshi and 29% of Pakistani men smoke cigarettes regularly compared to the national average of 24%. As a consequence, second-hand smoking is also widespread in their households, which is a serious health hazard to non-smokers, especially children. Smoking restrictions in households can help reduce exposure to second-hand smoking. This is a pilot trial of ‘Smoke Free Homes’, an educational programme which has been adapted for use by Muslim faith leaders, in an attempt to find an innovative solution to encourage Pakistani- and Bangladeshi-origin communities to implement smoking restrictions in their homes. The primary objectives for this pilot trial are to establish the feasibility of conducting such an evaluation and provide information to inform the design of a future definitive study. Methods/Design This is a pilot cluster randomised controlled trial of ‘Smoke Free Homes’, with an embedded preliminary health economic evaluation and a qualitative analysis. The trial will be carried out in around 14 Islamic religious settings. Equal randomisation will be employed to allocate each cluster to a trial arm. The intervention group will be offered the Smoke Free Homes package (Smoke Free Homes: a resource for Muslim religious teachers), trained in its use, and will subsequently implement the package in their religious settings. The remaining clusters will not be offered the package until the completion of the study and will form the control group. At each cluster, we aim to recruit around 50 households with at least one adult resident who smokes tobacco and at least one child or a non-smoking adult. Households will complete a household survey and a non-smoking individual will provide a saliva sample, which will be tested for cotinine. All participant outcomes will be measured before and after the intervention period in both arms of the trial. 
In addition, a purposive sample of participants and religious leaders/teachers will take part in interviews and focus groups. Discussion The results of this pilot study will inform the protocol for a definitive trial. Trial registration Current Controlled Trials ISRCTN03035510 PMID:24034853

  11. Clustering behavior in microbial communities from acute endodontic infections.

    PubMed

    Montagner, Francisco; Jacinto, Rogério C; Signoretti, Fernanda G C; Sanches, Paula F; Gomes, Brenda P F A

    2012-02-01

    Acute endodontic infections harbor heterogeneous microbial communities in both the root canal (RC) system and apical tissues. Data comparing the microbial structure and diversity in endodontic infections in related ecosystems, such as RC with necrotic pulp and acute apical abscess (AAA), are scarce in the literature. The aim of this study was to examine the presence of selected endodontic pathogens in paired samples from necrotic RC and AAA using polymerase chain reaction (PCR) followed by the construction of cluster profiles. Paired samples of RC and AAA exudates were collected from 20 subjects and analyzed by PCR for the presence of selected strict and facultative anaerobic strains. The frequency of species was compared between the RC and the AAA samples. A stringent neighboring clustering algorithm was applied to investigate the existence of similar high-order groups of samples. A dendrogram was constructed to show the arrangement of the sample groups produced by the hierarchical clustering. All samples harbored bacterial DNA. Porphyromonas endodontalis, Prevotella nigrescens, Filifactor alocis, and Tannerella forsythia were frequently detected in both RC and AAA samples. The selected anaerobic species were distributed in diverse small bacterial consortia. The samples of RC and AAA that presented at least one of the targeted microorganisms were grouped in small clusters. Anaerobic species were frequently detected in acute endodontic infections, and heterogeneous microbial communities with low clustering behavior were observed in paired samples of RC and AAA. Copyright © 2012. Published by Elsevier Inc.
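A minimal way to compare detection patterns in paired samples like these is a Jaccard index over the sets of detected species, a common ingredient of presence/absence clustering; the species lists below are invented for illustration, not the study's data.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two detected-species sets:
    |intersection| / |union|, with two empty sets taken as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical paired detections for one subject (illustrative only).
rc  = {"P. endodontalis", "P. nigrescens", "F. alocis", "T. forsythia"}
aaa = {"P. endodontalis", "F. alocis", "T. forsythia"}

similarity = jaccard(rc, aaa)  # 3 shared species out of 4 total -> 0.75
```

A matrix of such pairwise similarities (converted to distances as 1 - J) is the kind of input a hierarchical clustering dendrogram is built from.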

  12. The relationship between personality and attainment in 16-19-year-old students in a sixth form college: II: Self-perception, gender and attainment.

    PubMed

    Summerfield, M; Youngman, M

    1999-06-01

    A related paper (Summerfield & Youngman, 1999) has described the development of a scale, the Student Self-Perception Scale (SSPS), designed to explore the relationship between academic self-concept, attainment and personality in sixth form college students. The study aimed to identify groups of students exhibiting varying patterns of relationship using a range of measures including the SSPS. Issues of gender were also examined. The samples comprised a pilot sample of 152 students (aged 16-17 years, from two sixth form colleges) and a main sample of 364 students (mean age 16 years 10 months, range 16:0 to 18:6 years, from one sixth form college). The main sample included similar numbers of male and female students (46% male, 54% female) and ethnic minority students comprised 14% of this sample. Data comprised responses to two personality measures (the SSPS, Summerfield, 1995, and the Nowicki-Strickland Locus of Control Scale, Nowicki & Strickland, 1973), various student and tutor estimates of success, and performance data from college records. Students were classified using relocation cluster analysis and cluster differences were verified using discriminant function analysis. Thirty outcome models were tested using covariance regression analysis. Eight distinct and interpretable groups, consistent with other research, were identified, but the hypothesis of a positive, linear relationship between mastery and academic attainment was not sustained without qualification. Previous attainment was the major determinant of final performance. Gender variations were detected on the personality measures, particularly Confidence of outcomes, Prediction discrepancy, Passivity, Mastery, Dependency and Locus of control, and these were implicated in the cluster characteristics. The results suggest that a non-linear methodology may be required to isolate relationships between self-concept, personality and attainment, especially where gender effects may exist.

  13. Tobacco, Marijuana, and Alcohol Use in University Students: A Cluster Analysis

    PubMed Central

    Primack, Brian A.; Kim, Kevin H.; Shensa, Ariel; Sidani, Jaime E.; Barnett, Tracey E.; Switzer, Galen E.

    2012-01-01

    Objective Segmentation of populations may facilitate development of targeted substance abuse prevention programs. We aimed to partition a national sample of university students according to profiles based on substance use. Participants We used 2008–2009 data from the National College Health Assessment from the American College Health Association. Our sample consisted of 111,245 individuals from 158 institutions. Method We partitioned the sample using cluster analysis according to current substance use behaviors. We examined the association of cluster membership with individual and institutional characteristics. Results Cluster analysis yielded six distinct clusters. Three individual factors—gender, year in school, and fraternity/sorority membership—were the most strongly associated with cluster membership. Conclusions In a large sample of university students, we were able to identify six distinct patterns of substance abuse. It may be valuable to target specific populations of college-aged substance users based on individual factors. However, comprehensive intervention will require a multifaceted approach. PMID:22686360
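The partitioning step can be sketched with a plain k-means, here on invented (tobacco, marijuana, alcohol) use frequencies; the study's own clustering procedure, variables, and number of clusters may differ.

```python
def kmeans(points, centers, iters=20):
    """Plain k-means: assign each point to its nearest center, then move
    each center to the mean of its members. Returns (centers, labels)."""
    for _ in range(iters):
        labels = [min(range(len(centers)),
                      key=lambda j: sum((p - c) ** 2
                                        for p, c in zip(pt, centers[j])))
                  for pt in points]
        for j in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Hypothetical days-per-month of (tobacco, marijuana, alcohol) use.
students = [
    (0, 0, 2), (1, 0, 3), (0, 1, 2),           # light-use profile
    (25, 10, 20), (28, 12, 18), (22, 8, 25),   # heavy-use profile
]
centers, labels = kmeans(students,
                         centers=[[0.0, 0.0, 0.0], [20.0, 10.0, 20.0]])
```

With six clusters instead of two and ~111,000 rows, the same loop structure (or a library implementation) yields the kind of segmentation the abstract describes; cluster membership can then be cross-tabulated against gender, year in school, and fraternity/sorority membership.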

  14. Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review.

    PubMed

    Rutterford, Clare; Taljaard, Monica; Dixon, Stephanie; Copas, Andrew; Eldridge, Sandra

    2015-06-01

    To assess the quality of reporting and accuracy of a priori estimates used in sample size calculations for cluster randomized trials (CRTs). We reviewed 300 CRTs published between 2000 and 2008. The prevalence of reporting sample size elements from the 2004 CONSORT recommendations was evaluated and a priori estimates compared with those observed in the trial. Of the 300 trials, 166 (55%) reported a sample size calculation. Only 36 of 166 (22%) reported all recommended descriptive elements. Elements specific to CRTs were the worst reported: a measure of within-cluster correlation was specified in only 58 of 166 (35%). Only 18 of 166 articles (11%) reported both a priori and observed within-cluster correlation values. Except in two cases, observed within-cluster correlation values were either close to or less than a priori values. Even with the CONSORT extension for cluster randomization, the reporting of sample size elements specific to these trials remains below that necessary for transparent reporting. Journal editors and peer reviewers should implement stricter requirements for authors to follow CONSORT recommendations. Authors should report observed and a priori within-cluster correlation values to enable comparisons between these over a wider range of trials. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  15. Leveraging contact network structure in the design of cluster randomized trials.

    PubMed

    Harling, Guy; Wang, Rui; Onnela, Jukka-Pekka; De Gruttola, Victor

    2017-02-01

    In settings like the Ebola epidemic, where proof-of-principle trials have provided evidence of efficacy but questions remain about the effectiveness of different possible modes of implementation, it may be useful to conduct trials that not only generate information about intervention effects but also themselves provide public health benefit. Cluster randomized trials are of particular value for infectious disease prevention research by virtue of their ability to capture both direct and indirect effects of intervention, the latter of which depends heavily on the nature of contact networks within and across clusters. By leveraging information about these networks (in particular the degree of connection across randomized units, which can be obtained at study baseline), we propose a novel class of connectivity-informed cluster trial designs that aim both to improve public health impact (speed of epidemic control) and to preserve the ability to detect intervention effects. We present several designs for cluster randomized trials with staggered enrollment, in each of which the order of enrollment is based on the total number of ties (contacts) from individuals within a cluster to individuals in other clusters. Our designs can accommodate connectivity based either on the total number of external connections at baseline or on connections only to areas yet to receive the intervention. We further consider a "holdback" version of the designs in which control clusters are held back from re-randomization for some time interval. We investigate the performance of these designs in terms of epidemic control outcomes (time to end of epidemic and cumulative incidence) and power to detect intervention effect, by simulating vaccination trials during an SEIR-type epidemic outbreak using a network-structured agent-based model. We compare results to those of a traditional Stepped Wedge trial. 
In our simulation studies, connectivity-informed designs lead to a 20% reduction in cumulative incidence compared to comparable traditional study designs, but have little impact on epidemic length. Power to detect intervention effect is reduced in all connectivity-informed designs, but "holdback" versions provide power that is very close to that of a traditional Stepped Wedge approach. Incorporating information about cluster connectivity in the design of cluster randomized trials can increase their public health impact, especially in acute outbreak settings. Using this information helps control outbreaks-by minimizing the number of cross-cluster infections-with very modest cost in terms of power to detect effectiveness.
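The enrollment-ordering idea can be sketched on a toy contact network: count each cluster's cross-cluster ties and enroll the most connected clusters first. The network and cluster assignment below are invented for illustration; the paper's designs also cover a variant counting only ties to not-yet-treated areas.

```python
def external_degree(cluster_of, edges):
    """Count cross-cluster edges incident to each cluster.

    cluster_of maps individual -> cluster id; edges are undirected pairs."""
    counts = {c: 0 for c in set(cluster_of.values())}
    for u, v in edges:
        if cluster_of[u] != cluster_of[v]:
            counts[cluster_of[u]] += 1
            counts[cluster_of[v]] += 1
    return counts

# Toy network: five individuals in three clusters.
cluster_of = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]

counts = external_degree(cluster_of, edges)
order = sorted(counts, key=counts.get, reverse=True)  # enroll most-connected first
```

Here cluster 2 has three cross-cluster ties, so a connectivity-informed design would enroll it first, aiming to cut off cross-cluster transmission early.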

  16. Physiogenomic analysis of the Puerto Rican population

    PubMed Central

    Ruaño, Gualberto; Duconge, Jorge; Windemuth, Andreas; Cadilla, Carmen L; Kocherla, Mohan; Villagra, David; Renta, Jessica; Holford, Theodore; Santiago-Borrero, Pedro J

    2009-01-01

    Aims Admixture in the population of the island of Puerto Rico is of general interest with regard to pharmacogenetics and the development of comprehensive strategies for personalized healthcare in Latin Americans. This research was aimed at determining the frequencies of SNPs in key physiological, pharmacological and biochemical genes to infer population structure and ancestry in the Puerto Rican population. Materials & methods A noninterventional, cross-sectional, retrospective study design was implemented following a controlled, stratified-by-region, random sampling protocol. The sample was based on birthrates in each region of the island of Puerto Rico, according to the 2004 National Birth Registry. Genomic DNA samples from 100 newborns were obtained from the Puerto Rico Newborn Screening Program in dried-blood spot cards. Genotyping using a physiogenomic array was performed for 332 SNPs from 196 cardiometabolic and neuroendocrine genes. Population structure was examined using a Bayesian clustering approach as well as by allelic dissimilarity as a measure of allele sharing. Results The Puerto Rican sample was found to be broadly heterogeneous. We observed three main clusters in the population, which we hypothesize to reflect the historical admixture in the Puerto Rican population from Amerindian, African and European ancestors. We present evidence for this interpretation by comparing allele frequencies for the three clusters with those for the same SNPs available from the International HapMap project for Asian, African and European populations. Conclusion Our results demonstrate that population analysis can be performed with a physiogenomic array of cardiometabolic and neuroendocrine genes to facilitate the translation of genome diversity into personalized medicine. PMID:19374515

  17. A PVC/polypyrrole sensor designed for beef taste detection using electrochemical methods and sensory evaluation.

    PubMed

    Zhu, Lingtao; Wang, Xiaodan; Han, Yunxiu; Cai, Yingming; Jin, Jiahui; Wang, Hongmei; Xu, Liping; Wu, Ruijia

    2018-03-01

    An electrochemical sensor for the detection of beef taste was designed in this study. The sensor was based on a polyvinyl chloride/polypyrrole (PVC/PPy) structure, polymerized onto the surface of a platinum (Pt) electrode to form a Pt-PPy-PVC film. The sensor was characterized by electrochemical impedance spectroscopy (EIS) and cyclic voltammetry (CV). It was applied to 10 rib-eye beef samples, and its accuracy was validated by sensory evaluation and ion sensor detection. Several cluster analysis methods were used in the study to distinguish the beef samples. According to the obtained results, the designed sensor showed a high degree of agreement between electrochemical detection and sensory evaluation, proving it a fast and precise sensor for beef taste detection. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Testing Gravity and Cosmic Acceleration with Galaxy Clustering

    NASA Astrophysics Data System (ADS)

    Kazin, Eyal; Tinker, J.; Sanchez, A. G.; Blanton, M.

    2012-01-01

    The large-scale structure contains vast amounts of cosmological information that can help understand the accelerating nature of the Universe and test gravity on large scales. Ongoing and future sky surveys are designed to test these questions using various techniques applied to clustering measurements of galaxies. We present redshift distortion measurements of the Sloan Digital Sky Survey II Luminous Red Galaxy sample. We find that when combining the normalized quadrupole Q with the projected correlation function wp(rp) along with cluster counts (Rapetti et al. 2010), results are consistent with General Relativity. The advantage of combining Q and wp is the addition of the bias information, when using the Halo Occupation Distribution framework. We also present improvements to the standard technique of measuring Hubble expansion rates H(z) and angular diameter distances DA(z) when using the baryonic acoustic feature as a standard ruler. We introduce clustering wedges as an alternative basis to the multipole expansion and show that it yields similar constraints. This alternative basis serves as a useful technique to test for systematics, and ultimately improve measurements of the cosmic acceleration.

  19. A sampling design framework for monitoring secretive marshbirds

    USGS Publications Warehouse

    Johnson, D.H.; Gibbs, J.P.; Herzog, M.; Lor, S.; Niemuth, N.D.; Ribic, C.A.; Seamans, M.; Shaffer, T.L.; Shriver, W.G.; Stehman, S.V.; Thompson, W.L.

    2009-01-01

    A framework for a sampling plan for monitoring marshbird populations in the contiguous 48 states is proposed here. The sampling universe is the breeding habitat (i.e. wetlands) potentially used by marshbirds. Selection protocols would be implemented within large geographical strata, such as Bird Conservation Regions. Site selection would be done using a two-stage cluster sample. Primary sampling units (PSUs) would be land areas, such as legal townships, and would be selected by a procedure such as systematic sampling. Secondary sampling units (SSUs) would be wetlands or portions of wetlands in the PSUs, selected by a randomized spatially balanced procedure. For analysis, the use of a variety of methods is encouraged to increase confidence in the conclusions reached. Additional effort will be required to work out details and implement the plan.
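
    The two-stage selection described above can be sketched as follows. This is an illustration under assumed inputs (a hypothetical `frame` mapping townships to their wetlands), with simple random sampling standing in for the randomized spatially balanced SSU procedure the plan calls for.

```python
import random

def two_stage_sample(frame, n_psu, n_ssu, seed=0):
    """Two-stage cluster sample: a systematic sample of PSUs (townships),
    then a simple random sample of SSUs (wetlands) within each selected PSU.
    `frame` maps each PSU id to its list of SSUs (hypothetical inputs)."""
    rng = random.Random(seed)
    psus = sorted(frame)                      # ordered PSU frame
    step = len(psus) / n_psu
    start = rng.uniform(0, step)              # random start for the systematic pass
    chosen = [psus[int(start + i * step)] for i in range(n_psu)]
    # Stage 2: sample up to n_ssu wetlands within each selected township.
    return {p: rng.sample(frame[p], min(n_ssu, len(frame[p]))) for p in chosen}
```

    A production design would replace stage 2 with a spatially balanced draw (e.g. GRTS), but the nested structure is the same.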

  20. X-ray emission from a complete sample of Abell clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Briel, Ulrich G.; Henry, J. Patrick

    1993-11-01

    The ROSAT All-Sky Survey (RASS) is used to investigate the X-ray properties of a complete sample of Abell clusters with measured redshifts and accurate positions. The sample comprises the 145 clusters within a 561 square degree region at high galactic latitude. The mean redshift is 0.17. This sample is especially well suited to be studied within the RASS since the mean exposure time is higher than average and the mean galactic column density is very low. These together produce a flux limit of about 4.2 x 10-13 erg/sq cm/s in the 0.5 to 2.5 keV energy band. Sixty-six (46%) individual clusters are detected at a significance level higher than 99.7% of which 7 could be chance coincidences of background or foreground sources. At redshifts greater than 0.3 six clusters out of seven (86%) are detected at the same significance level. The detected objects show a clear X-ray luminosity -- galaxy count relation with a dispersion consistent with other external estimates of the error in the counts. By analyzing the excess of positive fluctuations of the X-ray flux at the cluster positions, compared with the fluctuations of randomly drawn background fields, it is possible to extend these results below the nominal flux limit. We find 80% of richness R greater than or = 0 and 86% of R greater than or = 1 clusters are X-ray emitters with fluxes above 1 x 10-13 erg/sq cm/s. Nearly 90% of the clusters meeting the requirements to be in Abell's statistical sample emit above the same level. We therefore conclude that almost all Abell clusters are real clusters and the Abell catalog is not strongly contaminated by projection effects. We use the Kaplan-Meier product limit estimator to calculate the cumulative X-ray luminosity function. 
We show that the shape of the luminosity function is similar for different richness classes, but the characteristic luminosities of richness 2 clusters are about twice those of richness 1 clusters, which are in turn about twice those of richness 0 clusters. This result is another manifestation of the luminosity-richness relation for Abell clusters.
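
    The Kaplan-Meier product-limit estimator used above handles censored observations (here, clusters with only flux limits rather than detections). A minimal sketch of the estimator on generic right-censored data, not the authors' analysis code:

```python
def kaplan_meier(values, detected):
    """Kaplan-Meier product-limit estimate of the survival function.

    values:   measurements (e.g. luminosities or event times)
    detected: 1 for a detection (event), 0 for a censored point (limit)
    Returns (value, S) steps at each detected value."""
    data = sorted(zip(values, detected))
    n_at_risk, s, steps, i = len(data), 1.0, [], 0
    while i < len(data):
        v = data[i][0]
        tied = [o for vv, o in data[i:] if vv == v]   # all points at this value
        d = sum(tied)                                  # detections among them
        if d:
            s *= 1 - d / n_at_risk                     # product-limit update
            steps.append((v, s))
        n_at_risk -= len(tied)
        i += len(tied)
    return steps
```

    Censored points never trigger a step down; they only shrink the risk set, which is exactly how non-detections enter the cumulative luminosity function.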

  1. Cluster randomized trials utilizing primary care electronic health records: methodological issues in design, conduct, and analysis (eCRT Study).

    PubMed

    Gulliford, Martin C; van Staa, Tjeerd P; McDermott, Lisa; McCann, Gerard; Charlton, Judith; Dregan, Alex

    2014-06-11

    There is growing interest in conducting clinical and cluster randomized trials through electronic health records. This paper reports on the methodological issues identified during the implementation of two cluster randomized trials using the electronic health records of the Clinical Practice Research Datalink (CPRD). Two trials were completed in primary care: one aimed to reduce inappropriate antibiotic prescribing for acute respiratory infection; the other aimed to increase physician adherence with secondary prevention interventions after first stroke. The paper draws on documentary records and trial datasets to report on the methodological experience with respect to research ethics and research governance approval, general practice recruitment and allocation, sample size calculation and power, intervention implementation, and trial analysis. We obtained research governance approvals from more than 150 primary care organizations in England, Wales, and Scotland. There were 104 CPRD general practices recruited to the antibiotic trial and 106 to the stroke trial, with the target number of practices being recruited within six months. Interventions were installed into practice information systems remotely over the internet. The mean number of participants per practice was 5,588 in the antibiotic trial and 110 in the stroke trial, with the coefficient of variation of practice sizes being 0.53 and 0.56 respectively. Outcome measures showed substantial correlations between the 12 months before, and after intervention, with coefficients ranging from 0.42 for diastolic blood pressure to 0.91 for the proportion of consultations with antibiotics prescribed. Defining practice and participant eligibility for analysis requires careful consideration. Cluster randomized trials may be performed efficiently in large samples from UK general practices using the electronic health records of a primary care database.
The geographical dispersal of trial sites presents a difficulty for research governance approval and intervention implementation. Pretrial data analyses should inform trial design and analysis plans. Current Controlled Trials ISRCTN 47558792 and ISRCTN 35701810 (both registered on 17 March 2010).
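
    The reported coefficients of variation of practice size (0.53 and 0.56) matter because unequal cluster sizes inflate the design effect used in sample size calculations. One widely used approximation (an Eldridge-style formula; a sketch for illustration, not the trial's sample size software) is deff = 1 + ((cv^2 + 1) * m_bar - 1) * icc:

```python
from statistics import mean, pstdev

def design_effect(cluster_sizes, icc):
    """Approximate design effect for a cluster randomized trial with
    unequal cluster sizes: deff = 1 + ((cv**2 + 1) * m_bar - 1) * icc,
    with m_bar the mean cluster size and cv the coefficient of variation
    of cluster sizes (an approximation, not exact for all designs)."""
    m_bar = mean(cluster_sizes)
    cv = pstdev(cluster_sizes) / m_bar
    return 1 + ((cv ** 2 + 1) * m_bar - 1) * icc
```

    With equal cluster sizes the formula reduces to the familiar 1 + (m_bar - 1) * icc; a CV near 0.55, as observed here, inflates the required sample size accordingly.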

  2. Cluster randomized trials utilizing primary care electronic health records: methodological issues in design, conduct, and analysis (eCRT Study)

    PubMed Central

    2014-01-01

    Background There is growing interest in conducting clinical and cluster randomized trials through electronic health records. This paper reports on the methodological issues identified during the implementation of two cluster randomized trials using the electronic health records of the Clinical Practice Research Datalink (CPRD). Methods Two trials were completed in primary care: one aimed to reduce inappropriate antibiotic prescribing for acute respiratory infection; the other aimed to increase physician adherence with secondary prevention interventions after first stroke. The paper draws on documentary records and trial datasets to report on the methodological experience with respect to research ethics and research governance approval, general practice recruitment and allocation, sample size calculation and power, intervention implementation, and trial analysis. Results We obtained research governance approvals from more than 150 primary care organizations in England, Wales, and Scotland. There were 104 CPRD general practices recruited to the antibiotic trial and 106 to the stroke trial, with the target number of practices being recruited within six months. Interventions were installed into practice information systems remotely over the internet. The mean number of participants per practice was 5,588 in the antibiotic trial and 110 in the stroke trial, with the coefficient of variation of practice sizes being 0.53 and 0.56 respectively. Outcome measures showed substantial correlations between the 12 months before, and after intervention, with coefficients ranging from 0.42 for diastolic blood pressure to 0.91 for the proportion of consultations with antibiotics prescribed. Defining practice and participant eligibility for analysis requires careful consideration. Conclusions Cluster randomized trials may be performed efficiently in large samples from UK general practices using the electronic health records of a primary care database.
The geographical dispersal of trial sites presents a difficulty for research governance approval and intervention implementation. Pretrial data analyses should inform trial design and analysis plans. Trial registration Current Controlled Trials ISRCTN 47558792 and ISRCTN 35701810 (both registered on 17 March 2010). PMID:24919485

  3. Design of the South East Asian Nutrition Survey (SEANUTS): a four-country multistage cluster design study.

    PubMed

    Schaafsma, Anne; Deurenberg, Paul; Calame, Wim; van den Heuvel, Ellen G H M; van Beusekom, Christien; Hautvast, Jo; Sandjaja; Bee Koon, Poh; Rojroongwasinkul, Nipa; Le Nguyen, Bao Khanh; Parikh, Panam; Khouw, Ilse

    2013-09-01

    Nutrition is a well-known factor in the growth, health and development of children. It is also acknowledged that worldwide many people have dietary imbalances resulting in over- or undernutrition. In 2009, the multinational food company FrieslandCampina initiated the South East Asian Nutrition Survey (SEANUTS), a combination of surveys carried out in Indonesia, Malaysia, Thailand and Vietnam, to get a better insight into these imbalances. The present study describes the general study design and methodology, as well as some problems and pitfalls encountered. In each of these countries, participants in the age range of 0·5-12 years were recruited according to a multistage cluster randomised or stratified random sampling methodology. Field teams took care of recruitment and data collection. For the health status of children, growth and body composition, physical activity, bone density, and development and cognition were measured. For nutrition, food intake and food habits were assessed by questionnaires, whereas in subpopulations blood and urine samples were collected to measure the biochemical status parameters of Fe, vitamins A and D, and DHA. In Thailand, the researchers additionally studied the lipid profile in blood, whereas in Indonesia iodine excretion in urine was analysed. Biochemical data were analysed in certified laboratories. Study protocols and methodology were aligned where practically possible. In December 2011, data collection was finalised. In total, 16,744 children participated in the present study. Information that will be very relevant for formulating nutritional health policies, as well as for designing innovative food and nutrition research and development programmes, has become available.

  4. Real-time dynamic modelling for the design of a cluster-randomized phase 3 Ebola vaccine trial in Sierra Leone.

    PubMed

    Camacho, A; Eggo, R M; Goeyvaerts, N; Vandebosch, A; Mogg, R; Funk, S; Kucharski, A J; Watson, C H; Vangeneugden, T; Edmunds, W J

    2017-01-23

    Declining incidence and spatial heterogeneity complicated the design of phase 3 Ebola vaccine trials during the tail of the 2013-16 Ebola virus disease (EVD) epidemic in West Africa. Mathematical models can provide forecasts of expected incidence through time and can account for both vaccine efficacy in participants and effectiveness in populations. Determining expected disease incidence was critical to calculating power and determining trial sample size. In real-time, we fitted, forecasted, and simulated a proposed phase 3 cluster-randomized vaccine trial for a prime-boost EVD vaccine in three candidate regions in Sierra Leone. The aim was to forecast trial feasibility in these areas through time and guide study design planning. EVD incidence was highly variable during the epidemic, especially in the declining phase. Delays in trial start date were expected to greatly reduce the ability to discern an effect, particularly as a trial with an effective vaccine would cause the epidemic to go extinct more quickly in the vaccine arm. Real-time updates of the model allowed decision-makers to determine how trial feasibility changed with time. This analysis was useful for vaccine trial planning because we simulated effectiveness as well as efficacy, which is possible with a dynamic transmission model. It contributed to decisions on choice of trial location and feasibility of the trial. Transmission models should be utilised as early as possible in the design process to provide mechanistic estimates of expected incidence, with which decisions about sample size, location, timing, and feasibility can be determined. Copyright © 2016. Published by Elsevier Ltd.
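
    The transmission dynamics underlying such forecasts can be illustrated with a deterministic SEIR system. The compartmental sketch below (simple Euler integration, made-up parameter values) is a schematic stand-in for the stochastic model fitted in real time during the study.

```python
def seir(beta, sigma, gamma, s, e, i, r, days, dt=0.1):
    """Deterministic SEIR compartments integrated with Euler steps.
    beta: transmission rate; sigma: 1/latent period; gamma: 1/infectious
    period. Returns one (S, E, I, R) tuple per day (fractions of the
    population); parameter values used in examples are hypothetical."""
    out = [(s, e, i, r)]
    per_day = int(round(1 / dt))
    for step in range(1, int(days * per_day) + 1):
        new_inf, onset, recov = beta * s * i, sigma * e, gamma * i
        s -= new_inf * dt                 # susceptibles infected
        e += (new_inf - onset) * dt       # exposed: in from S, out to I
        i += (onset - recov) * dt         # infectious: in from E, out to R
        r += recov * dt                   # recovered accumulate
        if step % per_day == 0:
            out.append((s, e, i, r))
    return out
```

    Running such a model forward from the current state is what makes it possible to forecast expected incidence in each candidate trial region, and hence feasibility, as conditions change.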

  5. Spatially explicit population estimates for black bears based on cluster sampling

    USGS Publications Warehouse

    Humm, J.; McCown, J. Walter; Scheick, B.K.; Clark, Joseph D.

    2017-01-01

    We estimated abundance and density of the 5 major black bear (Ursus americanus) subpopulations (i.e., Eglin, Apalachicola, Osceola, Ocala-St. Johns, Big Cypress) in Florida, USA with spatially explicit capture-mark-recapture (SCR) by extracting DNA from hair samples collected at barbed-wire hair sampling sites. We employed a clustered sampling configuration with sampling sites arranged in 3 × 3 clusters spaced 2 km apart within each cluster and cluster centers spaced 16 km apart (center to center). We surveyed all 5 subpopulations encompassing 38,960 km2 during 2014 and 2015. Several landscape variables, most associated with forest cover, helped refine density estimates for the 5 subpopulations we sampled. Detection probabilities were affected by site-specific behavioral responses coupled with individual capture heterogeneity associated with sex. Model-averaged bear population estimates ranged from 120 (95% CI = 59–276) bears or a mean 0.025 bears/km2 (95% CI = 0.011–0.44) for the Eglin subpopulation to 1,198 bears (95% CI = 949–1,537) or 0.127 bears/km2 (95% CI = 0.101–0.163) for the Ocala-St. Johns subpopulation. The total population estimate for our 5 study areas was 3,916 bears (95% CI = 2,914–5,451). The clustered sampling method coupled with information on land cover was efficient and allowed us to estimate abundance across extensive areas that would not have been possible otherwise. Clustered sampling combined with spatially explicit capture-recapture methods has the potential to provide rigorous population estimates for a wide array of species that are extensive and heterogeneous in their distribution.
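
    The clustered site configuration described above (3 x 3 blocks of sites spaced 2 km apart, with cluster centres 16 km apart) can be generated programmatically. A minimal sketch, with hypothetical grid dimensions:

```python
def clustered_grid(n_cx, n_cy, center_gap=16.0, site_gap=2.0):
    """Lay out hair-sampling sites in the clustered configuration above:
    an n_cx x n_cy grid of cluster centres spaced center_gap km apart,
    each centre carrying a 3 x 3 block of sites spaced site_gap km apart.
    Returns (x, y) coordinates in km."""
    sites = []
    for cx in range(n_cx):
        for cy in range(n_cy):
            x0, y0 = cx * center_gap, cy * center_gap
            sites += [(x0 + dx * site_gap, y0 + dy * site_gap)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return sites
```

    The tight within-cluster spacing yields spatial recaptures for estimating movement, while the wide between-cluster spacing extends coverage across large study areas.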

  6. Tracing Large Scale Structure with a Redshift Survey of Rich Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Batuski, D.; Slinglend, K.; Haase, S.; Hill, J. M.

    1993-12-01

    Rich clusters of galaxies from Abell's catalog show evidence of structure on scales of 100 Mpc and hold promise of confirming the existence of structure in the more immediate universe on scales corresponding to COBE results (i.e., on the order of 10% or more of the horizon size of the universe). However, most Abell clusters do not as yet have measured redshifts (or, in the case of most low redshift clusters, have only one or two galaxies measured), so present knowledge of their three dimensional distribution has quite large uncertainties. The shortage of measured redshifts for these clusters may also mask a problem of projection effects corrupting the membership counts for the clusters, perhaps even to the point of spurious identifications of some of the clusters themselves. Our approach in this effort has been to use the MX multifiber spectrometer to measure redshifts of at least ten galaxies in each of about 80 Abell cluster fields with richness class R>= 1 and mag10 <= 16.8. This work will result in a somewhat deeper, much more complete (and reliable) sample of positions of rich clusters. Our primary use for the sample is for two-point correlation and other studies of the large scale structure traced by these clusters. We are also obtaining enough redshifts per cluster so that a much better sample of reliable cluster velocity dispersions will be available for other studies of cluster properties. To date, we have collected such data for 40 clusters, and for most of them, we have seven or more cluster members with redshifts, allowing for reliable velocity dispersion calculations. Velocity histograms for several interesting cluster fields are presented, along with summary tables of cluster redshift results. 
Also, with 10 or more redshifts in most of our cluster fields (30 arcmin square, just about an 'Abell diameter' at z ~ 0.1), we have investigated the extent of projection effects within the Abell catalog in an effort to quantify and understand how this may affect the Abell sample.

  7. The Mass Function in h+(chi) Persei

    NASA Astrophysics Data System (ADS)

    Bragg, Ann; Kenyon, Scott

    2000-08-01

    Knowledge of the stellar initial mass function (IMF) is critical to understanding star formation and galaxy evolution. Past studies of the IMF in open clusters have primarily used luminosity functions to determine mass functions, frequently in relatively sparse clusters. Our goal with this project is to derive a reliable, well-sampled IMF for a pair of very dense young clusters (h+(chi) Persei) with ages of 1-2 × 10^7 yr (e.g., Vogt, A&A 11:359), where stellar evolution theory is robust. We will construct the HR diagram using both photometry and spectral types to derive more accurate stellar masses and ages than are possible using photometry alone. Results from the two clusters will be compared to examine the universality of the IMF. We currently have a spectroscopic sample covering an area within 9 arc-minutes of the center of each cluster taken with the FAST Spectrograph. The sample is complete to V=15.4 and contains ~ 1000 stars. We request 2 nights at WIYN/HYDRA to extend this sample to deeper magnitudes, allowing us to determine the IMF of the clusters to a lower limiting mass and to search for a pre-main sequence, theoretically predicted to be present for clusters of this age. Note that both clusters are contained within a single HYDRA field.

  8. Recognition of genetically modified product based on affinity propagation clustering and terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Jianjun; Kan, Jianquan

    2018-04-01

    In this paper, a new method for identifying genetically modified material from terahertz spectra is proposed, using a support vector machine (SVM) built on affinity propagation clustering. The affinity propagation algorithm clusters and labels the unlabeled training samples, and the SVM training data are continuously updated during the iterative process. Because the identification model does not require manually labelled training samples, the error introduced by human labelling is reduced and the identification accuracy of the model is greatly improved.
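
    The iterative cluster-then-relabel loop described above can be sketched schematically. The toy below substitutes a nearest-centroid classifier for the paper's affinity propagation plus SVM, so it illustrates only the self-training structure, not the actual algorithm; all inputs are hypothetical.

```python
from math import dist
from statistics import mean

def pseudo_label(unlabeled, seeds, rounds=3):
    """Schematic self-training loop: cluster exemplars pseudo-label the
    unlabeled samples, and the model is re-fit on its own labels each
    round. `seeds` maps a class label to a few labelled feature vectors;
    nearest-centroid stands in for the paper's AP + SVM (illustration only)."""
    centroids = {lab: tuple(map(mean, zip(*pts))) for lab, pts in seeds.items()}
    labels = {}
    for _ in range(rounds):
        # assign every unlabeled spectrum to its nearest current centroid
        labels = {x: min(centroids, key=lambda c: dist(x, centroids[c]))
                  for x in unlabeled}
        # re-fit: recompute centroids from seeds plus pseudo-labelled points
        for lab in centroids:
            pts = seeds[lab] + [x for x, l in labels.items() if l == lab]
            centroids[lab] = tuple(map(mean, zip(*pts)))
    return labels
```

    The key property, shared with the method above, is that the labelled training set grows from the model's own confident assignments rather than from manual annotation.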

  9. Open star clusters and Galactic structure

    NASA Astrophysics Data System (ADS)

    Joshi, Yogesh C.

    2018-04-01

    In order to understand the Galactic structure, we perform a statistical analysis of the distribution of various cluster parameters based on the most complete sample of Galactic open clusters yet available. The geometrical and physical characteristics of a large number of open clusters given in the MWSC catalogue are used to study the spatial distribution of clusters in the Galaxy and determine the scale height, solar offset, local mass density and distribution of reddening material in the solar neighbourhood. We also explore the mass-radius and mass-age relations in the Galactic open star clusters. We find that the estimated parameters of the Galactic disk are largely influenced by the choice of cluster sample.

  10. ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations

    PubMed Central

    Wright, Mark H.; Tung, Chih-Wei; Zhao, Keyan; Reynolds, Andy; McCouch, Susan R.; Bustamante, Carlos D.

    2010-01-01

    Motivation: The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is current methods for automated calling of genotypes are based on clustering approaches which require a large number of samples to be analyzed simultaneously, or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly due to the inability to clearly identify a heterozygote cluster. Results: As part of the development of two custom single nucleotide polymorphism genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called ‘ALCHEMY’ based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per sample basis allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously. Since clustering is not used explicitly, ALCHEMY performs well on small sample sizes with accuracy exceeding 99% with as few as 18 samples. Availability: ALCHEMY is available for both commercial and academic use free of charge and distributed under the GNU General Public License at http://alchemy.sourceforge.net/ Contact: mhw6@cornell.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20926420

  11. The biological characteristics of predominant strains of HIV-1 genotype: modeling of HIV-1 infection among men who have sex with men.

    PubMed

    Dai, Di; Shang, Hong; Han, Xiao-Xu; Zhao, Bin; Liu, Jing; Ding, Hai-Bo; Xu, Jun-Jie; Chu, Zhen-Xing

    2015-04-01

    To investigate the molecular subtypes of prevalent HIV-1 strains and characterize the genetics of dominant strains among men who have sex with men. Molecular epidemiology surveys in this study concentrated on the prevalent HIV-1 strains in Liaoning province by year. 229 adult patients infected with HIV-1 and part of a high-risk group of men who have sex with men were recruited. Reverse transcription and nested PCR amplification were performed. Sequencing reactions were conducted and edited, followed by codon-based alignment. NJ phylogenetic tree analyses detected two distinct CRF01_AE phylogenetic clusters, designated clusters 1 and 2. Clusters 1 and 2 accounted for 12.8% and 84.2% of sequences in the pol gene and 17.6% and 73.1% of sequences in the env gene, respectively. Another six samples were distributed on other phylogenetic clusters. Cluster 1 increased significantly from 5.6% to 20.0%, but cluster 2 decreased from 87.5% to 80.0%. Genetic distance analysis indicated that CRF01_AE cluster 1 in Liaoning was homologous to epidemic CRF01_AE strains, but CRF01_AE cluster 2 was different from other scattered strains. Additionally, significant differences were found in tetra-peptide motifs at the tip of V3 loop between cluster 1 and 2; however, differences in coreceptor usage were not detected. This study shows that subtype CRF01_AE strain may be the most prevalent epidemic strain in the men who have sex with men. Genetic characteristics of the subtype CRF01_AE cluster strain in Liaoning showed homology to the prevalent strains of men who have sex with men in other parts of China. © 2015 Wiley Periodicals, Inc.

  12. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  13. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  14. Sustainable microbial water quality monitoring programme design using phage-lysis and multivariate techniques.

    PubMed

    Nnane, Daniel Ekane

    2011-11-15

    Contamination of surface waters is a pervasive threat to human health; hence the need to better understand the sources and spatio-temporal variations of contaminants within river catchments. River catchment managers are required to sustainably monitor and manage the quality of surface waters, and therefore need cost-effective, long-term, sustainable water quality monitoring and management designs to proactively protect public health and aquatic ecosystems. Multivariate and phage-lysis techniques were used to investigate spatio-temporal variations in water quality, the main polluting chemophysical and microbial parameters, and the sources of faecal micro-organisms, and to establish 'sentry' sampling sites in the Ouse River catchment, southeast England, UK. A total of 350 river water samples were analysed for fourteen chemophysical and microbial water quality parameters in conjunction with the novel human-specific phages of Bacteroides GB-124. Annual, autumn, spring, summer, and winter principal components (PCs) explained approximately 54%, 75%, 62%, 48%, and 60%, respectively, of the total variance present in the datasets. Significant loadings of Escherichia coli, intestinal enterococci, turbidity, and human-specific Bacteroides GB-124 were observed in all datasets. Cluster analysis successfully grouped sampling sites into five clusters. Importantly, multivariate and phage-lysis techniques were useful in determining the sources and spatial extent of water contamination in the catchment. Though human faecal contamination was significant during dry periods, the main source of contamination was non-human. Bacteroides GB-124 could potentially be used for routine catchment microbial water quality monitoring. For a cost-effective, long-term, sustainable water quality monitoring design, E. coli or intestinal enterococci, turbidity, and Bacteroides GB-124 should be monitored all year round in this river catchment. Copyright © 2011 Elsevier B.V. All rights reserved.
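
    The reported "percentage of total variance explained" by principal components comes directly from the eigenvalues (squared singular values) of the standardized data matrix. A minimal sketch with random stand-in data (not the catchment measurements; the 350 x 14 shape just mirrors the study's sample and parameter counts):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(350, 14))                         # toy stand-in: 350 samples x 14 parameters
X[:, 1] = X[:, 0] + rng.normal(scale=0.3, size=350)    # make two parameters co-vary

Z = (X - X.mean(axis=0)) / X.std(axis=0)               # standardize each parameter
s = np.linalg.svd(Z, compute_uv=False)                 # singular values of the standardized matrix
explained = s**2 / (s**2).sum()                        # fraction of total variance per PC
print(f"first three PCs explain {explained[:3].sum():.0%} of the variance")
```

    With real water-quality data, strongly co-varying parameters (such as E. coli and turbidity after rainfall) are what push the leading PCs' explained-variance percentages up.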

  15. Merging history of three bimodal clusters

    NASA Astrophysics Data System (ADS)

    Maurogordato, S.; Sauvageot, J. L.; Bourdin, H.; Cappi, A.; Benoist, C.; Ferrari, C.; Mars, G.; Houairi, K.

    2011-01-01

    We present a combined X-ray and optical analysis of three bimodal galaxy clusters selected as merging candidates at z ~ 0.1. These targets are part of MUSIC (MUlti-Wavelength Sample of Interacting Clusters), which is a general project designed to study the physics of merging clusters by means of multi-wavelength observations. Observations include spectro-imaging with XMM-Newton EPIC camera, multi-object spectroscopy (260 new redshifts), and wide-field imaging at the ESO 3.6 m and 2.2 m telescopes. We build a global picture of these clusters using X-ray luminosity and temperature maps together with galaxy density and velocity distributions. Idealized numerical simulations were used to constrain the merging scenario for each system. We show that A2933 is very likely an equal-mass advanced pre-merger ~200 Myr before the core collapse, while A2440 and A2384 are post-merger systems (~450 Myr and ~1.5 Gyr after core collapse, respectively). In the case of A2384, we detect a spectacular filament of galaxies and gas spreading over more than 1 h-1 Mpc, which we infer to have been stripped during the previous collision. The analysis of the MUSIC sample allows us to outline some general properties of merging clusters: a strong luminosity segregation of galaxies in recent post-mergers; the existence of preferential axes - corresponding to the merging directions - along which the BCGs and structures on various scales are aligned; the concomitance, in most major merger cases, of secondary merging or accretion events, with groups infalling onto the main cluster, and in some cases the evidence of previous merging episodes in one of the main components. These results are in good agreement with the hierarchical scenario of structure formation, in which clusters are expected to form by successive merging events, and matter is accreted along large-scale filaments. 
    Based on data obtained with the European Southern Observatory, Chile (programs 072.A-0595, 075.A-0264, and 079.A-0425). Tables 5-7 are only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/525/A79

  16. Spectroscopic studies of clusterization of methanol molecules isolated in a nitrogen matrix

    NASA Astrophysics Data System (ADS)

    Vaskivskyi, Ye.; Doroshenko, I.; Chernolevska, Ye.; Pogorelov, V.; Pitsevich, G.

    2017-12-01

    IR absorption spectra of methanol isolated in a nitrogen matrix are recorded at temperatures ranging from 9 to 34 K. The changes in the spectra with increasing matrix temperature are analyzed. Based on quantum-chemical calculations of the geometric and spectral parameters of different methanol clusters, the observed absorption bands are identified. The cluster composition of the sample is determined at each temperature. It is shown that as the matrix is heated there is a redistribution among the different cluster structures in the sample, from smaller to larger clusters.

  17. Integrated Theory of Planned Behavior with Extrinsic Motivation to Predict Intention Not to Use Illicit Drugs by Fifth-Grade Students in Taiwan

    ERIC Educational Resources Information Center

    Liao, Jung-Yu; Chang, Li-Chun; Hsu, Hsiao-Pei; Huang, Chiu-Mieh; Huang, Su-Fei; Guo, Jong-Long

    2017-01-01

    This study assessed the effects of a model that integrated the theory of planned behavior (TPB) with extrinsic motivation (EM) in predicting the intentions of fifth-grade students to not use illicit drugs. A cluster-sampling design was adopted in a cross-sectional survey (N = 571). The structural equation modeling results showed that the model…

  18. Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

    PubMed

    Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

    2017-11-01

    The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets apprehended by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help interpret the clustering results. The data were best described by two clusters, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy under leave-one-out cross-validation, using only the Nd, Ni, and Pb concentration values. © 2017 American Academy of Forensic Sciences.
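
    A rough sketch of the pipeline described here — k-means into two groups, then a single-threshold decision rule standing in for the C4.5 tree — using invented concentration values for three elements (columns labelled Nd, Ni, Pb purely for illustration):

```python
import numpy as np

# Toy ICP-MS profiles (rows: tablets; columns: Nd, Ni, Pb) -- illustrative values only.
X = np.array([[0.5, 12.0, 3.0], [0.6, 11.5, 2.8], [0.4, 12.3, 3.1],
              [2.1,  4.0, 9.0], [2.3,  4.2, 8.7], [1.9,  3.8, 9.3]])

# 2-means with deterministic init (first and last tablet).
centers = X[[0, -1]].astype(float)
for _ in range(20):
    labels = np.linalg.norm(X[:, None] - centers, axis=2).argmin(1)
    centers = np.array([X[labels == j].mean(0) for j in (0, 1)])

# Decision-stump stand-in for C4.5: the single-element split that best
# reproduces the cluster labels (ignoring label inversion for brevity).
best = min(((col, thr) for col in range(X.shape[1]) for thr in X[:, col]),
           key=lambda ct: ((X[:, ct[0]] > ct[1]) != labels).sum())
print("clusters:", labels, "split on element column", best[0])
```

    A real C4.5 tree would pick splits by information gain rather than raw misclassification, but the interpretive role is the same: explain the clusters in terms of a few element concentrations.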

  19. EVIDENCE FOR THE UNIVERSALITY OF PROPERTIES OF RED-SEQUENCE GALAXIES IN X-RAY- AND RED-SEQUENCE-SELECTED CLUSTERS AT z ∼ 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Foltz, R.; Wilson, G.; DeGroot, A.

    We study the slope, intercept, and scatter of the color–magnitude and color–mass relations for a sample of 10 infrared red-sequence-selected clusters at z ∼ 1. The quiescent galaxies in these clusters formed the bulk of their stars above z ≳ 3 with an age spread Δt ≳ 1 Gyr. We compare UVJ color–color and spectroscopic-based galaxy selection techniques, and find a 15% difference in the galaxy populations classified as quiescent by these methods. We compare the color–magnitude relations from our red-sequence-selected sample with X-ray- and photometric-redshift-selected cluster samples of similar mass and redshift. Within uncertainties, we are unable to detect any difference in the ages and star formation histories of quiescent cluster members in clusters selected by different methods, suggesting that the dominant quenching mechanism is insensitive to cluster baryon partitioning at z ∼ 1.

  20. Measuring consistent masses for 25 Milky Way globular clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kimmig, Brian; Seth, Anil; Ivans, Inese I.

    2015-02-01

    We present central velocity dispersions, masses, mass-to-light ratios (M/Ls), and rotation strengths for 25 Galactic globular clusters (GCs). We derive radial velocities of 1951 stars in 12 GCs from single-order spectra taken with Hectochelle on the MMT telescope. To this sample we add an analysis of available archival data of individual stars. For the full set of data we fit King models to derive consistent dynamical parameters for the clusters. We find good agreement between single-mass King models and the observed radial dispersion profiles. The large, uniform sample of dynamical masses we derive enables us to examine trends of M/L with cluster mass and metallicity. The overall values of M/L and the trends with mass and metallicity are consistent with existing measurements from a large sample of M31 clusters. This includes a clear trend of increasing M/L with cluster mass and lower-than-expected M/Ls for the metal-rich clusters. We find no clear trend of increasing rotation with increasing cluster metallicity, as suggested in previous work.

  1. The Observations of Redshift Evolution in Large Scale Environments (ORELSE) Survey

    NASA Astrophysics Data System (ADS)

    Squires, Gordon K.; Lubin, L. M.; Gal, R. R.

    2007-05-01

    We present the motivation, design, and latest results from the Observations of Redshift Evolution in Large Scale Environments (ORELSE) Survey, a systematic search for structure on scales greater than 10 Mpc around 20 known galaxy clusters at z > 0.6. When complete, the survey will cover nearly 5 square degrees, all targeted at high-density regions, making it complementary and comparable to field surveys such as DEEP2, GOODS, and COSMOS. For the survey, we are using the Large Format Camera on the Palomar 5-m and SuPRIME-Cam on the Subaru 8-m to obtain optical/near-infrared imaging of an approximately 30 arcmin region around previously studied high-redshift clusters. Colors are used to identify likely member galaxies, which are targeted for follow-up spectroscopy with the DEep Imaging Multi-Object Spectrograph on the Keck 10-m. This technique has been used to successfully identify the Cl 1604 supercluster at z = 0.9, a large-scale structure containing at least eight clusters (Gal & Lubin 2004; Gal, Lubin & Squires 2005). We present the most recent structures to be photometrically and spectroscopically confirmed through this program, discuss the properties of the member galaxies as a function of environment, and describe our planned multi-wavelength (radio, mid-IR, and X-ray) observations of these systems. The goal of this survey is to identify and examine a statistical sample of large-scale structures during an active period in the assembly history of the most massive clusters. With such a sample, we can begin to constrain large-scale cluster dynamics and determine the effect of the larger environment on galaxy evolution.

  2. Galaxy Cluster Mass Reconstruction Project – III. The impact of dynamical substructure on cluster mass estimates

    DOE PAGES

    Old, L.; Wojtak, R.; Pearce, F. R.; ...

    2017-12-20

    With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an 'unrelaxed' state, and affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, and radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower-mass clusters, by ~10 per cent at 10^14 and ≳20 per cent for ≲10^13.5. The use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters at a level comparable to the typical uncertainties in current cosmological studies.

  4. HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

    NASA Astrophysics Data System (ADS)

    Schellenberger, G.; Reiprich, T. H.

    2017-08-01

    The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is observed, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including automated substructure detection and an energy-band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters; good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias relative to our hydrostatic masses; possible biases in this mass-mass comparison are discussed, including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from the literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) than clusters (1.06).
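
    The quoted slopes (1.34, 1.88, 1.06) are exponents of a luminosity-mass power law, conventionally obtained as the slope of a straight-line fit in log-log space. A toy illustration with synthetic clusters (the masses, normalization, and scatter below are invented, not HIFLUGCS values):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.uniform(0.5, 20.0, 64)                     # toy cluster masses (arbitrary units)
L = 0.8 * M**1.3 * rng.lognormal(0.0, 0.2, 64)     # luminosities with log-normal scatter

# Fit log L = norm + slope * log M; the slope is the scaling-relation exponent.
slope, norm = np.polyfit(np.log10(M), np.log10(L), 1)
print(f"fitted slope: {slope:.2f}")                # recovers something close to the input 1.3
```

    Real analyses like this one additionally correct the fit for the flux-limited selection, which is why the paper reports slopes both before and after a selection bias correction.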

  5. Tigers on trails: occupancy modeling for cluster sampling.

    PubMed

    Hines, J E; Nichols, J D; Royle, J A; MacKenzie, D I; Gopalaswamy, A M; Kumar, N Samba; Karanth, K U

    2010-07-01

    Occupancy modeling focuses on inference about the distribution of organisms over space, using temporal or spatial replication to allow inference about the detection process. Inference based on spatial replication strictly requires that replicates be selected randomly and with replacement, but the importance of these design requirements is not well understood. This paper focuses on an increasingly popular sampling design based on spatial replicates that are not selected randomly and that are expected to exhibit Markovian dependence. We develop two new occupancy models for data collected under this sort of design, one based on an underlying Markov model for spatial dependence and the other based on a trap response model with Markovian detections. We then simulated data under the model for Markovian spatial dependence and fit the data to standard occupancy models and to the two new models. Bias of occupancy estimates was substantial for the standard models, smaller for the new trap response model, and negligible for the new spatial process model. We also fit these models to data from a large-scale tiger occupancy survey recently conducted in Karnataka State, southwestern India. In addition to providing evidence of a positive relationship between tiger occupancy and habitat, model selection statistics and estimates strongly supported the use of the model with Markovian spatial dependence. This new model provides another tool for the decomposition of the detection process, which is sometimes needed for proper estimation and which may also permit interesting biological inferences. In addition to designs employing spatial replication, we note the likely existence of temporal Markovian dependence in many designs using temporal replication. The models developed here will be useful either directly, or with minor extensions, for these designs as well. 
We believe that these new models represent important additions to the suite of modeling tools now available for occupancy estimation in conservation monitoring. More generally, this work represents a contribution to the topic of cluster sampling for situations in which there is a need for specific modeling (e.g., reflecting dependence) for the distribution of the variable(s) of interest among subunits.
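
    For readers unfamiliar with occupancy models, the standard replication-based (non-Markovian) likelihood that the new models extend can be written down compactly: a site is occupied with probability psi, and each replicate survey at an occupied site detects the species with probability p. A toy grid-search MLE over invented detection histories (real analyses use proper optimizers and covariates):

```python
import math

def history_likelihood(hist, psi, p):
    """Standard occupancy model: a site is occupied with probability psi and,
    if occupied, each replicate detects the species with probability p."""
    d, k = sum(hist), len(hist)
    if d > 0:                                  # any detection implies occupancy
        return psi * p**d * (1 - p)**(k - d)
    return psi * (1 - p)**k + (1 - psi)        # all-zero: missed or truly absent

# Invented detection histories for five sites, three replicate surveys each.
sites = [(1, 0, 1), (0, 0, 0), (1, 1, 0), (0, 0, 0), (0, 1, 0)]

grid = [i / 100 for i in range(1, 100)]        # crude grid-search MLE
psi_hat, p_hat = max(
    ((psi, p) for psi in grid for p in grid),
    key=lambda t: sum(math.log(history_likelihood(h, *t)) for h in sites))
print(psi_hat, p_hat)
```

    The all-zero term is what lets the model separate "absent" from "present but missed"; the Markovian variants in the paper replace the independent per-replicate p with detection probabilities that depend on the previous spatial replicate.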

  6. The interpoint distance distribution as a descriptor of point patterns, with an application to spatial disease clustering.

    PubMed

    Bonetti, Marco; Pagano, Marcello

    2005-03-15

    The topic of this paper is the distribution of the distance between two points distributed independently in space. We illustrate the use of this interpoint distance distribution to describe the characteristics of a set of points within some fixed region. The properties of its sample version, and thus the inference about this function, are discussed both in the discrete and in the continuous setting. We illustrate its use in the detection of spatial clustering by application to a well-known leukaemia data set, and report on the results of a simulation experiment designed to study the power characteristics of the methods within that study region and in an artificial homogeneous setting. Copyright (c) 2004 John Wiley & Sons, Ltd.
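
    The empirical version of the interpoint distance distribution is straightforward to compute: collect all pairwise distances and form their CDF. A sketch contrasting a homogeneous pattern with a clustered one (synthetic points, not the leukaemia data):

```python
import numpy as np

def interpoint_cdf(points, ds):
    """Empirical CDF of the distance between two independently chosen points."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(-1))
    pair = dist[np.triu_indices(len(points), k=1)]   # each unordered pair once
    return np.array([(pair <= d).mean() for d in ds])

rng = np.random.default_rng(0)
uniform = rng.uniform(size=(200, 2))                      # homogeneous pattern
clustered = np.vstack([rng.normal(0.3, 0.03, (100, 2)),   # two tight clumps
                       rng.normal(0.7, 0.03, (100, 2))])

ds = np.linspace(0.0, 1.5, 50)
cdf_uniform = interpoint_cdf(uniform, ds)
cdf_clustered = interpoint_cdf(clustered, ds)
# Clustering piles up probability mass at short interpoint distances.
print(cdf_clustered[5], cdf_uniform[5])
```

    Tests for clustering of the kind studied in the paper compare such an empirical curve against the distribution expected under a null (homogeneous) model for the study region.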

  7. Genome sequence of a cluster A13 mycobacteriophage detected in Mycobacterium phlei over a half century ago.

    PubMed

    Marton, Szilvia; Fehér, Enikő; Horváth, Balázs; Háber, Katalin; Somogyi, Pál; Minárovits, János; Bányai, Krisztián

    2016-01-01

    A phage infecting Mycobacterium phlei was isolated in 1958 from a soil sample in Hungary. Some physicochemical and biological properties of the virus were described in independent studies over the years. Here, we report the genome sequence of this early mycobacteriophage isolate. The Phlei phage genome measured 50,418 bp, had a GC content of 60.1 % and was predicted to encode 81 proteins and three tRNAs. Phylogeny of the tape measure protein revealed genetic relatedness to other early isolates of mycobacteriophages within subcluster A2. The genomic organization and genetic relationships to other strains showed that the Phlei phage belongs to a novel genetic cluster, designated A13.

  8. Star-Forming Galaxies in the Hercules Cluster: Hα Imaging of A2151

    NASA Astrophysics Data System (ADS)

    Cedrés, Bernabé; Iglesias-Páramo, Jorge; Vílchez, José Manuel; Reverte, Daniel; Petropoulou, Vasiliki; Hernández-Fernández, Jonathan

    2009-09-01

    This paper presents the first results of an Hα imaging survey of galaxies in the central regions of the A2151 cluster. A total of 50 sources were detected in Hα, of which 41 were classified as secure members of the cluster and 2 as likely members based on spectroscopic and photometric redshift considerations. The remaining seven galaxies were classified as background contaminants and thus excluded from our study of the Hα properties of the cluster. The morphologies of the 43 Hα-selected galaxies range from grand-design spirals and interacting galaxies to blue compacts and tidal dwarfs or isolated extragalactic H II regions, spanning a range of magnitudes of -21 <= MB <= -12.5 mag. Of these 43 galaxies, 7 have been classified as active galactic nucleus (AGN) candidates. These AGN candidates follow the L(Hα) versus MB relationship of the normal galaxies, implying that the emission associated with the nuclear engine has a rather secondary impact on the total Hα emission of these galaxies. A comparison with the clusters Coma and A1367 and a sample of field galaxies has shown the presence of cluster galaxies with L(Hα) lower than expected for their MB, a consequence of the cluster environment. This fact results in differences in the L(Hα) versus EW(Hα) and L(Hα) distributions of the clusters with respect to the field, and in cluster-to-cluster variations of these quantities, which we propose are driven by a global cluster property such as the total mass. In addition, the cluster Hα-emitting galaxies tend to avoid the central regions of the clusters, again to a degree depending on the cluster total mass. For the particular case of A2151, we find that most Hα-emitting galaxies are located close to the regions with the highest galaxy density, offset from the main X-ray peak.
Overall, we conclude that both the global cluster environment and the cluster merging history play a non-negligible role in the integral star formation properties of clusters of galaxies.

  10. Geographical Segregation of the Neurotoxin-Producing Cyanobacterium Anabaena circinalis

    PubMed Central

    Beltran, E. Carolina; Neilan, Brett A.

    2000-01-01

    Blooms of the cyanobacterium Anabaena circinalis are a major worldwide problem due to their production of a range of toxins, in particular the neurotoxins anatoxin-a and paralytic shellfish poisons (PSPs). Although there is a worldwide distribution of A. circinalis, there is a geographical segregation of neurotoxin production. American and European isolates of A. circinalis produce only anatoxin-a, while Australian isolates exclusively produce PSPs. The reason for this geographical segregation of neurotoxin production by A. circinalis is unknown. The phylogenetic structure of A. circinalis was determined by analyzing 16S rRNA gene sequences. A. circinalis was found to form a monophyletic group of international distribution. However, the PSP- and non-PSP-producing A. circinalis formed two distinct 16S rRNA gene clusters. A molecular probe was designed, allowing the identification of A. circinalis from cultured and uncultured environmental samples. In addition, probes targeting the predominantly PSP-producing or non-PSP-producing clusters were designed for the characterization of A. circinalis isolates as potential PSP producers. PMID:11010900

  11. An Analysis of Rich Cluster Redshift Survey Data for Large Scale Structure Studies

    NASA Astrophysics Data System (ADS)

    Slinglend, K.; Batuski, D.; Haase, S.; Hill, J.

    1994-12-01

    The results from the COBE satellite show the existence of structure on scales on the order of 10% or more of the horizon scale of the universe. Rich clusters of galaxies from Abell's catalog show evidence of structure on scales of 100 Mpc and may hold the promise of confirming structure on the scale of the COBE result. However, many Abell clusters have zero or only one measured redshift, so present knowledge of their three dimensional distribution has quite large uncertainties. The shortage of measured redshifts for these clusters may also mask a problem of projection effects corrupting the membership counts for the clusters. Our approach in this effort has been to use the MX multifiber spectrometer on the Steward 2.3m to measure redshifts of at least ten galaxies in each of 80 Abell cluster fields with richness class R>= 1 and mag10 <= 16.8 (estimated z<= 0.12) and zero or one measured redshifts. This work will result in a deeper, more complete (and reliable) sample of positions of rich clusters. Our primary intent for the sample is for two-point correlation and other studies of the large scale structure traced by these clusters in an effort to constrain theoretical models for structure formation. We are also obtaining enough redshifts per cluster so that a much better sample of reliable cluster velocity dispersions will be available for other studies of cluster properties. To date, we have collected such data for 64 clusters, and for most of them, we have seven or more cluster members with redshifts, allowing for reliable velocity dispersion calculations. Velocity histograms and stripe density plots for several interesting cluster fields are presented, along with summary tables of cluster redshift results. 
    Also, with 10 or more redshifts in most of our cluster fields (30′ square, just about an 'Abell diameter' at z ~ 0.1), we have investigated the extent of projection effects within the Abell catalog in an effort to quantify and understand how they may affect the Abell sample.

  12. THE SWIFT AGN AND CLUSTER SURVEY. II. CLUSTER CONFIRMATION WITH SDSS DATA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griffin, Rhiannon D.; Dai, Xinyu; Kochanek, Christopher S.

    2016-01-15

    We study 203 (of 442) Swift AGN and Cluster Survey extended X-ray sources located in the SDSS DR8 footprint to search for galaxy over-densities in three-dimensional space using SDSS galaxy photometric redshifts and positions near the Swift cluster candidates. We find 104 Swift clusters with a >3σ galaxy over-density. The remaining targets are potentially located at higher redshifts and require deeper optical follow-up observations for confirmation as galaxy clusters. We present a series of cluster properties including the redshift, brightest cluster galaxy (BCG) magnitude, BCG-to-X-ray center offset, optical richness, and X-ray luminosity. We also detect red sequences in ∼85% of the 104 confirmed clusters. The X-ray luminosity and optical richness for the SDSS-confirmed Swift clusters are correlated and follow previously established relations. The distribution of the separations between the X-ray centroids and the most likely BCG is also consistent with expectation. We compare the observed redshift distribution of the sample with a theoretical model, and find that our sample is complete for z ≲ 0.3 and is still 80% complete up to z ≃ 0.4, consistent with the SDSS survey depth. These results suggest that our Swift cluster selection algorithm has yielded a statistically well-defined cluster sample for further study of cluster evolution and cosmology. We also match our SDSS-confirmed Swift clusters to existing cluster catalogs, and find 42, 23, and 1 matches in optical, X-ray, and Sunyaev–Zel'dovich catalogs, respectively; the majority of these clusters are thus new detections.
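
    The ">3σ galaxy over-density" criterion is, in essence, a Poisson excess over the expected background count in an aperture. A hedged sketch with invented numbers (the paper's actual apertures, photo-z slices, and background estimates differ):

```python
import math

def overdensity_sigma(n_obs, area_arcmin2, bg_density):
    """Gaussian-approximate significance of a galaxy excess over a Poisson background;
    bg_density is the expected field count per square arcminute."""
    mu = bg_density * area_arcmin2
    return (n_obs - mu) / math.sqrt(mu)

# Invented numbers: 60 photo-z-selected galaxies within a 5-arcmin-radius aperture,
# where the surrounding field predicts 0.4 galaxies per square arcminute.
sigma = overdensity_sigma(60, math.pi * 5**2, 0.4)
print(f"{sigma:.1f} sigma excess")   # → 5.1 sigma excess
```

    Under a >3σ rule, this hypothetical aperture would count as a confirmed over-density; candidates below the threshold would be deferred to deeper follow-up, as in the paper.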

  13. The effect of Fisher information matrix approximation methods in population optimal design calculations.

    PubMed

    Strömberg, Eric A; Nyberg, Joakim; Hooker, Andrew C

    2016-12-01

With the increasing popularity of optimal design in drug development, it is important to understand how the approximations and implementations of the Fisher information matrix (FIM) affect the resulting optimal designs. The aim of this work was to investigate the impact on design performance when using two common approximations to the population model and the full or block-diagonal FIM implementations for optimization of sampling points. Sampling schedules for two example experiments based on population models were optimized using the FO and FOCE approximations and the full and block-diagonal FIM implementations. The number of support points was compared between the designs for each example experiment. The performance of these designs was investigated through simulation and estimation, by computing the bias of the parameters as well as through the use of an empirical D-criterion confidence interval. Simulations were performed when the design was computed with the true parameter values as well as with misspecified parameter values. The FOCE approximation and the full FIM implementation yielded designs with more support points and less clustering of sample points than designs optimized with the FO approximation and the block-diagonal implementation. The D-criterion confidence intervals showed no performance differences between the full and block-diagonal FIM optimal designs when assuming true parameter values. However, the FO-approximated block-diagonal FIM designs had higher bias than the other designs. When assuming parameter misspecification in the design evaluation, the FO full-FIM optimal design was superior to the FO block-diagonal FIM design in both of the examples.
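To make the D-criterion and the block-diagonal simplification concrete, here is a toy numerical sketch (all matrix values are hypothetical; real population FIMs come from tools such as PopED or PFIM). With the cross terms between the fixed-effect and variance-parameter blocks zeroed, the determinant factorizes over the blocks:

```python
def det2(m):
    """Determinant of a 2x2 block given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Toy FIM blocks for 2 fixed-effect and 2 variance parameters (hypothetical):
fixed_block = [[10.0, 2.0], [2.0, 8.0]]
var_block = [[4.0, 1.0], [1.0, 3.0]]

# Block-diagonal implementation: the cross-covariance terms between the two
# parameter groups are set to zero, so det(FIM) = det(block1) * det(block2).
p = 4                                            # total number of parameters
det_block = det2(fixed_block) * det2(var_block)  # 76 * 11 = 836
d_criterion = det_block ** (1.0 / p)             # size-normalized D-criterion

# The full-FIM implementation would retain the cross terms, generally changing
# the determinant and hence the D-optimal sampling times.
```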

  14. Profiling Local Optima in K-Means Clustering: Developing a Diagnostic Technique

    ERIC Educational Resources Information Center

    Steinley, Douglas

    2006-01-01

Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying sample sizes, clusters, and dimensions; (d) different multivariate…

  15. Cluster lot quality assurance sampling: effect of increasing the number of clusters on classification precision and operational feasibility.

    PubMed

    Okayasu, Hiromasa; Brown, Alexandra E; Nzioki, Michael M; Gasasira, Alex N; Takane, Marina; Mkanda, Pascal; Wassilak, Steven G F; Sutter, Roland W

    2014-11-01

    To assess the quality of supplementary immunization activities (SIAs), the Global Polio Eradication Initiative (GPEI) has used cluster lot quality assurance sampling (C-LQAS) methods since 2009. However, since the inception of C-LQAS, questions have been raised about the optimal balance between operational feasibility and precision of classification of lots to identify areas with low SIA quality that require corrective programmatic action. To determine if an increased precision in classification would result in differential programmatic decision making, we conducted a pilot evaluation in 4 local government areas (LGAs) in Nigeria with an expanded LQAS sample size of 16 clusters (instead of the standard 6 clusters) of 10 subjects each. The results showed greater heterogeneity between clusters than the assumed standard deviation of 10%, ranging from 12% to 23%. Comparing the distribution of 4-outcome classifications obtained from all possible combinations of 6-cluster subsamples to the observed classification of the 16-cluster sample, we obtained an exact match in classification in 56% to 85% of instances. We concluded that the 6-cluster C-LQAS provides acceptable classification precision for programmatic action. Considering the greater resources required to implement an expanded C-LQAS, the improvement in precision was deemed insufficient to warrant the effort. Published by Oxford University Press on behalf of the Infectious Diseases Society of America 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
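The comparison of all possible 6-cluster subsamples against the full 16-cluster classification can be sketched as follows (synthetic coverage counts and a simplified 2-outcome pass/fail rule, not the GPEI's 4-outcome rule):

```python
from itertools import combinations
from math import comb

# Hypothetical coverage counts (children vaccinated out of 10 sampled) in
# each of the 16 clusters of the expanded design.
clusters = [9, 8, 10, 7, 9, 6, 8, 9, 10, 7, 8, 9, 5, 8, 9, 10]

def classify(sample, threshold=0.80):
    """Toy 2-outcome LQAS rule: 'pass' if overall coverage >= threshold.
    (The GPEI rule has 4 outcome categories; this is a simplification.)"""
    return sum(sample) / (10 * len(sample)) >= threshold

full_class = classify(clusters)          # classification from all 16 clusters
total = comb(16, 6)                      # number of 6-cluster subsamples: 8008
matches = sum(classify(sub) == full_class
              for sub in combinations(clusters, 6))
match_rate = matches / total             # fraction agreeing with the full sample
```

The abstract's 56% to 85% exact-match range corresponds to this `match_rate` computed per LGA under the actual 4-outcome rule.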

  16. Clustering on very small scales from a large sample of confirmed quasar pairs: does quasar clustering track from Mpc to kpc scales?

    NASA Astrophysics Data System (ADS)

    Eftekharzadeh, S.; Myers, A. D.; Hennawi, J. F.; Djorgovski, S. G.; Richards, G. T.; Mahabal, A. A.; Graham, M. J.

    2017-06-01

We present the most precise estimate to date of the clustering of quasars on very small scales, based on a sample of 47 binary quasars with magnitudes of g < 20.85 and proper transverse separations of ~25 h^-1 kpc. Our sample of binary quasars, which is about six times larger than any previous spectroscopically confirmed sample on these scales, is targeted using a kernel density estimation (KDE) technique applied to Sloan Digital Sky Survey (SDSS) imaging over most of the SDSS area. Our sample is 'complete' in that all of the KDE target pairs with 17.0 ≲ R ≲ 36.2 h^-1 kpc in our area of interest have been spectroscopically confirmed from a combination of previous surveys and our own long-slit observational campaign. We catalogue 230 candidate quasar pairs with angular separations of <8 arcsec, from which our binary quasars were identified. We determine the projected correlation function of quasars (W̄_p) in four bins of proper transverse scale over the range 17.0 ≲ R ≲ 36.2 h^-1 kpc. The implied small-scale quasar clustering amplitude from the projected correlation function, integrated across our entire redshift range, is A = 24.1 ± 3.6 at ~26.6 h^-1 kpc. Our sample is the first spectroscopically confirmed sample of quasar pairs that is sufficiently large to study how quasar clustering evolves with redshift at ~25 h^-1 kpc. We find that empirical descriptions of how quasar clustering evolves with redshift at ~25 h^-1 Mpc also adequately describe the evolution of quasar clustering at ~25 h^-1 kpc.

  17. Online clustering algorithms for radar emitter classification.

    PubMed

Liu, Jun; Lee, Jim P Y; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max

    2005-08-01

    Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.

  18. Magnetic signature of overbank sediment in industry impacted floodplains identified by data mining methods

    NASA Astrophysics Data System (ADS)

    Chudaničová, Monika; Hutchinson, Simon M.

    2016-11-01

Our study attempts to identify a characteristic magnetic signature of overbank sediments exhibiting anthropogenically induced magnetic enhancement, and thereby to distinguish them from unenhanced sediments with weak magnetic background values, using a novel approach based on data mining methods, thus providing a means of rapid pollution determination. Data were obtained from 539 bulk samples from vertical profiles through overbank sediment, collected on seven rivers in the eastern Czech Republic and three rivers in northwest England. k-Means clustering and hierarchical clustering methods, paired group (UPGMA) and Ward's method, were used to divide the samples into natural groups according to their attributes. The interparametric ratios SIRM/χ, SIRM/ARM, and S-0.1T were chosen as attributes for the analyses, making the resultant model more widely applicable, as magnetic concentration values can differ by two orders of magnitude. Division into three clusters appeared to be optimal and corresponded to inherent clusters in the data scatter. Clustering managed to separate samples with relatively weak anthropogenically induced enhancement, samples with relatively strong anthropogenically induced enhancement, and samples lacking enhancement. To describe the clusters explicitly and thus obtain a discrete magnetic signature, classification rules (JRip method) and decision trees (J4.8 and Simple Cart methods) were used. Samples lacking anthropogenic enhancement typically exhibited an S-0.1T < c. 0.5, SIRM/ARM < c. 150 and SIRM/χ < c. 6000 A m^-1. Samples with magnetic enhancement all exhibited an S-0.1T > 0.5. Samples with relatively stronger anthropogenic enhancement were unequivocally distinguished from the samples with weaker enhancement by an SIRM/ARM > c. 150. Samples with SIRM/ARM in the range c. 126-150 were classified as relatively strongly enhanced when their SIRM/χ > 18 000 A m^-1 and relatively less enhanced when their SIRM/χ < 18 000 A m^-1.
An additional rule was arbitrarily added to exclude samples with χfd% > 6 per cent from the anthropogenically enhanced clusters, as samples with natural magnetic enhancement. The characteristics of the clusters resulted mainly from the relationships between SIRM/ARM and S-0.1T, and between SIRM/χ and S-0.1T. Both SIRM/ARM and SIRM/χ increase with increasing S-0.1T values, reflecting a greater level of anthropogenic magnetic particles. Overall, data mining methods demonstrated good potential for utilization in environmental magnetism.
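The decision rules quoted above can be collected into a single classification function. This is a sketch using the approximate ('c.') thresholds from the abstract; the exact boundary handling and the precedence of the rules are hypothetical simplifications:

```python
def classify_sample(s_ratio, sirm_arm, sirm_chi, chi_fd_pct):
    """Encode the abstract's reported decision rules for one sediment sample.
    Thresholds are the approximate values quoted in the abstract; the
    ordering and boundary handling here are a hypothetical simplification."""
    if chi_fd_pct > 6.0:
        # arbitrary exclusion rule: high frequency-dependent susceptibility
        return "natural enhancement"
    if s_ratio < 0.5:
        # enhanced samples all had S-0.1T > 0.5
        return "no anthropogenic enhancement"
    if sirm_arm > 150.0:
        return "strong anthropogenic enhancement"
    if 126.0 <= sirm_arm <= 150.0:
        # in this range, SIRM/chi (A/m) decides strong vs weak
        return ("strong anthropogenic enhancement"
                if sirm_chi > 18000.0
                else "weak anthropogenic enhancement")
    return "weak anthropogenic enhancement"

# e.g. a sample with S-0.1T = 0.9 and SIRM/ARM = 200 classifies as strong:
label = classify_sample(0.9, 200.0, 25000.0, 2.0)
```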

  19. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    PubMed

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  20. Self-similarity of temperature profiles in distant galaxy clusters: the quest for a universal law

    NASA Astrophysics Data System (ADS)

    Baldi, A.; Ettori, S.; Molendi, S.; Gastaldello, F.

    2012-09-01

Context. We present the XMM-Newton temperature profiles of 12 bright (L_X > 4 × 10^44 erg s^-1) clusters of galaxies at 0.4 < z < 0.9, having an average temperature in the range 5 ≲ kT ≲ 11 keV. Aims: The main goal of this paper is to study for the first time the temperature profiles of a sample of high-redshift clusters, to investigate their properties, and to define a universal law to describe the temperature radial profiles in galaxy clusters as a function of both cosmic time and their state of relaxation. Methods: We performed a spatially resolved spectral analysis, using Cash statistics, to measure the temperature in the intracluster medium at different radii. Results: We extracted temperature profiles for the clusters in our sample, finding that all profiles are declining toward larger radii. The normalized temperature profiles (normalized by the mean temperature T_500) are found to be generally self-similar. The sample was subdivided into five cool-core (CC) and seven non-cool-core (NCC) clusters by introducing a pseudo-entropy ratio σ = (T_IN/T_OUT) × (EM_IN/EM_OUT)^(-1/3) and defining the objects with σ < 0.6 as CC clusters and those with σ ≥ 0.6 as NCC clusters. The profiles of CC and NCC clusters differ mainly in the central regions, with the latter exhibiting a slightly flatter central profile. A significant dependence of the temperature profiles on the pseudo-entropy ratio σ is detected by fitting a function of r and σ, showing an indication that the outer part of the profiles becomes steeper for higher values of σ (i.e. transitioning toward the NCC clusters). No significant evidence of redshift evolution could be found within the redshift range sampled by our clusters (0.4 < z < 0.9). A comparison of our high-z sample with intermediate clusters at 0.1 < z < 0.3 showed how the CC and NCC cluster temperature profiles have experienced some sort of evolution. 
This can happen because higher z clusters are at a less advanced stage of their formation and did not have enough time to create a relaxed structure, which is characterized by a central temperature dip in CC clusters and by flatter profiles in NCC clusters. Conclusions: This is the first time that a systematic study of the temperature profiles of galaxy clusters at z > 0.4 has been attempted. We were able to define the closest possible relation to a universal law for the temperature profiles of galaxy clusters at 0.1 < z < 0.9, showing a dependence on both the relaxation state of the clusters and the redshift. Appendix A is only available in electronic form at http://www.aanda.org
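The CC/NCC split above rests on a single formula, which can be written out directly (the input values below are illustrative only):

```python
def pseudo_entropy_ratio(t_in, t_out, em_in, em_out):
    """sigma = (T_IN/T_OUT) * (EM_IN/EM_OUT)^(-1/3), the pseudo-entropy
    ratio defined in the abstract, from core ('IN') and outer ('OUT')
    temperatures and emission measures."""
    return (t_in / t_out) * (em_in / em_out) ** (-1.0 / 3.0)

def cluster_state(sigma):
    """CC if sigma < 0.6, NCC otherwise (the paper's dividing value)."""
    return "CC" if sigma < 0.6 else "NCC"

# A cool core combines a central temperature dip (T_IN < T_OUT) with a
# bright, dense core (EM_IN >> EM_OUT), pushing sigma well below 0.6.
# Illustrative numbers: T_IN = 4 keV, T_OUT = 6 keV, EM_IN/EM_OUT = 50.
sigma = pseudo_entropy_ratio(4.0, 6.0, 50.0, 1.0)  # ≈ 0.18, well below 0.6
state = cluster_state(sigma)
```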

  1. Genetic diversity and divergence at the Arbutus unedo L. (Ericaceae) westernmost distribution limit.

    PubMed

    Ribeiro, Maria Margarida; Piotti, Andrea; Ricardo, Alexandra; Gaspar, Daniel; Costa, Rita; Parducci, Laura; Vendramin, Giovanni Giuseppe

    2017-01-01

Mediterranean forests are fragile ecosystems vulnerable to recent global warming and reduced precipitation, and a long-term negative effect on vegetation is expected with increasing drought and in areas burnt by fires. We investigated the spatial distribution of genetic variation of Arbutus unedo in the western Iberian Peninsula, using plastid markers, with a view to designing conservation and provenance regions. This species is currently undergoing an intense domestication process in the region and, like other species, is increasingly under threat from climate change, habitat fragmentation and wildfires. We sampled 451 trees from 15 natural populations from different ecological conditions spanning the whole species' distribution range in the region. We applied Bayesian analysis and identified four clusters (north, centre, south, and a single-population cluster). Hierarchical AMOVA showed higher differentiation among clusters than among populations within clusters. The relatively low within-cluster differentiation can be explained by a common postglacial history of nearby populations. The genetic structure found, supported by the few available palaeobotanical records, cannot exclude the hypothesis of two independent A. unedo refugia in the western Iberian Peninsula during the Last Glacial Maximum. Based on the results, we recommend a conservation strategy that selects populations for conservation based on their allelic richness and diversity, with careful seed transfer consistent with the current species' genetic structure.

  2. Genetic diversity and divergence at the Arbutus unedo L. (Ericaceae) westernmost distribution limit

    PubMed Central

    Ribeiro, Maria Margarida; Piotti, Andrea; Ricardo, Alexandra; Gaspar, Daniel; Costa, Rita; Parducci, Laura; Vendramin, Giovanni Giuseppe

    2017-01-01

Mediterranean forests are fragile ecosystems vulnerable to recent global warming and reduced precipitation, and a long-term negative effect on vegetation is expected with increasing drought and in areas burnt by fires. We investigated the spatial distribution of genetic variation of Arbutus unedo in the western Iberian Peninsula, using plastid markers, with a view to designing conservation and provenance regions. This species is currently undergoing an intense domestication process in the region and, like other species, is increasingly under threat from climate change, habitat fragmentation and wildfires. We sampled 451 trees from 15 natural populations from different ecological conditions spanning the whole species’ distribution range in the region. We applied Bayesian analysis and identified four clusters (north, centre, south, and a single-population cluster). Hierarchical AMOVA showed higher differentiation among clusters than among populations within clusters. The relatively low within-cluster differentiation can be explained by a common postglacial history of nearby populations. The genetic structure found, supported by the few available palaeobotanical records, cannot exclude the hypothesis of two independent A. unedo refugia in the western Iberian Peninsula during the Last Glacial Maximum. Based on the results, we recommend a conservation strategy that selects populations for conservation based on their allelic richness and diversity, with careful seed transfer consistent with the current species’ genetic structure. PMID:28384294

  3. U.S. consumer demand for restaurant calorie information: targeting demographic and behavioral segments in labeling initiatives.

    PubMed

    Kolodinsky, Jane; Reynolds, Travis William; Cannella, Mark; Timmons, David; Bromberg, Daniel

    2009-01-01

To identify different segments of U.S. consumers based on food choices, exercise patterns, and desire for restaurant calorie labeling. Using a stratified (by region) random sample of the U.S. population, trained interviewers collected data for this cross-sectional study through telephone surveys. Center for Rural Studies U.S. national health survey. The final sample included 580 responses (22% response rate); data were weighted to be representative of the age and gender characteristics of the U.S. population. Self-reported behaviors related to food choices, exercise patterns, desire for calorie information in restaurants, and sample demographics. Clusters were identified using the Schwarz Bayesian criterion. The impacts of demographic characteristics on cluster membership were analyzed using bivariate tests of association and multinomial logit regression. Cluster analysis revealed three clusters based on respondents' food choices, activity levels, and desire for restaurant labeling. Two clusters, comprising three quarters of the sample, desired calorie labeling in restaurants; the remaining cluster opposed restaurant labeling. Demographic variables significantly predicting cluster membership included region of residence (p < .10), income (p < .05), gender (p < .01), and age (p < .10). Though limited by a low response rate and potential self-reporting bias in the phone survey, this study suggests that several groups are likely to benefit from restaurant calorie labeling. Specific demographic clusters could be targeted through labeling initiatives.

  4. Testing the accuracy of clustering redshifts with simulations

    NASA Astrophysics Data System (ADS)

    Scottez, V.; Benoit-Lévy, A.; Coupon, J.; Ilbert, O.; Mellier, Y.

    2018-03-01

We explore the accuracy of clustering-based redshift inference within the MICE2 simulation. This method uses the spatial clustering of galaxies between a spectroscopic reference sample and an unknown sample, and this study gives an estimate of the accuracy achievable with it. First, we discuss the requirements on the number of objects in the two samples, confirming that this method does not require a representative spectroscopic sample for calibration. In the context of the next generation of cosmological surveys, we estimate that the density of the quasi-stellar objects in BOSS allows us to reach 0.2 per cent accuracy in the mean redshift. Second, we estimate individual redshifts for galaxies in the densest regions of colour space (~30 per cent of the galaxies) without using the photometric redshift procedure. The advantage of this procedure is threefold, allowing: (i) the use of cluster-zs for any field in astronomy; (ii) the combination of photo-zs and cluster-zs to obtain an improved redshift estimate; and (iii) the use of cluster-zs to define tomographic bins for weak lensing. Finally, we explore this last option and build five cluster-z selected tomographic bins from redshift 0.2 to 1. We find a bias of 0.002 per bin on the mean redshift estimate. We conclude that cluster-zs could be used as a primary redshift estimator by the next generation of cosmological surveys.

  5. Genetic variation, population structure and linkage disequilibrium in Switchgrass with ISSR, SCoT and EST-SSR markers.

    PubMed

    Zhang, Yu; Yan, Haidong; Jiang, Xiaomei; Wang, Xiaoli; Huang, Linkai; Xu, Bin; Zhang, Xinquan; Zhang, Lexin

    2016-01-01

To evaluate genetic variation, population structure, and the extent of linkage disequilibrium (LD), 134 switchgrass (Panicum virgatum L.) samples were analyzed with 51 markers, including 16 ISSRs, 20 SCoTs, and 15 EST-SSRs. A high level of genetic variation was observed in the switchgrass samples, with an average Nei's gene diversity index (H) of 0.311. A total of 793 bands were obtained, of which 708 (89.28%) were polymorphic. Using the marker index (MI), we compared the efficiency of the three types of markers (ISSR, SCoT, and EST-SSR) and found that SCoT had a higher marker efficiency than the other two. The 134 switchgrass samples could be divided into two sub-populations based on STRUCTURE, UPGMA clustering, and principal coordinate analysis (PCA), and the upland and lowland ecotypes could be separated by the UPGMA clustering and PCA analyses. Linkage disequilibrium analysis revealed an average r^2 of 0.035 across all 51 markers, indicating higher LD in sub-population 2 than in sub-population 1 (P < 0.01). The population structure revealed in this study will guide the design of future association studies using these switchgrass samples.

  6. Social Network Clustering and the Spread of HIV/AIDS Among Persons Who Inject Drugs in 2 Cities in the Philippines.

    PubMed

    Verdery, Ashton M; Siripong, Nalyn; Pence, Brian W

    2017-09-01

The Philippines has seen rapid increases in HIV prevalence among people who inject drugs. We study 2 neighboring cities where a linked HIV epidemic differed in timing of onset and levels of prevalence. In Cebu, prevalence rose rapidly from below 1% to 54% between 2009 and 2011 and remained high through 2013. In nearby Mandaue, HIV remained below 4% through 2011, then rose rapidly to 38% by 2013. We hypothesize that the difference in infection prevalence between these cities may be due to aspects of social network structure, specifically levels of network clustering. Building on previous research, we hypothesize that higher levels of network clustering are associated with greater epidemic potential. Data were collected with respondent-driven sampling among men who inject drugs in Cebu and Mandaue in 2013. We first examine sample composition using estimators for population means. We then apply new estimators of network clustering in respondent-driven sampling data to examine associations with HIV prevalence. Samples in both cities were comparable in composition by age, education, and injection locations. Dyadic needle-sharing levels were also similar between the 2 cities, but network clustering in the needle-sharing network differed dramatically. We found higher clustering in Cebu than Mandaue, consistent with expectations that higher clustering is associated with faster epidemic spread. This article is the first to apply estimators of network clustering to empirical respondent-driven samples, and it offers suggestive evidence that researchers should pay greater attention to network structure's role in HIV transmission dynamics.
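As background for the clustering estimators discussed above, the underlying quantity is the local clustering coefficient: the fraction of a node's neighbour pairs that are themselves connected. A plain unweighted sketch follows; note that the paper's estimators additionally correct for the respondent-driven sampling design, which this sketch does not:

```python
def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected graph given as an
    adjacency dict {node: set(neighbours)}. A plain estimator, without the
    RDS design weights the paper's estimators apply."""
    coeffs = []
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # undefined for degree < 2; count as 0 here
            continue
        # number of edges among this node's neighbours
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# A triangle (nodes 0, 1, 2) plus a pendant node 3 attached to node 2:
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c = clustering_coefficient(adj)  # (1 + 1 + 1/3 + 0) / 4 = 7/12
```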

  7. A clustering algorithm for sample data based on environmental pollution characteristics

    NASA Astrophysics Data System (ADS)

    Chen, Mei; Wang, Pengfei; Chen, Qiang; Wu, Jiadong; Chen, Xiaoyun

    2015-04-01

Environmental pollution has become an issue of serious international concern in recent years. Among the receptor-oriented pollution models, CMB, PMF, UNMIX, and PCA are widely used as source apportionment models. To improve the accuracy of source apportionment and to classify the sample data for these models, this study proposes an easy-to-use, high-dimensional EPC (environmental pollution characteristics) algorithm that not only organizes all of the sample data into different groups according to similarities in pollution characteristics, such as pollution sources and concentrations, but also simultaneously detects outliers. The main clustering process consists of selecting the first unlabelled point as a cluster centre, then assigning each data point in the sample dataset to its most similar cluster centre according to both the user-defined threshold and the value of the similarity function in each iteration, and finally modifying the clusters using a method similar to k-Means. The validity and accuracy of the algorithm are tested using both real and synthetic datasets, which makes the EPC algorithm practical and effective for appropriately classifying sample data for source apportionment models and helpful for better understanding and interpreting the sources of pollution.
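The clustering process described above can be sketched roughly as follows. This is a hypothetical simplification: the function names, the similarity definition, and the convergence handling are illustrative, not the authors' implementation:

```python
def epc_like_cluster(data, threshold, similarity, max_iter=10):
    """Sketch of the process the abstract describes: an unassignable point
    (no centre similar enough, per the user-defined threshold) seeds a new
    cluster centre; all other points join their most similar centre; centres
    are then refined k-Means-style by averaging their members.
    (Empty clusters are dropped; a fuller implementation would relabel.)"""
    centres = []
    for _ in range(max_iter):
        labels = []
        for x in data:
            sims = [similarity(x, c) for c in centres]
            best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
            if best >= 0 and sims[best] >= threshold:
                labels.append(best)
            else:
                centres.append(x)              # point seeds a new cluster
                labels.append(len(centres) - 1)
        # k-Means-like update: each centre becomes the mean of its members
        new_centres = []
        for i in range(len(centres)):
            members = [x for x, l in zip(data, labels) if l == i]
            if members:
                new_centres.append(tuple(sum(d) / len(members)
                                         for d in zip(*members)))
        if new_centres == centres:             # converged
            break
        centres = new_centres
    return centres, labels

# Similarity as negative Euclidean distance on 2-D pollutant vectors:
sim = lambda a, b: -sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centres, labels = epc_like_cluster(data, threshold=-1.0, similarity=sim)
```

On this toy input the two well-separated point pairs end up in two clusters; a point farther than the threshold from every centre would remain a singleton, which is how the outlier detection in the abstract can fall out of the same loop.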

  8. Beam Design and User Scheduling for Nonorthogonal Multiple Access With Multiple Antennas Based on Pareto Optimality

    NASA Astrophysics Data System (ADS)

    Seo, Junyeong; Sung, Youngchul

    2018-06-01

In this paper, an efficient transmit beam design and user scheduling method is proposed for the multi-user (MU) multiple-input single-output (MISO) non-orthogonal multiple access (NOMA) downlink, based on Pareto optimality. The proposed method groups simultaneously served users into multiple clusters of two users each (a practical choice), and then applies spatial zero-forcing (ZF) across clusters to control inter-cluster interference (ICI), and Pareto-optimal beam design with successive interference cancellation (SIC) to the two users in each cluster, to remove interference to strong users and improve the signal-to-interference-plus-noise ratios (SINRs) of the interference-experiencing weak users. The proposed method has the flexibility to control the rates of strong and weak users, and numerical results show that it yields good performance.

  9. CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets

    PubMed Central

    Nowicka, Malgorzata; Krieg, Carsten; Weber, Lukas M.; Hartmann, Felix J.; Guglietta, Silvia; Becher, Burkhard; Levesque, Mitchell P.; Robinson, Mark D.

    2017-01-01

High-dimensional mass and flow cytometry (HDCyto) experiments have become a method of choice for high-throughput interrogation and characterization of cell populations. Here, we present an R-based pipeline for differential analyses of HDCyto data, largely based on Bioconductor packages. We computationally define cell populations using FlowSOM clustering, and facilitate an optional but reproducible strategy for manual merging of algorithm-generated clusters. Our workflow offers different analysis paths, including association of cell type abundance with a phenotype or changes in signaling markers within specific subpopulations, or differential analyses of aggregated signals. Importantly, the differential analyses we show are based on regression frameworks where the HDCyto data are the response; thus, we are able to model arbitrary experimental designs, such as those with batch effects, paired designs and so on. In particular, we apply generalized linear mixed models to analyses of cell population abundance or cell-population-specific analyses of signaling markers, allowing overdispersion in cell counts or aggregated signals across samples to be appropriately modeled. To support the formal statistical analyses, we encourage exploratory data analysis at every step, including quality control (e.g. multi-dimensional scaling plots), reporting of clustering results (dimensionality reduction, heatmaps with dendrograms) and differential analyses (e.g. plots of aggregated signals). PMID:28663787

  10. Effect Sizes in Cluster-Randomized Designs

    ERIC Educational Resources Information Center

    Hedges, Larry V.

    2007-01-01

    Multisite research designs involving cluster randomization are becoming increasingly important in educational and behavioral research. Researchers would like to compute effect size indexes based on the standardized mean difference to compare the results of cluster-randomized studies (and corresponding quasi-experiments) with other studies and to…

  11. Cosmology with XMM galaxy clusters: the X-CLASS/GROND catalogue and photometric redshifts

    NASA Astrophysics Data System (ADS)

    Ridl, J.; Clerc, N.; Sadibekova, T.; Faccioli, L.; Pacaud, F.; Greiner, J.; Krühler, T.; Rau, A.; Salvato, M.; Menzel, M.-L.; Steinle, H.; Wiseman, P.; Nandra, K.; Sanders, J.

    2017-06-01

The XMM Cluster Archive Super Survey (X-CLASS) is a serendipitously detected X-ray-selected sample of 845 galaxy clusters based on 2774 XMM archival observations and covering approximately 90 deg^2 spread across the high-Galactic-latitude (|b| > 20°) sky. The primary goal of this survey is to produce a well-selected sample of galaxy clusters on which cosmological analyses can be performed. This paper presents the photometric redshift follow-up of a high signal-to-noise ratio subset of 265 of these clusters with declination δ < +20° with the Gamma-Ray Burst Optical and Near-Infrared Detector (GROND), a 7-channel (grizJHK) simultaneous imager on the MPG 2.2-m telescope at the ESO La Silla Observatory. We use a newly developed technique based on the red sequence colour-redshift relation, enhanced with information coming from the X-ray detection, to provide photometric redshifts for this sample. We determine photometric redshifts for 232 clusters, finding a median redshift of z = 0.39 with an accuracy of Δz = 0.02(1 + z) when compared to a sample of 76 spectroscopically confirmed clusters. We also compute X-ray luminosities for the entire sample and find a median bolometric luminosity of 7.2 × 10^43 erg s^-1 and a median temperature of 2.9 keV. We compare our results to those of the XMM-XCS and XMM-XXL surveys, finding good agreement in both samples. The X-CLASS catalogue is available online at http://xmm-lss.in2p3.fr:8080/l4sdb/.

  12. Changes in cluster magnetism and suppression of local superconductivity in amorphous FeCrB alloy irradiated by Ar+ ions

    NASA Astrophysics Data System (ADS)

    Okunev, V. D.; Samoilenko, Z. A.; Szymczak, H.; Szewczyk, A.; Szymczak, R.; Lewandowski, S. J.; Aleshkevych, P.; Malinowski, A.; Gierłowski, P.; Więckowski, J.; Wolny-Marszałek, M.; Jeżabek, M.; Varyukhin, V. N.; Antoshina, I. A.

    2016-02-01

We show that cluster magnetism in the ferromagnetic amorphous Fe67Cr18B15 alloy is related to the presence of large (D = 150-250 Å) α-(Fe, Cr) clusters responsible for the basic changes in cluster magnetism, small (D = 30-100 Å) α-(Fe, Cr) and Fe3B clusters, and subcluster atomic α-(Fe, Cr, B) groupings (D = 10-20 Å) in the disordered intercluster medium. For the initial sample and the one irradiated at Φ = 1.5×10^18 ions/cm^2, superconductivity exists in the cluster shells of the metallic α-(Fe, Cr) phase, where the ferromagnetism of iron is counterbalanced by the antiferromagnetism of chromium. At Φ = 3×10^18 ions/cm^2, the internal stresses intensify and the process of iron and chromium phase separation, favorable for mesoscopic superconductivity, reverses, promoting a more homogeneous distribution of iron and chromium in the clusters as well as a large (roughly twofold) increase in the density of the samples. As a result, ferromagnetism is restored in the cluster shells, leading to an increase in the magnetization of the sample and the suppression of local superconductivity. For the initial samples, the temperature dependence of resistivity, ρ(T) ∝ T^2, is determined by electron scattering on quantum defects. In strongly inhomogeneous samples, after irradiation at a fluence of Φ = 1.5×10^18 ions/cm^2, the transition to a dependence ρ(T) ∝ T^(1/2) is caused by weak-localization effects. In more homogeneous samples, at Φ = 3×10^18 ions/cm^2, a return to the ρ(T) ∝ T^2 dependence is observed.

  13. DAFi: A directed recursive data filtering and clustering approach for improving and interpreting data clustering identification of cell populations from polychromatic flow cytometry data.

    PubMed

    Lee, Alexandra J; Chang, Ivan; Burel, Julie G; Lindestam Arlehamn, Cecilia S; Mandava, Aishwarya; Weiskopf, Daniela; Peters, Bjoern; Sette, Alessandro; Scheuermann, Richard H; Qian, Yu

    2018-04-17

    Computational methods for identification of cell populations from polychromatic flow cytometry data are changing the paradigm of cytometry bioinformatics. Data clustering is the most common computational approach to unsupervised identification of cell populations from multidimensional cytometry data. However, interpretation of the identified data clusters is labor-intensive. Certain types of user-defined cell populations are also difficult to identify by fully automated data clustering analysis. Both are roadblocks that prevent cytometry labs from adopting the data clustering approach for routine cell population identification. We found that combining recursive data filtering and clustering with constraints converted from the user's manual gating strategy can effectively address these two issues. We named this new approach DAFi: Directed Automated Filtering and Identification of cell populations. The design of DAFi preserves the data-driven characteristics of unsupervised clustering for identifying novel cell subsets, but also makes the results interpretable to experimental scientists by mapping and merging the multidimensional data clusters into the user-defined two-dimensional gating hierarchy. The recursive data filtering process in DAFi helped identify small data clusters that are otherwise difficult to resolve by a single run of the data clustering method due to the statistical interference of the irrelevant major clusters. Our experimental results showed that the proportions of the cell populations identified by DAFi, while consistent with those from expert centralized manual gating, have smaller technical variances across samples than those from individual manual gating analysis and from non-recursive data clustering analysis. Compared with manual gating segregation, DAFi-identified cell populations avoided abrupt cut-offs at the population boundaries.
DAFi has been implemented to be used with multiple data clustering methods including K-means, FLOCK, FlowSOM, and the ClusterR package. For cell population identification, DAFi supports multiple options including clustering, bisecting, slope-based gating, and reversed filtering to meet various autogating needs from different scientific use cases. © 2018 International Society for Advancement of Cytometry.
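The directed filtering step at the heart of DAFi (keeping only events belonging to data clusters that map into a user-defined two-dimensional gate, before re-clustering the retained events) can be sketched as follows. This is an illustrative sketch only, not the published implementation; the gate bounds, data, and cluster labels are hypothetical, and the labels could come from any clustering method such as K-means or FlowSOM.

```python
import numpy as np

def gate_filter(events, labels, gate_lo, gate_hi):
    """Keep events in clusters whose centroid lies inside a rectangular
    2-D gate (a sketch of DAFi-style directed filtering; bounds and data
    are illustrative)."""
    events, labels = np.asarray(events, float), np.asarray(labels)
    keep = np.zeros(len(events), dtype=bool)
    for lab in np.unique(labels):
        centroid = events[labels == lab].mean(axis=0)
        if np.all(centroid >= gate_lo) and np.all(centroid <= gate_hi):
            keep |= labels == lab
    return events[keep]

# Two clusters; only the one centred near (1, 1) falls inside the gate.
ev = [(1.0, 1.0), (1.1, 0.9), (5.0, 5.0), (5.1, 4.9)]
lb = [0, 0, 1, 1]
inside = gate_filter(ev, lb, gate_lo=(0, 0), gate_hi=(2, 2))
print(len(inside))
```

In the full approach this filter would be applied recursively down the gating hierarchy, re-clustering the retained events at each level.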

  14. Variation in Research Designs Used to Test the Effectiveness of Dissemination and Implementation Strategies: A Review.

    PubMed

    Mazzucca, Stephanie; Tabak, Rachel G; Pilar, Meagan; Ramsey, Alex T; Baumann, Ana A; Kryzer, Emily; Lewis, Ericka M; Padek, Margaret; Powell, Byron J; Brownson, Ross C

    2018-01-01

    The need for optimal study designs in dissemination and implementation (D&I) research is increasingly recognized. Despite the wide range of study designs available for D&I research, we lack understanding of the types of designs and methodologies that are routinely used in the field. This review assesses the designs and methodologies in recently proposed D&I studies and provides resources to guide design decisions. We reviewed 404 study protocols published in the journal Implementation Science from 2/2006 to 9/2017. Eligible studies tested the efficacy or effectiveness of D&I strategies (i.e., not effectiveness of the underlying clinical or public health intervention); had a comparison by group and/or time; and used ≥1 quantitative measure. Several design elements were extracted: design category (e.g., randomized); design type [e.g., cluster randomized controlled trial (RCT)]; data type (e.g., quantitative); D&I theoretical framework; levels of treatment assignment, intervention, and measurement; and country in which the research was conducted. Each protocol was double-coded, and discrepancies were resolved through discussion. Of the 404 protocols reviewed, 212 (52%) studies tested one or more implementation strategy across 208 manuscripts, therefore meeting inclusion criteria. Of the included studies, 77% utilized randomized designs, primarily cluster RCTs. The use of alternative designs (e.g., stepped wedge) increased over time. Fewer studies were quasi-experimental (17%) or observational (6%). Many study design categories (e.g., controlled pre-post, matched pair cluster design) were represented by only one or two studies. Most articles proposed quantitative and qualitative methods (61%), with the remaining 39% proposing only quantitative. Half of protocols (52%) reported using a theoretical framework to guide the study. 
The four most frequently reported frameworks were the Consolidated Framework for Implementation Research and RE-AIM (n = 16 each), followed by Promoting Action on Research Implementation in Health Services and the Theoretical Domains Framework (n = 12 each). While several novel designs for D&I research have been proposed (e.g., stepped wedge, adaptive designs), the majority of the studies in our sample employed RCT designs. Alternative study designs are increasing in use but may be underutilized for a variety of reasons, including the preferences of funders or a lack of awareness of these designs. Promisingly, the prevalent use of quantitative and qualitative methods together reflects methodological innovation in newer D&I research.

  15. Density-based clustering of small peptide conformations sampled from a molecular dynamics simulation.

    PubMed

    Kim, Minkyoung; Choi, Seung-Hoon; Kim, Junhyoung; Choi, Kihang; Shin, Jae-Min; Kang, Sang-Kee; Choi, Yun-Jaie; Jung, Dong Hyun

    2009-11-01

    This study describes the application of a density-based algorithm to clustering small peptide conformations after a molecular dynamics simulation. We propose a clustering method for small peptide conformations that enables adjacent clusters to be separated more clearly on the basis of neighbor density. Neighbor density is the number of neighboring conformations; if a conformation has too few neighbors, it is treated as noise or an outlier and excluded from the list of cluster members. With this approach, we can easily identify clusters whose members are densely crowded in the conformational space, and we can safely avoid misclustering individual clusters linked by noise or outliers. Consideration of neighbor density significantly improves the efficiency of clustering small peptide conformations sampled from molecular dynamics simulations and can be used for predicting peptide structures.
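The neighbor-density criterion described above can be sketched in a few lines: a conformation with fewer than a threshold number of neighbors within a cutoff radius is flagged as noise and excluded before cluster assignment. This is a minimal illustration of the idea (the same core-point notion used by DBSCAN-style methods), not the authors' implementation; the radius and threshold values here are arbitrary.

```python
import numpy as np

def density_filter(points, radius, min_neighbors):
    """Return a boolean mask: True where a point has at least
    `min_neighbors` other points within `radius` (kept), False where it
    is treated as noise/outlier. Thresholds are illustrative."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(n^2); fine for small samples).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Count neighbors within the radius, excluding the point itself.
    neighbor_counts = (dist <= radius).sum(axis=1) - 1
    return neighbor_counts >= min_neighbors

# Two dense groups plus one isolated outlier.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
       (10.0, 0.0)]  # isolated point
mask = density_filter(pts, radius=0.5, min_neighbors=2)
print(mask)  # the isolated point is flagged as noise
```

Clustering would then be run only on `pts` rows where `mask` is True, so that sparse bridging points cannot link two otherwise distinct clusters.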

  16. The Tehran Eye Study: research design and eye examination protocol.

    PubMed

    Hashemi, Hassan; Fotouhi, Akbar; Mohammad, Kazem

    2003-07-15

    Visual impairment has a profound impact on society. The majority of visually impaired people live in developing countries, and since most disorders leading to visual impairment are preventable or curable, their control is a priority in these countries. Considering the complicated epidemiology of visual impairment and the wide variety of factors involved, region-specific intervention strategies are required for every community. Therefore, providing appropriate data is one of the first steps in these communities, as it is in Iran. The objectives of this study are to describe the prevalence and causes of visual impairment in the population of Tehran city; the prevalence of refractive errors, lens opacity, ocular hypertension, and color blindness in this population; and the familial aggregation of refractive errors, lens opacity, ocular hypertension, and color blindness within the study sample. Through a population-based, cross-sectional study, a total of 5300 Tehran citizens will be selected from 160 clusters using a stratified cluster random sampling strategy. Eligible people will be enumerated through a door-to-door household survey in the selected clusters and will be invited to participate. All participants will be transferred to a clinic for measurements of uncorrected, best-corrected and presenting visual acuity; manifest, subjective and cycloplegic refraction; a color vision test; Goldmann applanation tonometry; examination of the external eye, anterior segment, media, and fundus; and an interview about demographic characteristics and history of eye diseases, eye trauma, diabetes mellitus, high blood pressure, and ophthalmologic care. The study design and eye examination protocol are described. We expect that findings from the TES will show the status of visual problems and their causes in the community. This study can highlight the people who should be targeted by visual impairment prevention programs.

  17. Effect of W self-implantation and He plasma exposure on early-stage defect and bubble formation in tungsten

    NASA Astrophysics Data System (ADS)

    Thompson, M.; Drummond, D.; Sullivan, J.; Elliman, R.; Kluth, P.; Kirby, N.; Riley, D.; Corr, C. S.

    2018-06-01

    To determine the effect of pre-existing defects on helium-vacancy cluster nucleation and growth, tungsten samples were self-implanted with 1 MeV tungsten ions at varying fluences to induce radiation damage, then subsequently exposed to helium plasma in the MAGPIE linear plasma device. Positron annihilation lifetime spectroscopy was performed both immediately after self-implantation and again after plasma exposure. After self-implantation, vacancy clusters were not observed near the sample surface (<30 nm). At greater depths (30–150 nm) vacancy clusters formed and were found to increase in size with increasing W-ion fluence. After helium plasma exposure in the MAGPIE linear plasma device at ~300 K with a fluence of 10²³ He m⁻², deep (30–150 nm) vacancy clusters showed similar positron lifetimes, while shallow (<30 nm) clusters were not observed. The intensity of positron lifetime signals fell for most samples after plasma exposure, indicating that defects were filling with helium. The absence of shallow clusters indicates that helium requires pre-existing defects in order to drive vacancy cluster growth at 300 K. Further samples that had not been pre-damaged with W-ions were also exposed to helium plasma in MAGPIE across fluences from 1×10²² to 1.2×10²⁴ He m⁻². Samples exposed to fluences up to 1×10²³ He m⁻² showed no signs of damage. Fluences of 5×10²³ He m⁻² and higher produced significant helium-cluster formation within the first 30 nm, with positron lifetimes in the vicinity of 0.5–0.6 ns. The sample temperature was significantly higher for these higher-fluence exposures (~400 K) due to plasma heating. This higher temperature likely enhanced bubble formation by significantly increasing the rate at which interstitial helium clusters generate vacancies, which we suspect is the rate-limiting step for helium-vacancy cluster/bubble nucleation in the absence of pre-existing defects.

  18. Formation of metallic clusters in oxide insulators by means of ion beam mixing

    NASA Astrophysics Data System (ADS)

    Talut, G.; Potzger, K.; Mücklich, A.; Zhou, Shengqiang

    2008-04-01

    The intermixing and near-interface cluster formation of Pt and FePt thin films deposited on different oxide surfaces by means of Pt+ ion irradiation and subsequent annealing was investigated. Irradiated as well as post-annealed samples were investigated using high-resolution transmission electron microscopy. In MgO and Y:ZrO2 covered with Pt, crystalline clusters with mean sizes of 2 and 3.5 nm were found after Pt+ irradiation with 8×10¹⁵ and 2×10¹⁶ cm⁻² and subsequent annealing, respectively. In MgO samples covered with FePt, clusters with mean sizes of 1 and 2 nm were found after Pt+ irradiation with 8×10¹⁵ and 2×10¹⁶ cm⁻² and subsequent annealing, respectively. In Y:ZrO2 samples covered with FePt, clusters up to 5 nm in size were found after Pt+ irradiation with 2×10¹⁶ cm⁻² and subsequent annealing. In LaAlO3 the irradiation was accompanied by full amorphization of the host matrix and the appearance of embedded clusters of different sizes. The determination of the lattice constant, and thus the identity of the clusters, in samples covered by FePt was hindered by strong deflection of the electron beam by the ferromagnetic FePt.

  19. the-wizz: clustering redshift estimation for everyone

    NASA Astrophysics Data System (ADS)

    Morrison, C. B.; Hildebrandt, H.; Schmidt, S. J.; Baldry, I. K.; Bilicki, M.; Choi, A.; Erben, T.; Schneider, P.

    2017-05-01

    We present the-wizz, an open-source and user-friendly software package for estimating the redshift distributions of photometric galaxies with unknown redshifts by spatially cross-correlating them against a reference sample with known redshifts. The main benefit of the-wizz is that it separates the angular pair finding and correlation estimation from the computation of the output clustering redshifts, allowing anyone to create a clustering redshift for their sample without the intervention of an 'expert'. It allows the end user of a given survey to select any subsample of photometric galaxies with unknown redshifts, match this sample's catalogue indices into a value-added data file, and produce a clustering redshift estimate for this sample in a fraction of the time it would take to run all the angular correlations needed to produce a clustering redshift. We show results with this software using photometric data from the Kilo-Degree Survey (KiDS) and spectroscopic redshifts from the Galaxy and Mass Assembly survey and the Sloan Digital Sky Survey. The results we present for KiDS are consistent with the redshift distributions used in a recent cosmic shear analysis from the survey. We also present results using a hybrid machine-learning-clustering redshift analysis that enables the estimation of clustering redshifts for individual galaxies. the-wizz can be downloaded at http://github.com/morriscb/The-wiZZ/.

  20. The Hip Impact Protection Project: Design and Methods

    PubMed Central

    Barton, Bruce A; Birge, Stanley J; Magaziner, Jay; Zimmerman, Sheryl; Ball, Linda; Brown, Kathleen M; Kiel, Douglas P

    2013-01-01

    Background: Nearly 340,000 hip fractures occur each year in the U.S. With current demographic trends, the number of hip fractures is expected to at least double in the next 40 years. Purpose: The Hip Impact Protection Project (HIP PRO) was designed to investigate the efficacy and safety of hip protectors in an elderly nursing home population. This paper describes the innovative clustered matched-pair research design used in HIP PRO to overcome the inherent limitations of clustered randomization. Methods: Three clinical centers recruited 37 nursing homes to participate in HIP PRO. They were randomized so that the participating residents in that home received hip protectors for either the right or left hip. Informed consent was obtained from either the resident or the resident's responsible party. The target sample size was 580 residents, with replacement if they dropped out, had a hip fracture, or died. One of the advantages of the HIP PRO study design was that each resident was his/her own case and control, eliminating imbalances, and there was no confusion over which residents wore pads (or on which hip). Limitations: Generalizability of the findings may be limited. Adherence was higher in this study than in other studies because of (1) the use of a run-in period, (2) staff incentives, and (3) the frequency of adherence assessments. The use of a single pad is not analogous to pad use in the real world and may have caused unanticipated changes in behavior. Fall assessment was not feasible, limiting the ability to analyze fractures as a function of falls. Finally, hip protector designs continue to evolve, so the results generated using this pad may not be applicable to other pad designs. However, information about factors related to adherence will be useful for future studies.
Conclusions: The clustered matched-pair study design avoided the major problem with previous cluster-randomized investigations of this question – unbalanced risk factors between the experimental group and the control group. Because each resident served as his/her own control, the effects of unbalanced risk factors on the treatment effect were virtually eliminated. In addition, the use of frequent adherence assessments allowed us to study the effect of various demographic and environmental factors on adherence, which was vital for the assessment of efficacy. PMID:18697849

  1. 75 FR 7464 - Energy Efficient Building Systems Regional Innovation Cluster Initiative-Joint Federal Funding...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-19

    ... DEPARTMENT OF ENERGY Energy Efficient Building Systems Regional Innovation Cluster Initiative... Energy Efficient Building Systems Regional Innovation Cluster Initiative. A single proposal submitted by... systems design. The DOE funded Energy Efficient Building Systems Design Hub (the ``Hub'') will serve as a...

  2. Effect Sizes in Three-Level Cluster-Randomized Experiments

    ERIC Educational Resources Information Center

    Hedges, Larry V.

    2011-01-01

    Research designs involving cluster randomization are becoming increasingly important in educational and behavioral research. Many of these designs involve two levels of clustering or nesting (students within classes and classes within schools). Researchers would like to compute effect size indexes based on the standardized mean difference to…

  3. S-CNN: Subcategory-aware convolutional networks for object detection.

    PubMed

    Chen, Tao; Lu, Shijian; Fan, Jiayuan

    2017-09-26

    The marriage between the deep convolutional neural network (CNN) and region proposals has made breakthroughs for object detection in recent years. While the discriminative object features are learned via a deep CNN for classification, the large intra-class variation and deformation still limit the performance of the CNN based object detection. We propose a subcategory-aware CNN (S-CNN) to solve the object intra-class variation problem. In the proposed technique, the training samples are first grouped into multiple subcategories automatically through a novel instance sharing maximum margin clustering process. A multi-component Aggregated Channel Feature (ACF) detector is then trained to produce more latent training samples, where each ACF component corresponds to one clustered subcategory. The produced latent samples together with their subcategory labels are further fed into a CNN classifier to filter out false proposals for object detection. An iterative learning algorithm is designed for the joint optimization of image subcategorization, multi-component ACF detector, and subcategory-aware CNN classifier. Experiments on INRIA Person dataset, Pascal VOC 2007 dataset and MS COCO dataset show that the proposed technique clearly outperforms the state-of-the-art methods for generic object detection.

  4. Technical support for creating an artificial intelligence system for feature extraction and experimental design

    NASA Technical Reports Server (NTRS)

    Glick, B. J.

    1985-01-01

    Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed, and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.

  5. Clinical interpretation of the Spinal Cord Injury Functional Index (SCI-FI)

    PubMed Central

    Fyffe, Denise; Kalpakjian, Claire Z.; Slavin, Mary; Kisala, Pamela; Ni, Pengsheng; Kirshblum, Steven C.; Tulsky, David S.; Jette, Alan M.

    2016-01-01

    Objective: To provide validation of functional ability levels for the Spinal Cord Injury – Functional Index (SCI-FI). Design: Cross-sectional. Setting: Inpatient rehabilitation hospital and community settings. Participants: A sample of 855 individuals with traumatic spinal cord injury enrolled in 6 rehabilitation centers participating in the National Spinal Cord Injury Model Systems Network. Interventions: Not Applicable. Main Outcome Measures: Spinal Cord Injury-Functional Index (SCI-FI). Results: Cluster analyses identified three distinct groups that represent low, mid-range and high SCI-FI functional ability levels. Comparison of clusters on personal and other injury characteristics suggested some significant differences between groups. Conclusions: These results strongly support the use of SCI-FI functional ability levels to document the perceived functional abilities of persons with SCI. Results of the cluster analysis suggest that the SCI-FI functional ability levels capture function by injury characteristics. Clinical implications regarding tracking functional activity trajectories during follow-up visits are discussed. PMID:26781769

  6. Genetic characterization of Vibrio vulnificus strains isolated from oyster samples in Mexico.

    PubMed

    Guerrero, Abraham; Gómez Gil Rodríguez, Bruno; Wong-Chang, Irma; Lizárraga-Partida, Marcial Leonardo

    2015-01-01

    Vibrio vulnificus strains were isolated from oysters that were collected at the main seafood market in Mexico City. Strains were characterized with regard to vvhA, vcg genotype, PFGE, multilocus sequence typing (MLST), and rtxA1. Analyses included a comparison with rtxA1 reference sequences. Environmental (vcgE) and clinical (vcgC) genotypes were isolated at nearly equal percentages. PFGE had high heterogeneity, but the strains clustered by vcgE or vcgC genotype. Select housekeeping genes for MLST and primers that were designed for rtxA1 domains divided the strains into two clusters according to the E or C genotype. Reference rtxA1 sequences and those from this study were also clustered according to genotype. These results confirm that this genetic dimorphism is not limited to vcg genotyping, as other studies have reported. Some environmental C genotype strains had high similarity to reference strains, which have been reported to be virulent, indicating a potential risk for oyster consumers in Mexico City.

  7. Efficacy of a strategy for implementing a guideline for the control of cardiovascular risk in a primary healthcare setting: the SIRVA2 study a controlled, blinded community intervention trial randomised by clusters

    PubMed Central

    2011-01-01

    This work describes the methodology used to assess a strategy for implementing clinical practice guidelines (CPG) for cardiovascular risk control in a health area of Madrid. Background: The effects of introducing CPGs on clinical practice have been little studied in Spain. The strategy used to implement a CPG is known to influence its final use. Strategies based on the involvement of opinion leaders and that are easily executed appear to be among the most successful. Aim: The main aim of the present work was to compare the effectiveness of two strategies for implementing a CPG designed to reduce cardiovascular risk in the primary healthcare setting, measured in terms of improvements in the recording of calculated cardiovascular risk or specific risk factors in patients' medical records, the control of cardiovascular risk factors, and the incidence of cardiovascular events. Methods: This study involved a controlled, blinded community intervention in which the 21 health centres of the Number 2 Health Area of Madrid were randomly assigned by clusters to either a proposed CPG implementation strategy to reduce cardiovascular risk or the normal dissemination strategy. The study subjects were patients ≥ 45 years of age whose health cards showed them to belong to the studied health area. The main variable examined was the proportion of patients whose medical histories included a calculation of their cardiovascular risk or explicitly mentioned the presence of the variables necessary for its calculation. The sample size was calculated for a comparison of proportions with alpha = 0.05 and beta = 0.20, assuming that the intervention would lead to a 15% increase in the measured variables. Corrections were made for the design effect, assigning each cluster a sample size proportional to the population served by the corresponding health centre, and assuming losses of 20%. This demanded a final sample size of 620 patients.
Data were analysed using summary measures for each cluster, both for making estimates and for hypothesis testing. Analysis of the variables was made on an intention-to-treat basis. Trial Registration: ClinicalTrials.gov NCT01270022. PMID:21504570
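The sample-size calculation described above (a comparison of proportions with alpha = 0.05 and beta = 0.20, corrected for the design effect and 20% losses) follows a standard pattern that can be sketched as below. The proportions and design effect in the example are hypothetical placeholders, not the SIRVA2 protocol's values, and the exact formula the authors used is not stated in the abstract.

```python
from math import ceil
from statistics import NormalDist

def cluster_trial_n(p1, p2, alpha=0.05, beta=0.20, deff=1.0, loss=0.0):
    """Per-arm sample size for comparing two proportions, inflated by a
    design effect for cluster randomization and by expected attrition.
    Illustrative sketch of the standard calculation only."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_b = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    n = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    n *= deff                # inflate for within-cluster correlation
    n /= (1 - loss)          # inflate for expected losses
    return ceil(n)

# Hypothetical example: detect a rise from 20% to 35% in risk recording,
# with a design effect of 1.5 and 20% losses.
print(cluster_trial_n(0.20, 0.35, deff=1.5, loss=0.20))
```

The design effect itself is commonly taken as 1 + (m - 1)ρ for mean cluster size m and intraclass correlation ρ, which is why cluster trials need larger samples than individually randomized ones.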

  8. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    PubMed

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  9. A Comparison of Seventh Grade Thai Students' Reading Comprehension and Motivation to Read English through Applied Instruction Based on the Genre-Based Approach and the Teacher's Manual

    ERIC Educational Resources Information Center

    Sawangsamutchai, Yutthasak; Rattanavich, Saowalak

    2016-01-01

    The objective of this research is to compare the English reading comprehension and motivation to read of seventh grade Thai students taught with applied instruction through the genre-based approach and teachers' manual. A randomized pre-test post-test control group design was used through the cluster random sampling technique. The data were…

  10. Aircraft ride quality controller design using new robust root clustering theory for linear uncertain systems

    NASA Technical Reports Server (NTRS)

    Yedavalli, R. K.

    1992-01-01

    The aspect of controller design for improving the ride quality of aircraft, in terms of damping ratio and natural frequency specifications on the short-period dynamics, is addressed. The controller is designed to be robust with respect to uncertainties in the real parameters of the control design model, such as uncertainties in the dimensional stability derivatives, imperfections in actuator/sensor locations, and possibly variations in flight conditions. The design is based on a new robust root clustering theory developed by the author by extending the nominal root clustering theory of Gutman and Jury to perturbed matrices. The proposed methodology yields an explicit relationship between the parameters of the root clustering region and the uncertainty radius of the parameter space. The existing literature on robust stability becomes a special case of this unified theory. The bounds derived on the parameter perturbations for robust root clustering are then used in selecting the robust controller.

  11. A comparison of confidence interval methods for the intraclass correlation coefficient in community-based cluster randomization trials with a binary outcome.

    PubMed

    Braschel, Melissa C; Svec, Ivana; Darlington, Gerarda A; Donner, Allan

    2016-04-01

    Many investigators rely on previously published point estimates of the intraclass correlation coefficient rather than on their associated confidence intervals to determine the required size of a newly planned cluster randomized trial. Although confidence interval methods for the intraclass correlation coefficient that can be applied to community-based trials have been developed for a continuous outcome variable, fewer methods exist for a binary outcome variable. The aim of this study is to evaluate confidence interval methods for the intraclass correlation coefficient applied to binary outcomes in community intervention trials enrolling a small number of large clusters. Existing methods for confidence interval construction are examined and compared to a new ad hoc approach based on dividing clusters into a large number of smaller sub-clusters and subsequently applying existing methods to the resulting data. Monte Carlo simulation is used to assess the width and coverage of confidence intervals for the intraclass correlation coefficient based on Smith's large sample approximation of the standard error of the one-way analysis of variance estimator, an inverted modified Wald test for the Fleiss-Cuzick estimator, and intervals constructed using a bootstrap-t applied to a variance-stabilizing transformation of the intraclass correlation coefficient estimate. In addition, a new approach is applied in which clusters are randomly divided into a large number of smaller sub-clusters with the same methods applied to these data (with the exception of the bootstrap-t interval, which assumes large cluster sizes). These methods are also applied to a cluster randomized trial on adolescent tobacco use for illustration. When applied to a binary outcome variable in a small number of large clusters, existing confidence interval methods for the intraclass correlation coefficient provide poor coverage. 
However, confidence intervals constructed using the new approach combined with Smith's method provide nominal or close to nominal coverage when the intraclass correlation coefficient is small (<0.05), as is the case in most community intervention trials. This study concludes that when a binary outcome variable is measured in a small number of large clusters, confidence intervals for the intraclass correlation coefficient may be constructed by dividing existing clusters into sub-clusters (e.g. groups of 5) and using Smith's method. The resulting confidence intervals provide nominal or close to nominal coverage across a wide range of parameters when the intraclass correlation coefficient is small (<0.05). Application of this method should provide investigators with a better understanding of the uncertainty associated with a point estimator of the intraclass correlation coefficient used for determining the sample size needed for a newly designed community-based trial. © The Author(s) 2015.
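For reference, the one-way ANOVA point estimator of the intraclass correlation coefficient that underlies Smith's large-sample approximation can be sketched for equal cluster sizes. This is the generic ANOVA estimator only, not Smith's confidence-interval construction or the sub-cluster splitting procedure evaluated above; the data are illustrative.

```python
import numpy as np

def anova_icc(clusters):
    """One-way ANOVA estimator of the intraclass correlation coefficient
    for k equal-sized clusters: ICC = (MSB - MSW) / (MSB + (m-1)*MSW).
    A textbook sketch, not the study's confidence-interval method."""
    data = np.asarray(clusters, dtype=float)
    k, m = data.shape                      # k clusters of size m
    grand = data.mean()
    # Between-cluster and within-cluster mean squares.
    msb = m * ((data.mean(axis=1) - grand) ** 2).sum() / (k - 1)
    msw = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Strong within-cluster similarity gives an ICC near 1.
icc = anova_icc([[1, 1.1, 0.9], [5, 5.2, 4.8], [9, 9.1, 8.9]])
print(round(icc, 3))
```

The sub-cluster approach described above would randomly partition each large cluster into small groups (e.g. of 5) and apply an estimator of this family to the partitioned data.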

  12. The properties of the disk system of globular clusters

    NASA Technical Reports Server (NTRS)

    Armandroff, Taft E.

    1989-01-01

    A large refined data sample is used to study the properties and origin of the disk system of globular clusters. A scale height for the disk cluster system of 800-1500 pc is found, consistent with scale-height determinations for samples of field stars identified with the Galactic thick disk. A rotational velocity of 193 ± 29 km/s and a line-of-sight velocity dispersion of 59 ± 14 km/s have been found for the metal-rich clusters.

  13. The X-ray luminosity functions of Abell clusters from the Einstein Cluster Survey

    NASA Technical Reports Server (NTRS)

    Burg, R.; Giacconi, R.; Forman, W.; Jones, C.

    1994-01-01

    We have derived the present epoch X-ray luminosity function of northern Abell clusters using luminosities from the Einstein Cluster Survey. The sample is sufficiently large that we can determine the luminosity function for each richness class separately with sufficient precision to study and compare the different luminosity functions. We find that, within each richness class, the range of X-ray luminosity is quite large and spans nearly a factor of 25. Characterizing the luminosity function for each richness class with a Schechter function, we find that the characteristic X-ray luminosity, L*, scales with richness class as L* ∝ N*^γ, where N* is the corrected mean number of galaxies in a richness class, and the best-fitting exponent is γ = 1.3 ± 0.4. Finally, our analysis suggests that there is a lower limit to the X-ray luminosity of clusters, determined by the integrated emission of the cluster member galaxies, which also scales with richness class. The present sample forms a baseline for testing cosmological evolution of Abell-like clusters when an appropriate high-redshift cluster sample becomes available.
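
    The Schechter parameterization and the richness scaling described above can be written down directly. In this sketch the normalizations (phi_star, L0) are arbitrary placeholders; only the exponent γ = 1.3 is taken from the abstract.

```python
import numpy as np

def schechter(L, L_star, phi_star, alpha):
    """Schechter luminosity function dN/dL = (phi*/L*) (L/L*)^alpha exp(-L/L*)."""
    x = L / L_star
    return (phi_star / L_star) * x ** alpha * np.exp(-x)

def L_star_of_richness(N_star, gamma=1.3, L0=1.0):
    """Characteristic luminosity scaling with corrected richness,
    L* ∝ N*^gamma (the normalization L0 is an arbitrary placeholder)."""
    return L0 * N_star ** gamma

# Doubling the corrected richness raises L* by a factor of 2^1.3
print(L_star_of_richness(2.0) / L_star_of_richness(1.0))
print(f"phi(L = 0.5 L*) = {schechter(0.5, 1.0, 1.0, -1.2):.3f}")
```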

  14. An Archival Search For Young Globular Clusters in Galaxies

    NASA Astrophysics Data System (ADS)

    Whitmore, Brad

    1995-07-01

    One of the most intriguing results from HST has been the discovery of ultraluminous star clusters in interacting and merging galaxies. These clusters have the luminosities, colors, and sizes that would be expected of young globular clusters produced by the interaction. We propose to use the data in the HST Archive to determine how prevalent this phenomenon is, and to determine whether similar clusters are produced in other environments. Three samples will be extracted and studied in a systematic and consistent manner: 1) interacting and merging galaxies, 2) starburst galaxies, 3) a control sample of ``normal'' galaxies. A preliminary search of the archives shows that there are at least 20 galaxies in each of these samples, and the number will grow by about 50 as new observations become available. The data will be used to determine the luminosity function, color histogram, spatial distribution, and structural properties of the clusters using the same techniques employed in our study of NGC 7252 (the ``Atoms-for-Peace'' galaxy) and NGC 4038/4039 (``The Antennae''). Our ultimate goals are: 1) to understand how globular clusters form, and 2) to use the clusters as evolutionary tracers to unravel the histories of interacting galaxies.

  15. Herschel And Alma Observations Of The Ism In Massive High-Redshift Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Wu, John F.; Aguirre, Paula; Baker, Andrew J.; Devlin, Mark J.; Hilton, Matt; Hughes, John P.; Infante, Leopoldo; Lindner, Robert R.; Sifón, Cristóbal

    2017-06-01

    The Sunyaev-Zel'dovich effect (SZE) can be used to select samples of galaxy clusters that are essentially mass-limited out to arbitrarily high redshifts. I will present results from an investigation of the star formation properties of galaxies in four massive clusters, extending to z ~ 1, which were selected on the basis of their SZE decrements in the Atacama Cosmology Telescope (ACT) survey. All four clusters have been imaged with Herschel/PACS (tracing star formation rate) and two with ALMA (tracing dust and cold gas mass); newly discovered ALMA CO(4-3) and [CI] line detections expand an already large sample of spectroscopically confirmed cluster members. Star formation rate appears to anti-correlate with environmental density, but this trend vanishes after controlling for stellar mass. Elevated star formation and higher CO excitation are seen in "El Gordo," a violent cluster merger, relative to a virialized cluster at a similar high redshift (z ~ 1). Also exploiting ATCA 2.1 GHz observations to identify radio-loud active galactic nuclei (AGN) in our sample, I will use these data to develop a coherent picture of how environment influences galaxies' ISM properties and evolution in the most massive clusters at early cosmic times.

  16. Identifying optimal threshold statistics for elimination of hookworm using a stochastic simulation model.

    PubMed

    Truscott, James E; Werkman, Marleen; Wright, James E; Farrell, Sam H; Sarkar, Rajiv; Ásbjörnsdóttir, Kristjana; Anderson, Roy M

    2017-06-30

    There is an increased focus on whether mass drug administration (MDA) programmes alone can interrupt the transmission of soil-transmitted helminths (STH). Mathematical models can be used to simulate these interventions and are increasingly being used to inform investigators about expected trial outcomes and the choice of optimal study design. One key factor is the choice of threshold for detecting elimination; however, no such thresholds are currently defined for STH with regard to breaking transmission. We develop a simulation of an elimination study, based on the DeWorm3 project, using an individual-based stochastic disease transmission model in conjunction with models of MDA, sampling, diagnostics and the construction of study clusters. The simulation is then used to analyse the relationship between the study end-point elimination threshold and whether elimination is achieved in the long term within the model. We analyse the quality of a range of statistics in terms of their positive predictive value (PPV) and how it depends on a range of covariates, including threshold value, baseline prevalence, measurement time point and how clusters are constructed. End-point infection prevalence performs well in discriminating between villages that achieve interruption of transmission and those that do not, although the quality of the threshold is sensitive to baseline prevalence and threshold value. The optimal post-treatment prevalence threshold for determining elimination is 2% or less when the baseline prevalence range is broad. For multiple clusters of communities, both the probability of elimination and the ability of thresholds to detect it depend strongly on the size of the cluster and the size distribution of its constituent communities. The number of communities in a cluster is a key indicator of the probability of elimination and of PPV. 
Extending the time after the study endpoint at which the threshold statistic is measured improves the PPV for discriminating between eliminating clusters and those that bounce back. The probability of elimination and the PPV are very sensitive to the baseline prevalence of individual communities. However, most studies and programmes are constructed on the basis of clusters. Since elimination occurs within smaller population sub-units, the construction of clusters introduces new sensitivities of elimination threshold values to cluster size and the underlying population structure. Study simulation offers an opportunity to investigate key sources of sensitivity for elimination studies and programme designs in advance, and to tailor interventions to prevailing local or national conditions.
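
    A toy simulation can illustrate how the PPV of an end-point prevalence threshold is computed. The dynamics below are a deliberately crude stand-in for the individual-based transmission model, and every parameter value is hypothetical; the point is only the threshold-versus-PPV logic (classify communities by the end-point statistic, then check which ones truly eliminate in the long run).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_community(p0, rounds=3, coverage=0.6, efficacy=0.85,
                       breakpoint=0.01, r=0.5, years=10):
    """Toy prevalence dynamics: each MDA round cuts prevalence by a factor
    (1 - coverage * efficacy); afterwards prevalence bounces back
    logistically toward baseline unless it has fallen below a transmission
    breakpoint, in which case it decays to zero.
    Returns (end-of-study prevalence, eliminated in the long run)."""
    p = p0
    for _ in range(rounds):
        p *= 1 - coverage * efficacy
    p_end = p                                  # measured at the study endpoint
    for _ in range(years):                     # long-term follow-up
        if p < breakpoint:
            p *= 0.5                           # below breakpoint: dies out
        else:
            p += r * p * (1 - p / p0)          # bounce-back toward baseline
    return p_end, p < 1e-4

def ppv(threshold, n=2000):
    """PPV of the rule 'end-point prevalence < threshold' for elimination."""
    below = eliminated = 0
    for _ in range(n):
        p0 = rng.uniform(0.05, 0.6)            # broad baseline prevalence range
        p_end, elim = simulate_community(p0)
        if p_end + rng.normal(0, 0.005) < threshold:   # survey sampling error
            below += 1
            eliminated += elim
    return eliminated / max(below, 1)

ppv_2pct, ppv_10pct = ppv(0.02), ppv(0.10)
print(f"PPV at 2% threshold: {ppv_2pct:.2f}")
print(f"PPV at 10% threshold: {ppv_10pct:.2f}")
```

    Even in this crude sketch, the stricter 2% threshold yields a higher PPV than a lax 10% one, mirroring the abstract's finding that low thresholds discriminate better when baseline prevalence varies widely.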

  17. Weak-lensing mass calibration of the Atacama Cosmology Telescope equatorial Sunyaev-Zeldovich cluster sample with the Canada-France-Hawaii telescope stripe 82 survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Battaglia, N.; Miyatake, H.; Hasselfield, M.

    Mass calibration uncertainty is the largest systematic effect for using clusters of galaxies to constrain cosmological parameters. We present weak-lensing mass measurements from the Canada-France-Hawaii Telescope Stripe 82 Survey for galaxy clusters selected through their high signal-to-noise thermal Sunyaev-Zeldovich (tSZ) signal measured with the Atacama Cosmology Telescope (ACT). For a sample of 9 ACT clusters with a tSZ signal-to-noise greater than five, the average weak-lensing mass is (4.8 ± 0.8) × 10^14 M_⊙, consistent with the tSZ mass estimate of (4.70 ± 1.0) × 10^14 M_⊙, which assumes a universal pressure profile for the cluster gas. Our results are consistent with previous weak-lensing measurements of tSZ-detected clusters from the Planck satellite. When comparing our results, we estimate the Eddington bias correction for the sample intersection of Planck and weak-lensing clusters, which was previously excluded.

  18. Implementation of authentic assessment in the project based learning to improve student's concept mastering

    NASA Astrophysics Data System (ADS)

    Sambeka, Yana; Nahadi, Sriyati, Siti

    2017-05-01

    The study aimed to obtain scientific information about the increase in students' concept mastering in project based learning that used authentic assessment. The research was conducted in May 2016 at a junior high school in Bandung in the 2015/2016 academic year. The research method was a weak experiment with a one-group pretest-posttest design. The sample of 24 students was drawn by the random cluster sampling technique. Data were collected through instruments, i.e. a written test, an observation sheet, and a questionnaire sheet. The students' concept mastering test yielded an N-Gain of 0.236, in the low category. The result of a paired-sample t-test showed that implementation of authentic assessment in project based learning increased students' concept mastering significantly (sig. < 0.05).
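
    The analysis pipeline (normalized gain followed by a paired-sample t-test) can be sketched with hypothetical scores. Only the reported mean N-Gain of 0.236 and the sample size of 24 are taken from the abstract; the score distributions below are invented for illustration.

```python
import numpy as np
from scipy import stats

def n_gain(pre, post, max_score=100):
    """Hake's normalized gain <g> = (post - pre) / (max - pre)."""
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    return (post - pre) / (max_score - pre)

def gain_category(g):
    """Conventional Hake categories: low < 0.3 <= medium < 0.7 <= high."""
    return "low" if g < 0.3 else ("medium" if g < 0.7 else "high")

# Hypothetical pre/post scores for 24 students, built so the mean
# normalized gain lands near the reported 0.236
rng = np.random.default_rng(7)
pre = rng.uniform(30, 60, size=24)
post = pre + 0.236 * (100 - pre) + rng.normal(0, 2, size=24)

g_mean = n_gain(pre, post).mean()
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"<g> = {g_mean:.3f} ({gain_category(g_mean)}), p = {p_value:.2e}")
```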

  19. The Morphologies and Alignments of Gas, Mass, and the Central Galaxies of CLASH Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Donahue, Megan; Ettori, Stefano; Rasia, Elena; Sayers, Jack; Zitrin, Adi; Meneghetti, Massimo; Voit, G. Mark; Golwala, Sunil; Czakon, Nicole; Yepes, Gustavo; Baldi, Alessandro; Koekemoer, Anton; Postman, Marc

    2016-03-01

    Morphology is often used to infer the state of relaxation of galaxy clusters. The regularity, symmetry, and degree to which a cluster is centrally concentrated inform quantitative measures of cluster morphology. The Cluster Lensing and Supernova survey with Hubble Space Telescope (CLASH) used weak and strong lensing to measure the distribution of matter within a sample of 25 clusters, 20 of which were deemed to be “relaxed” based on their X-ray morphology and the alignment of the X-ray emission with the Brightest Cluster Galaxy. Toward a quantitative characterization of this important sample of clusters, we present uniformly estimated X-ray morphological statistics for all 25 CLASH clusters. We compare X-ray morphologies of CLASH clusters with those identically measured for a large, mass-selected sample of simulated clusters from the MUSIC-2 simulations. We confirm a threshold in X-ray surface brightness concentration of C ≳ 0.4 for cool-core clusters, where C is the ratio of X-ray emission inside 100 h_70^-1 kpc to that inside 500 h_70^-1 kpc. We report and compare morphologies of these clusters inferred from Sunyaev-Zeldovich Effect (SZE) maps of the hot gas and from projected mass maps based on strong and weak lensing. We find strong agreement in the orientations of the major axes of the lensing, X-ray, and SZE maps of nearly all of the CLASH clusters at radii of 500 kpc (approximately 1/2 R500 for these clusters). We also find a striking alignment of cluster shapes at the 500 kpc scale, as measured with X-ray, SZE, and lensing, with that of the near-infrared stellar light at 10 kpc scales for the 20 “relaxed” clusters. This strong alignment indicates a powerful coupling between the cluster- and galaxy-scale galaxy formation processes.

  20. Multiwavelength study of X-ray luminous clusters in the Hyper Suprime-Cam Subaru Strategic Program S16A field

    NASA Astrophysics Data System (ADS)

    Miyaoka, Keita; Okabe, Nobuhiro; Kitaguchi, Takao; Oguri, Masamune; Fukazawa, Yasushi; Mandelbaum, Rachel; Medezinski, Elinor; Babazaki, Yasunori; Nishizawa, Atsushi J.; Hamana, Takashi; Lin, Yen-Ting; Akamatsu, Hiroki; Chiu, I.-Non; Fujita, Yutaka; Ichinohe, Yuto; Komiyama, Yutaka; Sasaki, Toru; Takizawa, Motokazu; Ueda, Shutaro; Umetsu, Keiichi; Coupon, Jean; Hikage, Chiaki; Hoshino, Akio; Leauthaud, Alexie; Matsushita, Kyoko; Mitsuishi, Ikuyuki; Miyatake, Hironao; Miyazaki, Satoshi; More, Surhud; Nakazawa, Kazuhiro; Ota, Naomi; Sato, Kousuke; Spergel, David; Tamura, Takayuki; Tanaka, Masayuki; Tanaka, Manobu M.; Utsumi, Yousuke

    2018-01-01

    We present a joint X-ray, optical, and weak-lensing analysis of X-ray luminous galaxy clusters selected from the MCXC (Meta-Catalog of X-Ray Detected Clusters of Galaxies) cluster catalog in the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP) survey field with S16A data. As a pilot study for a series of papers, we measure hydrostatic equilibrium (HE) masses using XMM-Newton data for four clusters in the current coverage area, out of a sample of 22 MCXC clusters. We additionally analyze a non-MCXC cluster associated with one MCXC cluster. We show that HE masses for the MCXC clusters are correlated with cluster richness from the CAMIRA catalog, while that of the non-MCXC cluster deviates from the scaling relation. The mass normalization of the relationship between cluster richness and HE mass is compatible with that inferred by matching CAMIRA cluster abundance with a theoretical halo mass function. The mean gas mass fraction based on HE masses for the MCXC clusters is 0.125 ± 0.012 at spherical overdensity Δ = 500, which is ˜80%-90% of the cosmic mean baryon fraction, Ωb/Ωm, measured by cosmic microwave background experiments. We find that the mean baryon fraction estimated from X-ray and HSC-SSP optical data is comparable to Ωb/Ωm. A weak-lensing shear catalog of background galaxies, combined with photometric redshifts, is currently available only for three clusters in our sample. Hydrostatic equilibrium masses roughly agree with weak-lensing masses, albeit with large uncertainty. This study demonstrates that further multiwavelength study of a large sample of clusters using X-ray, HSC-SSP optical, and weak-lensing data will enable us to understand cluster physics and utilize cluster-based cosmology.

  1. A microfluidic device for label-free, physical capture of circulating tumor cell-clusters

    PubMed Central

    Sarioglu, A. Fatih; Aceto, Nicola; Kojic, Nikola; Donaldson, Maria C.; Zeinali, Mahnaz; Hamza, Bashar; Engstrom, Amanda; Zhu, Huili; Sundaresan, Tilak K.; Miyamoto, David T.; Luo, Xi; Bardia, Aditya; Wittner, Ben S.; Ramaswamy, Sridhar; Shioda, Toshi; Ting, David T.; Stott, Shannon L.; Kapur, Ravi; Maheswaran, Shyamala; Haber, Daniel A.; Toner, Mehmet

    2015-01-01

    Cancer cells metastasize through the bloodstream either as single migratory circulating tumor cells (CTCs) or as multicellular groupings (CTC-clusters). Existing technologies for CTC enrichment are designed primarily to isolate single CTCs, and while CTC-clusters are detectable in some cases, their true prevalence and significance remain to be determined. Here, we developed a microchip technology (Cluster-Chip) specifically designed to capture CTC-clusters independent of tumor-specific markers from unprocessed blood. CTC-clusters are isolated through specialized bifurcating traps under low shear-stress conditions that preserve their integrity and even two-cell clusters are captured efficiently. Using the Cluster-Chip, we identify CTC-clusters in 30–40% of patients with metastatic cancers of the breast, prostate and melanoma. RNA sequencing of CTC-clusters confirms their tumor origin and identifies leukocytes within the clusters as tissue-derived macrophages. Together, the development of a device for efficient capture of CTC-clusters will enable detailed characterization of their biological properties and role in cancer metastasis. PMID:25984697

  2. Mutation Clusters from Cancer Exome.

    PubMed

    Kakushadze, Zura; Yu, Willie

    2017-08-15

    We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally-costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development.
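
    The paper's *K-means is described as "statistically deterministic"; the sketch below is plain Lloyd's k-means on synthetic 96-channel mutation spectra, not the authors' algorithm, and all data are invented. It illustrates only the underlying clustering step that *K-means builds on.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's k-means (the paper's *K-means adds machinery on top
    of repeated runs like this to remove the randomness of the result)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each sample to its nearest center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Synthetic "exome" data: 300 samples x 96 mutation channels, drawn from
# 3 underlying spectra (hypothetical, for illustration only)
rng = np.random.default_rng(3)
spectra = rng.dirichlet(np.ones(96), size=3)
true = rng.integers(0, 3, size=300)
X = np.array([rng.multinomial(500, spectra[t]) for t in true], float)
X /= X.sum(1, keepdims=True)            # normalize counts to mutation fractions

labels, _ = kmeans(X, k=3)
print("cluster sizes:", np.bincount(labels))
```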

  3. Mutation Clusters from Cancer Exome

    PubMed Central

    Kakushadze, Zura; Yu, Willie

    2017-01-01

    We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally-costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development. PMID:28809811

  4. a Snapshot Survey of X-Ray Selected Central Cluster Galaxies

    NASA Astrophysics Data System (ADS)

    Edge, Alastair

    1999-07-01

    Central cluster galaxies are the most massive stellar systems known and have been used as standard candles for many decades. Only recently have central cluster galaxies been recognised to exhibit a wide variety of small-scale (<100 pc) features that can only be reliably detected with HST resolution. The most intriguing of these are dust lanes, which have been detected in many central cluster galaxies. Dust is not expected to survive long in the hostile cluster environment unless shielded by the ISM of a disk galaxy or very dense clouds of cold gas. WFPC2 snapshot images of a representative subset of the central cluster galaxies from an X-ray selected cluster sample would provide important constraints on the formation and evolution of dust in cluster cores that cannot be obtained from ground-based observations. In addition, these images will allow the AGN component, the frequency of multiple nuclei, and the amount of massive-star formation in central cluster galaxies to be assessed. The proposed HST observations would also provide high-resolution images of previously unresolved gravitational arcs in the most massive clusters in our sample, resulting in constraints on the shape of the gravitational potential of these systems. This project will complement our extensive multi-frequency work on this sample, which includes optical spectroscopy and photometry, VLA and X-ray images for the majority of the 210 targets.

  5. Cosmological Constraints from Galaxy Clustering and the Mass-to-number Ratio of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Tinker, Jeremy L.; Sheldon, Erin S.; Wechsler, Risa H.; Becker, Matthew R.; Rozo, Eduardo; Zu, Ying; Weinberg, David H.; Zehavi, Idit; Blanton, Michael R.; Busha, Michael T.; Koester, Benjamin P.

    2012-01-01

    We place constraints on the average density (Ω_m) and clustering amplitude (σ_8) of matter using a combination of two measurements from the Sloan Digital Sky Survey: the galaxy two-point correlation function, w_p(r_p), and the mass-to-galaxy-number ratio within galaxy clusters, M/N, analogous to cluster M/L ratios. Our w_p(r_p) measurements are obtained from DR7, while the sample of clusters is the maxBCG sample, with cluster masses derived from weak gravitational lensing. We construct nonlinear galaxy bias models using the Halo Occupation Distribution (HOD) to fit both w_p(r_p) and M/N for different cosmological parameters. HOD models that match the same two-point clustering predict different numbers of galaxies in massive halos when Ω_m or σ_8 is varied, thereby breaking the degeneracy between cosmology and bias. We demonstrate that this technique yields constraints that are consistent and competitive with current results from cluster abundance studies, without the use of abundance information. Using w_p(r_p) and M/N alone, we find Ω_m^0.5 σ_8 = 0.465 ± 0.026, with individual constraints of Ω_m = 0.29 ± 0.03 and σ_8 = 0.85 ± 0.06. Combined with current cosmic microwave background data, these constraints are Ω_m = 0.290 ± 0.016 and σ_8 = 0.826 ± 0.020. All errors are 1σ. The systematic uncertainties to which the M/N technique is most sensitive are the amplitude of the bias function of dark matter halos and the possibility of redshift evolution between the SDSS Main sample and the maxBCG cluster sample. Our derived constraints are insensitive to the current level of uncertainties in the halo mass function and in the mass-richness relation of clusters and its scatter, making the M/N technique complementary to cluster abundances as a method for constraining cosmology with future galaxy surveys.

  6. Dependence of the clustering properties of galaxies on stellar velocity dispersion in the Main galaxy sample of SDSS DR10

    NASA Astrophysics Data System (ADS)

    Deng, Xin-Fa; Song, Jun; Chen, Yi-Qing; Jiang, Peng; Ding, Ying-Ping

    2014-08-01

    Using two volume-limited Main galaxy samples of the Sloan Digital Sky Survey Data Release 10 (SDSS DR10), we investigate the dependence of the clustering properties of galaxies on stellar velocity dispersion by cluster analysis. We find that in the luminous volume-limited Main galaxy sample, except at r = 1.2, richer and larger systems form more easily in the large stellar velocity dispersion subsample, while in the faint volume-limited Main galaxy sample, at r ≥ 0.9, the opposite trend is observed. From statistical analyses of the multiplicity functions, we conclude that in both volume-limited Main galaxy samples, galaxies with small stellar velocity dispersion preferentially form isolated galaxies, close pairs and small groups, while galaxies with large stellar velocity dispersion preferentially inhabit dense groups and clusters. However, we note a difference between the two samples: in the faint volume-limited Main galaxy sample, at r ≥ 0.9, the small stellar velocity dispersion subsample has a higher proportion of galaxies in superclusters (n ≥ 200) than the large stellar velocity dispersion subsample.

  7. See Change: the Supernova Sample from the Supernova Cosmology Project High Redshift Cluster Supernova Survey

    NASA Astrophysics Data System (ADS)

    Hayden, Brian; Perlmutter, Saul; Boone, Kyle; Nordin, Jakob; Rubin, David; Lidman, Chris; Deustua, Susana E.; Fruchter, Andrew S.; Aldering, Greg Scott; Brodwin, Mark; Cunha, Carlos E.; Eisenhardt, Peter R.; Gonzalez, Anthony H.; Jee, James; Hildebrandt, Hendrik; Hoekstra, Henk; Santos, Joana; Stanford, S. Adam; Stern, Daniel; Fassbender, Rene; Richard, Johan; Rosati, Piero; Wechsler, Risa H.; Muzzin, Adam; Willis, Jon; Boehringer, Hans; Gladders, Michael; Goobar, Ariel; Amanullah, Rahman; Hook, Isobel; Huterer, Dragan; Huang, Xiaosheng; Kim, Alex G.; Kowalski, Marek; Linder, Eric; Pain, Reynald; Saunders, Clare; Suzuki, Nao; Barbary, Kyle H.; Rykoff, Eli S.; Meyers, Joshua; Spadafora, Anthony L.; Sofiatti, Caroline; Wilson, Gillian; Rozo, Eduardo; Hilton, Matt; Ruiz-Lapuente, Pilar; Luther, Kyle; Yen, Mike; Fagrelius, Parker; Dixon, Samantha; Williams, Steven

    2017-01-01

    The Supernova Cosmology Project has finished executing a large (174 orbits, cycles 22-23) Hubble Space Telescope program, which has measured ~30 type Ia supernovae above z~1 in the highest-redshift, most massive galaxy clusters known to date. Our SN Ia sample closely matches our pre-survey predictions; it will improve the constraint on the Dark Energy equation of state above z~1 by a factor of 3, allowing an unprecedented probe of Dark Energy time variation. When combined with the improved cluster mass calibration from gravitational lensing provided by the deep WFC3-IR observations of the clusters, See Change will triple the Dark Energy Task Force Figure of Merit. With the primary observing campaign completed, we present the preliminary supernova sample and our path forward to the supernova cosmology results. We also compare the number of SNe Ia discovered in each cluster with our pre-survey expectations based on cluster mass and SFR estimates. Our extensive HST and ground-based campaign has already produced unique results: we have confirmed several of the highest-redshift cluster members known to date, confirmed the redshift of one of the most massive galaxy clusters at z~1.2 expected across the entire sky, and characterized one of the most extreme starburst environments yet known in a z~1.7 cluster. We have also discovered a lensed SN Ia at z=2.22, magnified by a factor of ~2.7, which is the highest spectroscopic-redshift SN Ia currently known.

  8. OGLE Collection of Star Clusters. New Objects in the Outskirts of the Large Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Sitek, M.; Szymański, M. K.; Skowron, D. M.; Udalski, A.; Kostrzewa-Rutkowska, Z.; Skowron, J.; Karczmarek, P.; Cieślar, M.; Wyrzykowski, Ł.; Kozłowski, S.; Pietrukowicz, P.; Soszyński, I.; Mróz, P.; Pawlak, M.; Poleski, R.; Ulaczyk, K.

    2016-09-01

    The Magellanic System (MS), consisting of the Large Magellanic Cloud (LMC), the Small Magellanic Cloud (SMC) and the Magellanic Bridge (MBR), contains a diverse sample of star clusters. Their spatial distribution, ages and chemical abundances may provide important information about the formation history of the whole System. We use deep photometric maps derived from images collected during the fourth phase of the Optical Gravitational Lensing Experiment (OGLE-IV) to construct the most complete catalog of star clusters in the Large Magellanic Cloud based on homogeneous photometric data. In this paper we present the collection of star clusters found in an area of about 225 square degrees in the outer regions of the LMC. Our sample contains 679 visually identified star cluster candidates, 226 of which were not listed in any previously published catalog. The new clusters are mainly young small open clusters or clusters similar to associations.

  9. Characterization of Oxygen Defect Clusters in UO2+x Using Neutron Scattering and PDF Analysis.

    PubMed

    Ma, Yue; Garcia, Philippe; Lechelle, Jacques; Miard, Audrey; Desgranges, Lionel; Baldinozzi, Gianguido; Simeone, David; Fischer, Henry E

    2018-06-18

    In hyper-stoichiometric uranium oxide, both neutron diffraction work and, more recently, theoretical analyses report the existence of clusters such as the 2:2:2 cluster, comprising two anion vacancies and two types of anion interstitials. However, little is known about whether there exists a region of low deviation from stoichiometry in which defects remain isolated, or whether at high deviation from stoichiometry defect clusters prevail that contain more excess oxygen atoms than the di-interstitial cluster. In this study, we report pair distribution function (PDF) analyses of UO2 and UO2+x (x ≈ 0.007 and x ≈ 0.16) samples obtained from high-temperature in situ neutron scattering experiments. PDF refinement for the lower deviation-from-stoichiometry sample suggests the system is too dilute to differentiate between isolated defects and di-interstitial clusters. For the UO2.16 sample, several defect structures are tested, and the data are best represented assuming the presence of center-occupied cuboctahedra.

  10. A good mass proxy for galaxy clusters with XMM-Newton

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Hai-Hui; Jia, Shu-Mei; Chen, Yong

    2013-12-01

    We use a sample of 39 galaxy clusters at redshift z < 0.1 observed by XMM-Newton to investigate the relations between X-ray observables and total mass. Based on central cooling time and central temperature drop, the clusters in this sample are divided into two groups: 25 cool-core clusters and 14 non-cool-core clusters. We study the scaling relations L_bol-M_500, M_500-T, M_500-M_g, and M_500-Y_X, and also the influence of cool cores on these relations. The results show that the M_500-Y_X relation has a slope close to the standard self-similar value, has the smallest scatter, and does not vary with the cluster sample. Moreover, the M_500-Y_X relation is not affected by the cool core. Thus, the parameter Y_X may be the best mass indicator.
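
    Scaling-relation slopes like those above are typically obtained from a power-law fit in log space. A minimal sketch, using entirely synthetic data: the sample size of 39 matches the abstract, and the slope of 3/5 is the standard self-similar expectation for an M-Y_X relation; the normalization and scatter are arbitrary.

```python
import numpy as np

def fit_power_law(x, y):
    """Least-squares fit of y = A * x^B in log10 space; returns (A, B)."""
    B, logA = np.polyfit(np.log10(x), np.log10(y), 1)
    return 10 ** logA, B

# Synthetic cluster sample: M500 ∝ Y_X^(3/5) with lognormal scatter
rng = np.random.default_rng(5)
Yx = 10 ** rng.uniform(13, 15, size=39)          # Y_X proxy, arbitrary units
M500 = 2e4 * Yx ** 0.6 * 10 ** rng.normal(0, 0.04, size=39)

A, B = fit_power_law(Yx, M500)
print(f"best-fit slope B = {B:.3f} (self-similar expectation: 0.6)")
```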

  11. Novel approaches to pin cluster synchronization on complex dynamical networks in Lur'e forms

    NASA Astrophysics Data System (ADS)

    Tang, Ze; Park, Ju H.; Feng, Jianwen

    2018-04-01

    This paper investigates the cluster synchronization of complex dynamical networks composed of identical or nonidentical Lur'e systems. Owing to the special topology of the complex networks and the existence of stochastic perturbations, a randomly occurring pinning controller is designed that not only synchronizes all Lur'e systems in the same cluster but also reduces the negative influence among different clusters. First, based on an extended integral inequality, the convex combination theorem and the S-procedure, conditions for cluster synchronization of identical Lur'e networks are derived in a convex domain. Second, randomly occurring adaptive pinning controllers with two independent Bernoulli stochastic variables are designed, and sufficient conditions are obtained for cluster synchronization of complex networks composed of nonidentical Lur'e systems. In addition, suitable control gains for successful cluster synchronization of nonidentical Lur'e networks are obtained by designing adaptive updating laws. Finally, we present two numerical examples to demonstrate the validity of the control scheme and the theoretical analysis.

  12. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

    PubMed

    Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

    2014-01-01

    Meta-analysis of previous studies evaluating associations between the content of the elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples came from 45 Greenlandic Inuit, median age 60 years, and from 71 Danes, median age 61 years; cirrhotic liver samples came from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis created a dual dendrogram: one clustering subjects according to calculated similarities in element content, the other clustering elements according to correlation coefficients between the element contents, both using Euclidean distance and the Ward procedure. The first dendrogram separated subjects into 7 clusters showing no differences in ethnicity, gender or age; the analysis discriminated between elements in normal and cirrhotic livers. The second dendrogram grouped the elements into four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; and iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; and Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse the content of elements separately one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.
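
    The dual procedure (one dendrogram over subjects, one over elements via their correlation structure) can be sketched with SciPy. The data below are synthetic: the induced S-Cl and Cu-Br correlations are illustrative stand-ins for the associations the study reports, not its measured values.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(11)
elements = ["S", "Cl", "K", "Fe", "Cu", "Zn", "Br"]
# Hypothetical element contents: 100 liver samples (rows) x 7 elements (cols)
X = rng.lognormal(mean=0.0, sigma=0.3, size=(100, 7))
X[:, 1] += 0.8 * X[:, 0]        # make Cl track S, as reported
X[:, 6] += 0.8 * X[:, 4]        # make Br track Cu, as reported

# Dendrogram 1: cluster subjects by Euclidean distance with Ward linkage
Z_samples = linkage(X, method="ward", metric="euclidean")
sample_clusters = fcluster(Z_samples, t=7, criterion="maxclust")

# Dendrogram 2: cluster elements using correlation distance (1 - r)
D = 1 - np.corrcoef(X.T)
Z_elements = linkage(squareform(D, checks=False), method="ward")
element_clusters = fcluster(Z_elements, t=4, criterion="maxclust")

for el, c in zip(elements, element_clusters):
    print(el, "-> cluster", c)
```

    Cutting each tree with `fcluster(..., criterion="maxclust")` reproduces the study's structure: 7 subject clusters and 4 element clusters, with the correlated element pairs landing in the same cluster.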

  13. The MUSIC of CLASH: Predictions on the Concentration-Mass Relation

    NASA Astrophysics Data System (ADS)

    Meneghetti, M.; Rasia, E.; Vega, J.; Merten, J.; Postman, M.; Yepes, G.; Sembolini, F.; Donahue, M.; Ettori, S.; Umetsu, K.; Balestra, I.; Bartelmann, M.; Benítez, N.; Biviano, A.; Bouwens, R.; Bradley, L.; Broadhurst, T.; Coe, D.; Czakon, N.; De Petris, M.; Ford, H.; Giocoli, C.; Gottlöber, S.; Grillo, C.; Infante, L.; Jouvel, S.; Kelson, D.; Koekemoer, A.; Lahav, O.; Lemze, D.; Medezinski, E.; Melchior, P.; Mercurio, A.; Molino, A.; Moscardini, L.; Monna, A.; Moustakas, J.; Moustakas, L. A.; Nonino, M.; Rhodes, J.; Rosati, P.; Sayers, J.; Seitz, S.; Zheng, W.; Zitrin, A.

    2014-12-01

    We present an analysis of the MUSIC-2 N-body/hydrodynamical simulations aimed at estimating the expected concentration-mass relation for the CLASH (Cluster Lensing and Supernova Survey with Hubble) cluster sample. We study nearly 1,400 halos simulated at high spatial and mass resolution. We study the shape of both their density and surface-density profiles and fit them with a variety of radial functions, including the Navarro-Frenk-White (NFW), the generalized NFW, and the Einasto density profiles. We derive concentrations and masses from these fits. We produce simulated Chandra observations of the halos, and we use them to identify objects resembling the X-ray morphologies and masses of the clusters in the CLASH X-ray-selected sample. We also derive a concentration-mass relation for strong-lensing clusters. We find that the sample of simulated halos that resembles the X-ray morphology of the CLASH clusters is composed mainly of relaxed halos, but it also contains a significant fraction of unrelaxed systems. For such a heterogeneous sample we measure an average two-dimensional concentration that is ~11% higher than is found for the full sample of simulated halos. After accounting for projection and selection effects, the average NFW concentrations of CLASH clusters are expected to be intermediate between those predicted in three dimensions for relaxed and super-relaxed halos. Matching the simulations to the individual CLASH clusters on the basis of the X-ray morphology, we expect that the NFW concentrations recovered from the lensing analysis of the CLASH clusters are in the range [3-6], with an average value of 3.87 and a standard deviation of 0.61.
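    Concentrations such as those quoted above are typically obtained by fitting an NFW profile to a halo's density profile and taking c = r200/rs. The sketch below, with an invented halo radius, normalization and noise level, illustrates only the fitting step via scipy.optimize.curve_fit; it is not the MUSIC-2/CLASH pipeline:

```python
import numpy as np
from scipy.optimize import curve_fit

def nfw_log_density(r, log_rho_s, r_s):
    # log10 of the NFW profile rho(r) = rho_s / ((r/r_s) * (1 + r/r_s)**2)
    x = r / r_s
    return log_rho_s - np.log10(x * (1.0 + x) ** 2)

r200 = 2.0          # assumed halo radius (Mpc) for this toy example
true_c = 4.0        # concentration used to generate the mock profile
true_rs = r200 / true_c
r = np.logspace(-2, np.log10(r200), 40)
rng = np.random.default_rng(1)
log_rho = nfw_log_density(r, 6.5, true_rs) + rng.normal(0, 0.02, r.size)

# Fit the two NFW parameters and recover the concentration from r_s.
popt, _ = curve_fit(nfw_log_density, r, log_rho, p0=(6.0, 0.3))
c_fit = r200 / popt[1]
print(f"recovered concentration c = {c_fit:.2f}")  # close to the input c = 4
```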

  14. The music of clash: predictions on the concentration-mass relation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meneghetti, M.; Rasia, E.; Vega, J.

    We present an analysis of the MUSIC-2 N-body/hydrodynamical simulations aimed at estimating the expected concentration-mass relation for the CLASH (Cluster Lensing and Supernova Survey with Hubble) cluster sample. We study nearly 1,400 halos simulated at high spatial and mass resolution. We study the shape of both their density and surface-density profiles and fit them with a variety of radial functions, including the Navarro-Frenk-White (NFW), the generalized NFW, and the Einasto density profiles. We derive concentrations and masses from these fits. We produce simulated Chandra observations of the halos, and we use them to identify objects resembling the X-ray morphologies and masses of the clusters in the CLASH X-ray-selected sample. We also derive a concentration-mass relation for strong-lensing clusters. We find that the sample of simulated halos that resembles the X-ray morphology of the CLASH clusters is composed mainly of relaxed halos, but it also contains a significant fraction of unrelaxed systems. For such a heterogeneous sample we measure an average two-dimensional concentration that is ∼11% higher than is found for the full sample of simulated halos. After accounting for projection and selection effects, the average NFW concentrations of CLASH clusters are expected to be intermediate between those predicted in three dimensions for relaxed and super-relaxed halos. Matching the simulations to the individual CLASH clusters on the basis of the X-ray morphology, we expect that the NFW concentrations recovered from the lensing analysis of the CLASH clusters are in the range [3-6], with an average value of 3.87 and a standard deviation of 0.61.

  15. Denaturing gradient gel electrophoresis profiles of bacteria from the saliva of twenty four different individuals form clusters that showed no relationship to the yeasts present.

    PubMed

    Weerasekera, Manjula M.; Sissons, Chris H.; Wong, Lisa; Anderson, Sally A.; Holmes, Ann R.; Cannon, Richard D.

    2017-10-01

    The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeasts present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean ± SD = 29.3 ± 4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive, with concentrations up to 10^3 CFU/mL. Candida albicans was the predominant species in saliva samples, although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Differences in soil biological activity by terrain types at the sub-field scale in central Iowa US

    DOE PAGES

    Kaleita, Amy L.; Schott, Linda R.; Hargreaves, Sarah K.; ...

    2017-07-07

    Soil microbial communities are structured by biogeochemical processes that occur at many different spatial scales, which makes soil sampling difficult. Because soil microbial communities are important in nutrient cycling and soil fertility, it is important to understand how microbial communities function within the heterogeneous soil landscape. In this study, a self-organizing map was used to determine whether landscape data can be used to characterize the distribution of microbial biomass and activity in order to provide an improved understanding of soil microbial community function. Points within a row crop field in south-central Iowa were clustered via a self-organizing map using six landscape properties into three separate landscape clusters. Twelve sampling locations per cluster were chosen for a total of 36 locations. After the soil samples were collected, the samples were then analysed for various metabolic indicators, such as nitrogen and carbon mineralization, extractable organic carbon, microbial biomass, etc. It was found that sampling locations located in the potholes and toe slope positions had significantly greater microbial biomass nitrogen and carbon, total carbon, total nitrogen and extractable organic carbon than the other two landscape position clusters, while locations located on the upslope did not differ significantly from the other landscape clusters. However, factors such as nitrate, ammonia, and nitrogen and carbon mineralization did not differ significantly across the landscape. Altogether, this research demonstrates the effectiveness of a terrain-based clustering method for guiding soil sampling of microbial communities.
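    The terrain-clustering step can be illustrated with a minimal one-dimensional self-organizing map written in NumPy. The landscape-property table, blob means and training schedule below are all invented; only the shape of the problem (six terrain properties, three clusters) comes from the abstract:

```python
import numpy as np

def train_som(X, grid=(1, 3), iters=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Minimal 1-D self-organizing map: returns one weight vector per node."""
    rng = np.random.default_rng(seed)
    n_nodes = grid[0] * grid[1]
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], float)
    W = X[rng.choice(len(X), n_nodes, replace=False)].astype(float)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        frac = t / iters
        lr = lr0 * (1 - frac)                         # decaying learning rate
        sigma = sigma0 * (1 - frac) + 1e-3            # shrinking neighbourhood
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2 * sigma ** 2))            # neighbourhood function
        W += lr * h[:, None] * (x - W)
    return W

# Hypothetical table of six landscape properties (elevation, slope, ...) at 150 points.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(m, 0.3, (50, 6)) for m in (-2.0, 0.0, 2.0)])
W = train_som(X, grid=(1, 3))
clusters = np.argmin(((X[:, None, :] - W[None]) ** 2).sum(-1), axis=1)
print(np.bincount(clusters))  # points per landscape cluster
```

    Each field point is then assigned to its best-matching node, and sampling locations are drawn per cluster, as the study describes.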

  17. Differences in soil biological activity by terrain types at the sub-field scale in central Iowa US

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaleita, Amy L.; Schott, Linda R.; Hargreaves, Sarah K.

    Soil microbial communities are structured by biogeochemical processes that occur at many different spatial scales, which makes soil sampling difficult. Because soil microbial communities are important in nutrient cycling and soil fertility, it is important to understand how microbial communities function within the heterogeneous soil landscape. In this study, a self-organizing map was used to determine whether landscape data can be used to characterize the distribution of microbial biomass and activity in order to provide an improved understanding of soil microbial community function. Points within a row crop field in south-central Iowa were clustered via a self-organizing map using six landscape properties into three separate landscape clusters. Twelve sampling locations per cluster were chosen for a total of 36 locations. After the soil samples were collected, the samples were then analysed for various metabolic indicators, such as nitrogen and carbon mineralization, extractable organic carbon, microbial biomass, etc. It was found that sampling locations located in the potholes and toe slope positions had significantly greater microbial biomass nitrogen and carbon, total carbon, total nitrogen and extractable organic carbon than the other two landscape position clusters, while locations located on the upslope did not differ significantly from the other landscape clusters. However, factors such as nitrate, ammonia, and nitrogen and carbon mineralization did not differ significantly across the landscape. Altogether, this research demonstrates the effectiveness of a terrain-based clustering method for guiding soil sampling of microbial communities.

  18. Dark Energy Survey Year 1 Results: galaxy mock catalogues for BAO

    NASA Astrophysics Data System (ADS)

    Avila, S.; Crocce, M.; Ross, A. J.; García-Bellido, J.; Percival, W. J.; Banik, N.; Camacho, H.; Kokron, N.; Chan, K. C.; Andrade-Oliveira, F.; Gomes, R.; Gomes, D.; Lima, M.; Rosenfeld, R.; Salvador, A. I.; Friedrich, O.; Abdalla, F. B.; Annis, J.; Benoit-Lévy, A.; Bertin, E.; Brooks, D.; Carrasco Kind, M.; Carretero, J.; Castander, F. J.; Cunha, C. E.; da Costa, L. N.; Davis, C.; De Vicente, J.; Doel, P.; Fosalba, P.; Frieman, J.; Gerdes, D. W.; Gruen, D.; Gruendl, R. A.; Gutierrez, G.; Hartley, W. G.; Hollowood, D.; Honscheid, K.; James, D. J.; Kuehn, K.; Kuropatkin, N.; Miquel, R.; Plazas, A. A.; Sanchez, E.; Scarpine, V.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Smith, M.; Sobreira, F.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thomas, D.; Walker, A. R.; Dark Energy Survey Collaboration

    2018-05-01

    Mock catalogues are a crucial tool in the analysis of galaxy survey data, both for the accurate computation of covariance matrices and for the optimisation of analysis methodology and validation of data sets. In this paper, we present a set of 1800 galaxy mock catalogues designed to match the Dark Energy Survey Year-1 BAO sample (Crocce et al. 2017) in abundance, observational volume, redshift distribution and uncertainty, and redshift-dependent clustering. The simulated samples were built upon HALOGEN (Avila et al. 2015) halo catalogues, based on a 2LPT density field with an empirical halo bias. For each of them, a lightcone is constructed by the superposition of snapshots in the redshift range 0.45 < z < 1.4. Uncertainties introduced by photometric redshift estimators were modelled with a double-skewed-Gaussian curve fitted to the data. We populate halos with galaxies by introducing a hybrid Halo Occupation Distribution - Halo Abundance Matching model with two free parameters. These are adjusted to achieve a galaxy bias evolution b(zph) that matches the data at the 1-σ level in the range 0.6 < zph < 1.0. We further analyse the galaxy mock catalogues and compare their clustering to the data using the angular correlation function w(θ), the comoving transverse-separation clustering ξ_{μ<0.8}(s⊥) and the angular power spectrum Cℓ, finding them in agreement. This is the first large set of three-dimensional {ra, dec, z} galaxy mock catalogues able to reproduce, simultaneously and accurately, the photometric redshift uncertainties and the galaxy clustering.

  19. mcrA-Targeted Real-Time Quantitative PCR Method To Examine Methanogen Communities

    PubMed Central

    Steinberg, Lisa M.; Regan, John M.

    2009-01-01

    Methanogens are of great importance in carbon cycling and alternative energy production, but quantitation with culture-based methods is time-consuming and biased against methanogen groups that are difficult to cultivate in a laboratory. For these reasons, methanogens are typically studied through culture-independent molecular techniques. We developed a SYBR green I quantitative PCR (qPCR) assay to quantify total numbers of methyl coenzyme M reductase α-subunit (mcrA) genes. TaqMan probes were also designed to target nine different phylogenetic groups of methanogens in qPCR assays. Total mcrA and mcrA levels of different methanogen phylogenetic groups were determined from six samples: four samples from anaerobic digesters used to treat either primarily cow or pig manure and two aliquots from an acidic peat sample stored at 4°C or 20°C. Only members of the Methanosaetaceae, Methanosarcina, Methanobacteriaceae, and Methanocorpusculaceae and Fen cluster were detected in the environmental samples. The three samples obtained from cow manure digesters were dominated by members of the genus Methanosarcina, whereas the sample from the pig manure digester contained detectable levels of only members of the Methanobacteriaceae. The acidic peat samples were dominated by both Methanosarcina spp. and members of the Fen cluster. In two of the manure digester samples only one methanogen group was detected, but in both of the acidic peat samples and two of the manure digester samples, multiple methanogen groups were detected. The TaqMan qPCR assays were successfully able to determine the environmental abundance of different phylogenetic groups of methanogens, including several groups with few or no cultivated members. PMID:19447957
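    Absolute quantification in a SYBR green or TaqMan qPCR assay such as the one above usually proceeds through a standard curve of Ct versus log10 copy number. The Ct values below are invented for illustration; the slope and amplification efficiency are the standard diagnostic quantities:

```python
import numpy as np

# Hypothetical qPCR standard curve: Ct values measured for known mcrA copy numbers.
log_copies = np.array([3, 4, 5, 6, 7], dtype=float)   # log10 gene copies
ct = np.array([30.1, 26.8, 23.4, 20.1, 16.7])         # threshold cycles

# Linear fit Ct = m*log10(copies) + b; amplification efficiency E = 10**(-1/m) - 1.
m, b = np.polyfit(log_copies, ct, 1)
efficiency = 10 ** (-1.0 / m) - 1.0

def copies_from_ct(ct_sample):
    """Interpolate an unknown sample's copy number from its Ct."""
    return 10 ** ((ct_sample - b) / m)

print(f"slope = {m:.2f}, efficiency = {efficiency:.1%}")
print(f"sample at Ct 24.0: {copies_from_ct(24.0):.3g} copies")
```

    A slope near -3.32 (efficiency near 100%) indicates ideal doubling per cycle; group-specific assays like those in the abstract each get their own curve.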

  20. The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process

    NASA Astrophysics Data System (ADS)

    Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie

    2014-05-01

    Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting; it performs well in prediction and can provide explanations for its results. In business failure prediction (BFP), the number of failed enterprises is relatively small compared with the number of non-failed ones, yet the loss is huge when an enterprise fails. It is therefore necessary to develop methods, trained on imbalanced samples, that forecast well for this small proportion of failed enterprises while performing accurately on total accuracy. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both the minority and the majority in CBR. In CBCBR, case classes are first generated through hierarchical clustering of the stored experienced cases, and class centres are calculated by integrating the information of cases in the same clustered class. When predicting the label of a target case, its nearest clustered case class is first retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In an empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with classical CBR, a support vector machine, logistic regression and multivariate discriminant analysis. The results show that, compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples while generating high total accuracy. The proposed approach makes CBR useful in imbalanced forecasting.
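    The CBCBR retrieval pipeline described in the abstract (cluster the case base, route the target to its nearest class centre, then k-NN inside that class) can be sketched as follows; the toy case base, feature dimension and cluster count are invented, not taken from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cbcbr_predict(X, y, target, n_classes=3, k=3):
    """Sketch of clustering-based CBR: hierarchical clustering of the case
    base, nearest-class-centre routing, then k-NN inside that class."""
    labels = fcluster(linkage(X, method="ward"), t=n_classes, criterion="maxclust")
    centres = np.array([X[labels == c].mean(axis=0) for c in range(1, n_classes + 1)])
    nearest = 1 + np.argmin(((centres - target) ** 2).sum(axis=1))
    Xc, yc = X[labels == nearest], y[labels == nearest]
    idx = np.argsort(((Xc - target) ** 2).sum(axis=1))[:k]
    return int(np.bincount(yc[idx]).argmax())  # majority vote of nearest cases

rng = np.random.default_rng(0)
# Imbalanced toy case base: 40 non-failed (label 0) vs 10 failed (label 1) firms.
X = np.vstack([rng.normal(0, 1, (40, 4)), rng.normal(3, 1, (10, 4))])
y = np.array([0] * 40 + [1] * 10)
print(cbcbr_predict(X, y, target=np.full(4, 3.0)))  # → 1
```

    Restricting the k-NN search to the routed cluster is what keeps minority cases from being swamped by the majority during retrieval.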

  1. Epidemiology of multiple childhood traumatic events: child abuse, parental psychopathology, and other family-level stressors.

    PubMed

    Menard, C B; Bandeen-Roche, K J; Chilcoat, H D

    2004-11-01

    Multiple family-level childhood stressors are common and correlated. It is unknown whether clusters of commonly co-occurring stressors are identifiable. The study was designed to explore family-level stressor clustering in the general population, to estimate the prevalence of exposure classes, and to examine the correlation of sociodemographic characteristics with class prevalence. Data were collected from an epidemiological sample and analyzed using latent class regression. A six-class solution was identified. Classes were characterized by low risk (prevalence = 23%), universal high risk (7%), family conflict (11%), household substance problems (22%), non-nuclear family structure (24%), and parent's mental illness (13%). Class prevalence varied with race and welfare status, but not gender. Interventions for childhood stressors are person-focused; the analytic approach may uniquely inform resource allocation.

  2. Point process statistics in atom probe tomography.

    PubMed

    Philippe, T; Duguay, S; Grancher, G; Blavette, D

    2013-09-01

    We present a review of spatial point processes as statistical models that we have designed for the analysis and treatment of atom probe tomography (APT) data. As a major advantage, these methods do not require sampling. The mean distance to the nearest neighbour is an attractive approach for exhibiting a non-random atomic distribution. A χ² test based on distance distributions to the nearest neighbour has been developed to detect deviation from randomness. Best-fit methods based on the first nearest neighbour distance (1NN method) and the pair correlation function are presented and compared to assess the chemical composition of tiny clusters. Delaunay tessellation for cluster selection is also illustrated. These statistical tools have been applied to APT experiments on microelectronics materials. Copyright © 2012 Elsevier B.V. All rights reserved.
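    A nearest-neighbour randomness test of this kind can be sketched by comparing observed 1NN distances with the distribution expected for a spatially random point set. The point cloud, box size and bin count below are invented, and periodic boundaries stand in for edge corrections:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import chisquare

rng = np.random.default_rng(0)
n, box = 5000, 1.0
pts = rng.uniform(0, box, (n, 3))                   # random "atoms" in a unit volume
tree = cKDTree(pts, boxsize=box)                    # periodic box: no edge effects
d, _ = tree.query(pts, k=2)                         # k=2: first hit is the point itself
nn = d[:, 1]

lam = n / box ** 3                                  # intensity of the point process
cdf = lambda r: 1.0 - np.exp(-lam * 4.0 / 3.0 * np.pi * r ** 3)

# Bin observed 1NN distances and compare with the Poisson expectation via chi-square.
edges = np.quantile(nn, np.linspace(0, 1, 11))
obs, _ = np.histogram(nn, bins=edges)
exp = n * np.diff(cdf(edges))
stat, p = chisquare(obs, exp * obs.sum() / exp.sum())
print(f"chi2 = {stat:.1f}, p = {p:.2f}")  # a large p is consistent with randomness
```

    In an APT dataset, clustering of solute atoms shows up as an excess of short 1NN distances relative to this null distribution.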

  3. Design of an audio advertisement dataset

    NASA Astrophysics Data System (ADS)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    Since more and more advertisements swarm into radio broadcasts, it is necessary to establish an audio advertising dataset that can be used to analyse and classify advertisements. A method for establishing a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement sample is given in *.wav file format and annotated with a txt file containing its file name, sampling frequency, channel number, broadcasting time and class. The rationality of the advertisement classification in this dataset is demonstrated by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for related audio advertisement studies.

  4. Differentials in colostrum feeding among lactating women of block RS Pura of J and K: A lesson for nursing practice.

    PubMed

    Raina, Sunil Kumar; Mengi, Vijay; Singh, Gurdeep

    2012-07-01

    Breast feeding is universally and traditionally practised in India. Experts advocate breast feeding as the best method of feeding young infants. The aim was to assess the role of various factors in determining colostrum feeding in block R. S. Pura of district Jammu. A stratified two-stage design was used, with villages as the primary sampling unit and lactating mothers as the secondary sampling unit. Villages were divided into different clusters on the basis of population, and sampling units were selected by a simple random technique. Breastfeeding is almost universal in R. S. Pura. Differentials in discarding the first milk were not found to be important among various socioeconomic groups, and the phenomenon appeared more general than specific.
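    A two-stage cluster design of this kind (villages as primary sampling units, mothers as secondary units) can be sketched in a few lines; the frame, village names and sample sizes below are all invented for illustration:

```python
import random

random.seed(7)
# Hypothetical sampling frame: villages (primary units) with lists of lactating
# mothers (secondary units); names and sizes are made up for illustration.
frame = {f"village_{v}": [f"mother_{v}_{m}" for m in range(random.randint(20, 60))]
         for v in range(30)}

def two_stage_sample(frame, n_villages=6, n_per_village=10):
    """Stage 1: simple random sample of villages; stage 2: simple random
    sample of mothers within each selected village."""
    villages = random.sample(sorted(frame), n_villages)
    return {v: random.sample(frame[v], min(n_per_village, len(frame[v])))
            for v in villages}

sample = two_stage_sample(frame)
print(sum(len(m) for m in sample.values()))  # → 60 respondents in total
```

    A production design would select villages with probability proportional to size rather than equal probability, but the two-stage structure is the same.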

  5. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    NASA Astrophysics Data System (ADS)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Zhou, Xianwang; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and two-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed two groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of the cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  6. The kinematics of dense clusters of galaxies. II - The distribution of velocity dispersions

    NASA Technical Reports Server (NTRS)

    Zabludoff, Ann I.; Geller, Margaret J.; Huchra, John P.; Ramella, Massimo

    1993-01-01

    From the survey of 31 Abell (R ≥ 1) cluster fields within z = 0.02-0.05, we extract 25 dense clusters with velocity dispersions σ above 300 km/s and with number densities exceeding the mean for the Great Wall of galaxies by one standard deviation. From the CfA Redshift Survey (in preparation), we obtain an approximately volume-limited catalog of 31 groups with velocity dispersions above 100 km/s and with the same number density limit. We combine these well-defined samples to obtain the distribution of cluster velocity dispersions. The group sample enables us to correct for incompleteness in the Abell catalog at low velocity dispersions. The clusters from the Abell cluster fields populate the high-dispersion tail. For systems with velocity dispersions above 700 km/s, approximately the median for R = 1 clusters, the group and cluster abundances are consistent. The combined distribution is consistent with cluster X-ray temperature functions.

  7. The cluster-cluster correlation function. [of galaxies

    NASA Technical Reports Server (NTRS)

    Postman, M.; Geller, M. J.; Huchra, J. P.

    1986-01-01

    The clustering properties of the Abell and Zwicky cluster catalogs are studied using the two-point angular and spatial correlation functions. The catalogs are divided into eight subsamples to determine the dependence of the correlation function on distance, richness, and the method of cluster identification. It is found that the Corona Borealis supercluster contributes significant power to the spatial correlation function of the Abell cluster sample with distance class of four or less. The distance-limited catalog of 152 Abell clusters, which is not greatly affected by a single system, has a spatial correlation function consistent with the power law ξ(r) = 300 r^(-1.8). In both the distance class four or less and the distance-limited samples, the signal in the spatial correlation function is a power law detectable out to 60/h Mpc. The amplitude of ξ(r) for clusters of richness class two is about three times that for richness class one clusters. The two-point spatial correlation function is sensitive to the use of estimated redshifts.
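    Rewriting the quoted power law as ξ(r) = (r/r0)^(-1.8) gives the implied correlation length r0 = 300^(1/1.8), assuming r is in the same h^-1 Mpc units as the rest of the abstract:

```python
# The amplitude 300 in xi(r) = 300 * r**(-1.8) corresponds to a correlation
# length r0 (the scale where xi = 1) of 300**(1/1.8), in h^-1 Mpc assuming
# r carries those units.
r0 = 300 ** (1 / 1.8)
print(f"r0 = {r0:.1f} h^-1 Mpc")  # → r0 = 23.8 h^-1 Mpc
```

    This is several times the correlation length of galaxies, the well-known enhancement of cluster clustering over galaxy clustering.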

  8. Edge Principal Components and Squash Clustering: Using the Special Structure of Phylogenetic Placement Data for Sample Comparison

    PubMed Central

    Matsen IV, Frederick A.; Evans, Steven N.

    2013-01-01

    Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415

  9. Uranium hydrogeochemical and stream sediment reconnaissance of the Albuquerque NTMS Quadrangle, New Mexico, including concentrations of forty-three additional elements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maassen, L.W.; Bolivar, S.L.

    1979-06-01

    The Los Alamos Scientific Laboratory conducted a hydrogeochemical and stream sediment reconnaissance for uranium. Totals of 408 water and 1538 sediment samples were collected from 1802 locations over a 20 100-km² area at an average density of one location per 11 km². Water samples were collected from springs, wells, and streams; sediment samples were collected predominantly from streams, but also from springs. All water samples were analyzed for uranium and 12 other elements. Sediment samples were analyzed for uranium and 42 additional elements. The uranium concentrations in water samples range from below the detection limit of 0.02 ppb to 194.06 ppb. The mean uranium concentration for all water types containing < 40 ppb uranium is 1.98 ppb. Six samples contained uranium concentrations > 40.00 ppb. Well waters have the highest mean uranium concentration; spring waters have the lowest. Clusters of water samples that contain anomalous uranium concentrations are delineated in nine areas. Sediments collected from the quadrangle have uranium concentrations that range between 0.63 ppm and 28.52 ppm, with a mean for all sediments of 3.53 ppm. Eight areas containing clusters of sediments with anomalous uranium concentrations are delineated. One cluster contains sample locations within the Ambrosia Lake uranium district. Five clusters of sediment samples with anomalous uranium concentrations were collected from streams that drain the Jemez volcanic field. Another cluster defines an area just northeast of Albuquerque where streams drain Precambrian rocks, predominantly granites, of the Sandia Mountains. The last cluster, consisting of spring sediments from Mesa Portales, was collected near the contact of the Tertiary Ojo Alamo sandstone with underlying Cretaceous sediments. Sediments from these springs exhibit some of the highest uranium values reported and are associated with high uranium/thorium ratios.

  10. Motivational Cluster Profiles of Adolescent Athletes: An Examination of Differences in Physical-Self Perception

    PubMed Central

    Çağlar, Emine; Aşçı, F. Hülya

    2010-01-01

    The primary purpose of the present study was to identify motivational profiles of adolescent athletes using cluster analysis in a non-Western culture. A second purpose was to examine relationships between physical self-perception differences of adolescent athletes and motivational profiles. One hundred and thirty-six male (Mage = 17.46, SD = 1.25 years) and 80 female adolescent athletes (Mage = 17.61, SD = 1.19 years) from a variety of team sports, including basketball, soccer, volleyball, and handball, volunteered to participate in this study. The Sport Motivation Scale (SMS) and Physical Self-Perception Profile (PSPP) were administered to all participants. Hierarchical cluster analysis revealed a four-cluster solution for this sample: amotivated, low motivated, moderately motivated, and highly motivated. A 4 × 5 (Cluster × PSPP Subscales) MANOVA revealed no significant main effect of motivational clusters on physical self-perception levels (p > 0.05). As a result, findings of the present study showed that the motivational types of the adolescent athletes constituted four different motivational clusters. Highly and moderately motivated athletes consistently scored higher than amotivated athletes on the perceived sport competence, physical condition, and physical self-worth subscales of the PSPP. This study identified motivational profiles of competitive youth-sport participants. Key points: Highly motivated athletes tend to perceive themselves as competent in psychomotor domains compared with amotivated athletes. As athletes feel more competent in the psychomotor domain, they are more intrinsically motivated. Information about the motivational profiles of adolescent athletes could be used to develop strategies and interventions designed to improve the strength and quality of sport participants' motivation. PMID:24149690

  11. Topological side-chain classification of beta-turns: ideal motifs for peptidomimetic development.

    PubMed

    Tran, Tran Trung; McKie, Jim; Meutermans, Wim D F; Bourne, Gregory T; Andrews, Peter R; Smythe, Mark L

    2005-08-01

    Beta-turns are important topological motifs for biological recognition of proteins and peptides. Organic molecules that sample the side-chain positions of beta-turns have shown broad binding capacity to multiple different receptors, for example benzodiazepines. Beta-turns have traditionally been classified into various types based on the backbone dihedral angles (φ2, ψ2, φ3 and ψ3). Indeed, 57-68% of beta-turns are currently classified into eight backbone families (Type I, Type II, Type I', Type II', Type VIII, Type VIa1, Type VIa2 and Type VIb), with Type IV representing unclassified beta-turns. Although this classification of beta-turns has been useful, the resulting beta-turn types are not ideal for the design of beta-turn mimetics, as they do not reflect the topological features of the recognition elements, the side chains. To overcome this, we have extracted beta-turns from a data set of non-homologous, high-resolution protein crystal structures. The side-chain positions, as defined by Cα-Cβ vectors, of these turns have been clustered using the kth nearest neighbor clustering and filtered nearest centroid sorting algorithms. Nine clusters were obtained that cluster 90% of the data, and the average intra-cluster RMSD of the four Cα-Cβ vectors is 0.36. The nine clusters therefore represent the topology of the side-chain scaffold architecture of the vast majority of beta-turns. The mean structures of the nine clusters are useful for the development of beta-turn mimetics and as biological descriptors for focusing combinatorial chemistry towards biologically relevant topological space.
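    The intra-cluster spread quoted above is an RMSD over the four Cα-Cβ vectors of a turn. Computing that quantity between two matched vector sets is straightforward; the vectors below are random stand-ins, not crystal-structure data:

```python
import numpy as np

def vector_rmsd(a, b):
    """RMSD between matched sets of C-alpha -> C-beta vectors (one row each)."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

rng = np.random.default_rng(3)
turn_a = rng.normal(size=(4, 3))                      # four side-chain vectors
turn_b = turn_a + rng.normal(scale=0.2, size=(4, 3))  # a similar beta-turn
print(round(vector_rmsd(turn_a, turn_b), 2))
```

    Clustering then proceeds on this pairwise RMSD, which compares side-chain topology directly rather than backbone dihedrals.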

  12. Analysis of β-Subgroup Proteobacterial Ammonia Oxidizer Populations in Soil by Denaturing Gradient Gel Electrophoresis Analysis and Hierarchical Phylogenetic Probing

    PubMed Central

    Stephen, John R.; Kowalchuk, George A.; Bruns, Mary-Ann V.; McCaig, Allison E.; Phillips, Carol J.; Embley, T. Martin; Prosser, James I.

    1998-01-01

    A combination of denaturing gradient gel electrophoresis (DGGE) and oligonucleotide probing was used to investigate the influence of soil pH on the compositions of natural populations of autotrophic β-subgroup proteobacterial ammonia oxidizers. PCR primers specific to this group were used to amplify 16S ribosomal DNA (rDNA) from soils maintained for 36 years at a range of pH values, and PCR products were analyzed by DGGE. Genus- and cluster-specific probes were designed to bind to sequences within the region amplified by these primers. A sequence specific to all β-subgroup ammonia oxidizers could not be identified, but probes specific for Nitrosospira clusters 1 to 4 and Nitrosomonas clusters 6 and 7 (J. R. Stephen, A. E. McCaig, Z. Smith, J. I. Prosser, and T. M. Embley, Appl. Environ. Microbiol. 62:4147–4154, 1996) were designed. Elution profiles of probes against target sequences and closely related nontarget sequences indicated a requirement for high-stringency hybridization conditions to distinguish between different clusters. DGGE banding patterns suggested the presence of Nitrosomonas cluster 6a and Nitrosospira clusters 2, 3, and 4 in all soil plots, but results were ambiguous because of overlapping banding patterns. Unambiguous band identification of the same clusters was achieved by combined DGGE and probing of blots with the cluster-specific radiolabelled probes. The relative intensities of hybridization signals provided information on the apparent selection of different Nitrosospira genotypes in samples of soil of different pHs. The signal from the Nitrosospira cluster 3 probe decreased significantly, relative to an internal control probe, with decreasing soil pH in the range of 6.6 to 3.9, while Nitrosospira cluster 2 hybridization signals increased with increasing soil acidity. 
Signals from Nitrosospira cluster 4 were greatest at pH 5.5, decreasing at lower and higher values, while Nitrosomonas cluster 6a signals did not vary significantly with pH. These findings are in agreement with a previous molecular study (J. R. Stephen, A. E. McCaig, Z. Smith, J. I. Prosser, and T. M. Embley, Appl. Environ. Microbiol. 62:4147–4154, 1996) of the same sites, which demonstrated the presence of the same four clusters of ammonia oxidizers and indicated that selection might be occurring for clusters 2 and 3 at acid and neutral pHs, respectively. The two studies used different sets of PCR primers for amplification of 16S rDNA sequences from soil, and the similar findings suggest that PCR bias was unlikely to be a significant factor. The present study demonstrates the value of DGGE and probing for rapid analysis of natural soil communities of β-subgroup proteobacterial ammonia oxidizers, indicates significant pH-associated differences in Nitrosospira populations, and suggests that Nitrosospira cluster 2 may be of significance for ammonia-oxidizing activity in acid soils. PMID:9687457

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Laskin, Julia; Johnson, Grant E.; Prabhakaran, Venkateshkumar

Immobilization of complex molecules and clusters on supports plays an important role in a variety of disciplines including materials science, catalysis and biochemistry. In particular, deposition of clusters on surfaces has attracted considerable attention due to their non-scalable, highly size-dependent properties. The ability to precisely control the composition and morphology of clusters and small nanoparticles on surfaces is crucial for the development of next generation materials with rationally tailored properties. Soft- and reactive landing of ions onto solid or liquid surfaces introduces unprecedented selectivity into surface modification by completely eliminating the effect of solvent and sample contamination on the quality of the film. The ability to select the mass-to-charge ratio of the precursor ion, its kinetic energy and charge state along with precise control of the size, shape and position of the ion beam on the deposition target makes soft-landing an attractive approach for surface modification. High-purity uniform thin films on surfaces generated using mass-selected ion deposition facilitate understanding of critical interfacial phenomena relevant to catalysis, energy generation and storage, and materials science. Our efforts have been directed toward understanding charge retention by soft-landed metal and metal-oxide cluster ions, which may affect both their structure and reactivity. Specifically, we have examined the effect of the surface on charge retention by both positively and negatively charged cluster ions. We found that the electronic properties of the surface play an important role in charge retention by cluster cations. Meanwhile, the electron binding energy is a key factor determining charge retention by cluster anions. These findings provide the scientific foundation for the rational design of interfaces for advanced catalysts and energy storage devices.
Further optimization of electrode-electrolyte interfaces for applications in energy storage and electrocatalysis may be achieved by understanding and controlling the properties of soft-landed cluster ions.

  14. Dyspnea descriptors developed in Brazil: application in obese patients and in patients with cardiorespiratory diseases.

    PubMed

    Teixeira, Christiane Aires; Rodrigues Júnior, Antonio Luiz; Straccia, Luciana Cristina; Vianna, Elcio Dos Santos Oliveira; Silva, Geruza Alves da; Martinez, José Antônio Baddini

    2011-01-01

    To develop a set of descriptive terms applied to the sensation of dyspnea (dyspnea descriptors) for use in Brazil and to investigate the usefulness of these descriptors in four distinct clinical conditions that can be accompanied by dyspnea. We collected 111 dyspnea descriptors from 67 patients and 10 health professionals. These descriptors were analyzed and reduced to 15 based on their frequency of use, similarity of meaning, and potential pathophysiological value. Those 15 descriptors were applied in 50 asthma patients, 50 COPD patients, 30 patients with heart failure, and 50 patients with class II or III obesity. The three best descriptors, as selected by the patients, were studied by cluster analysis. Potential associations between the identified clusters and the four clinical conditions were also investigated. The use of this set of descriptors led to a solution with seven clusters, designated sufoco (suffocating), aperto (tight), rápido (rapid), fadiga (fatigue), abafado (stuffy), trabalho/inspiração (work/inhalation), and falta de ar (shortness of breath). Overlapping of descriptors was quite common among the patients, regardless of their clinical condition. Asthma was significantly associated with the sufoco and trabalho/inspiração clusters, whereas COPD and heart failure were associated with the sufoco, trabalho/inspiração, and falta de ar clusters. Obesity was associated only with the falta de ar cluster. In Brazil, patients who are accustomed to perceiving dyspnea employ various descriptors in order to describe the symptom, and these descriptors can be grouped into similar clusters. In our study sample, such clusters showed no usefulness in differentiating among the four clinical conditions evaluated.

  15. Inference from clustering with application to gene-expression microarrays.

    PubMed

    Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M

    2002-01-01

    There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. 
Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
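The evaluation loop described above (model each class as a mean plus independent noise, generate sample points, cluster them, and count the points clustered incorrectly) can be sketched in a few lines. This is a hypothetical 1-D illustration with a minimal k=2 means routine, not the toolbox's actual code:

```python
import random

def two_means(points, iters=20):
    """Minimal 1-D k-means with k=2: returns a 0/1 label for each point."""
    c0, c1 = min(points), max(points)
    for _ in range(iters):
        labels = [0 if abs(p - c0) <= abs(p - c1) else 1 for p in points]
        g0 = [p for p, l in zip(points, labels) if l == 0]
        g1 = [p for p, l in zip(points, labels) if l == 1]
        if g0: c0 = sum(g0) / len(g0)
        if g1: c1 = sum(g1) / len(g1)
    return labels

random.seed(0)
# Two generating "processes": mean 0 and mean 5, plus independent Gaussian noise.
truth = [0] * 50 + [1] * 50
points = [random.gauss(0, 1) for _ in range(50)] + [random.gauss(5, 1) for _ in range(50)]
labels = two_means(points)
# Clustering error: points assigned to the wrong process, up to label permutation.
err = min(sum(t != l for t, l in zip(truth, labels)),
          sum(t != 1 - l for t, l in zip(truth, labels)))
print(err)  # small when the process means are well separated
```

Increasing the process variance, or moving the means closer together, raises the error count, which is the sensitivity the toolbox measures across algorithms and replication levels.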

  16. The evolution in the stellar mass of brightest cluster galaxies over the past 10 billion years

    NASA Astrophysics Data System (ADS)

    Bellstedt, Sabine; Lidman, Chris; Muzzin, Adam; Franx, Marijn; Guatelli, Susanna; Hill, Allison R.; Hoekstra, Henk; Kurinsky, Noah; Labbe, Ivo; Marchesini, Danilo; Marsan, Z. Cemile; Safavi-Naeini, Mitra; Sifón, Cristóbal; Stefanon, Mauro; van de Sande, Jesse; van Dokkum, Pieter; Weigel, Catherine

    2016-08-01

Using a sample of 98 galaxy clusters recently imaged in the near-infrared with the European Southern Observatory (ESO) New Technology Telescope, the WIYN telescope and the William Herschel Telescope, supplemented with 33 clusters from the ESO archive, we measure how the stellar mass of the most massive galaxies in the universe, namely brightest cluster galaxies (BCGs), increases with time. Most of the BCGs in this new sample lie in the redshift range 0.2 < z < 0.6, which has been noted in recent works to mark an epoch over which the growth in the stellar mass of BCGs stalls. From this sample of 132 clusters, we create a subsample of 102 systems that includes only those clusters that have estimates of the cluster mass. We combine the BCGs in this subsample with BCGs from the literature, and find that the growth in stellar mass of BCGs from 10 billion years ago to the present epoch is broadly consistent with recent semi-analytic and semi-empirical models. As in other recent studies, tentative evidence indicates that the stellar mass growth rate of BCGs may have slowed over the past 3.5 billion years. Further work in collecting larger samples, and in better comparing observations with theory using mock images, is required if a more detailed comparison between the models and the data is to be made.

  17. Positive outcome expectancy mediates the relationship between social influence and Internet addiction among senior high-school students.

    PubMed

    Lin, Min-Pei; Wu, Jo Yung-Wei; Chen, Chao-Jui; You, Jianing

    2018-06-28

Background and aims: Based on Bandura's social cognitive theory and the theory of triadic influence (TTI), this study was designed to examine the mediating role of positive outcome expectancy of Internet use in the relationship between social influence and Internet addiction (IA) in a large representative sample of senior high-school students in Taiwan. Methods: Using a cross-sectional design, 1,922 participants were recruited from senior high schools throughout Taiwan using both stratified and cluster sampling, and a comprehensive survey was administered. Results: Structural equation modeling and bootstrap analyses showed that IA severity was significantly and positively predicted by social influence, and that this effect was fully mediated through positive outcome expectancy of Internet use. Discussion and conclusions: The results not only support Bandura's social cognitive theory and the TTI framework, but can also serve as a reference to help educational agencies and mental health organizations design programs and create policies that will help prevent IA among adolescents.

  18. VizieR Online Data Catalog: Star clusters distances and extinctions. II. (Buckner+, 2014)

    NASA Astrophysics Data System (ADS)

    Buckner, A. S. M.; Froebrich, D.

    2015-04-01

    Until now, it has been impossible to observationally measure how star cluster scaleheight evolves beyond 1Gyr as only small samples have been available. Here, we establish a novel method to determine the scaleheight of a cluster sample using modelled distributions and Kolmogorov-Smirnov tests. This allows us to determine the scaleheight with a 25% accuracy for samples of 38 clusters or more. We apply our method to investigate the temporal evolution of cluster scaleheight, using homogeneously selected sub-samples of Kharchenko et al. (MWSC, 2012, Cat. J/A+A/543/A156, 2013, J/A+A/558/A53 ), Dias et al. (DAML02, 2002A&A...389..871D, Cat. B/ocl), WEBDA, and Froebrich et al. (FSR, 2007MNRAS.374..399F, Cat. J/MNRAS/374/399). We identify a linear relationship between scaleheight and log(age/yr) of clusters, considerably different from field stars. The scaleheight increases from about 40pc at 1Myr to 75pc at 1Gyr, most likely due to internal evolution and external scattering events. After 1Gyr, there is a marked change of the behaviour, with the scaleheight linearly increasing with log(age/yr) to about 550pc at 3.5Gyr. The most likely interpretation is that the surviving clusters are only observable because they have been scattered away from the mid-plane in their past. A detailed understanding of this observational evidence can only be achieved with numerical simulations of the evolution of cluster samples in the Galactic disc. Furthermore, we find a weak trend of an age-independent increase in scaleheight with Galactocentric distance. There are no significant temporal or spatial variations of the cluster distribution zero-point. We determine the Sun's vertical displacement from the Galactic plane as Z⊙=18.5+/-1.2pc. (1 data file).

  19. Massive and refined: A sample of large galaxy clusters simulated at high resolution. I: Thermal gas and properties of shock waves

    NASA Astrophysics Data System (ADS)

    Vazza, F.; Brunetti, G.; Gheller, C.; Brunino, R.

    2010-11-01

We present a sample of 20 massive galaxy clusters with total virial masses in the range 6 × 10^14 M⊙ ≤ Mvir ≤ 2 × 10^15 M⊙, re-simulated with a customized version of the ENZO 1.5 code employing adaptive mesh refinement. This technique allowed us to obtain unprecedentedly high spatial resolution (≈25 kpc/h) out to a distance of ~3 virial radii from the cluster centers, and makes it possible to focus with the same level of detail on the physical properties of the innermost and outermost cluster regions, providing new clues on the role of shock waves and turbulent motions in the ICM across a wide range of scales. In this paper, a first exploratory study of this data set is presented. We report on the thermal properties of galaxy clusters at z = 0. Integrated and morphological properties of the gas density, gas temperature, gas entropy and baryon fraction distributions are discussed and compared with existing results from both the observational and the numerical literature. Our cluster sample shows overall good consistency with results obtained using other numerical techniques (e.g. Smoothed Particle Hydrodynamics), yet it provides a more accurate representation of the accretion patterns far outside the cluster cores. We also reconstruct the properties of shock waves within the sample by means of a velocity-based approach, and we study Mach numbers and energy distributions for the various dynamical states of clusters, giving estimates for the injection of cosmic ray particles at shocks. The present sample is rather unique in the panorama of cosmological simulations of massive galaxy clusters, due to its dynamical range, statistics of objects and number of time outputs. For this reason, we deploy a public repository of the available data, accessible via web portal at http://data.cineca.it.

  20. Technique for fast and efficient hierarchical clustering

    DOEpatents

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.
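The steps in the abstract (reduce the variables, build a nearest neighbor matrix, merge samples into a hierarchy) correspond to ordinary agglomerative clustering. A naive single-linkage sketch on 1-D data, as a generic illustration of the idea rather than the patented technique (the compression step is assumed to have already reduced each sample to one variable):

```python
def agglomerate(samples):
    """Naive agglomerative clustering: repeatedly merge the nearest pair of
    clusters (single linkage) and record the merge order as the hierarchy."""
    clusters = [[i] for i in range(len(samples))]

    def dist(a, b):  # single-linkage distance between two clusters of indices
        return min(abs(samples[i] - samples[j]) for i in a for j in b)

    merges = []
    while len(clusters) > 1:
        # Nearest-neighbor search over all remaining cluster pairs.
        pairs = [(dist(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Toy data already "compressed" to one variable per sample.
merges = agglomerate([0.0, 0.1, 5.0, 5.2, 9.0])
print(merges[0])  # the closest pair merges first: ([0], [1], 0.1)
```

The recorded merge order is what a dendrogram rendering would display to illustrate similarities and differences between the samples.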

  1. Cluster Analysis of the Yale Global Tic Severity Scale (YGTSS): Symptom Dimensions and Clinical Correlates in an Outpatient Youth Sample

    ERIC Educational Resources Information Center

    Kircanski, Katharina; Woods, Douglas W.; Chang, Susanna W.; Ricketts, Emily J.; Piacentini, John C.

    2010-01-01

    Tic disorders are heterogeneous, with symptoms varying widely both within and across patients. Exploration of symptom clusters may aid in the identification of symptom dimensions of empirical and treatment import. This article presents the results of two studies investigating tic symptom clusters using a sample of 99 youth (M age = 10.7, 81% male,…

  2. Relations between the Woodcock-Johnson III Clinical Clusters and Measures of Executive Functions from the Delis-Kaplan Executive Function System

    ERIC Educational Resources Information Center

    Floyd, Randy G.; McCormack, Allison C.; Ingram, Elizabeth L.; Davis, Amy E.; Bergeron, Renee; Hamilton, Gloria

    2006-01-01

    This study examined the convergent relations between scores from four clinical clusters from the Woodcock-Johnson III Tests of Cognitive Abilities (WJ III) and measures of executive functions using a sample of school-aged children and a sample of adults. The WJ III clinical clusters included the Working Memory, Cognitive Fluency, Broad Attention,…

  3. Identification of population substructure among Jews using STR markers and dependence on reference populations included.

    PubMed

    Listman, Jennifer B; Hasin, Deborah; Kranzler, Henry R; Malison, Robert T; Mutirangura, Apiwat; Sughondhabirom, Atapol; Aharonovich, Efrat; Spivak, Baruch; Gelernter, Joel

    2010-06-14

    Detecting population substructure is a critical issue for association studies of health behaviors and other traits. Whether inherent in the population or an artifact of marker choice, determining aspects of a population's genetic history as potential sources of substructure can aid in design of future genetic studies. Jewish populations, among which association studies are often conducted, have a known history of migrations. As a necessary step in understanding population structure to conduct valid association studies of health behaviors among Israeli Jews, we investigated genetic signatures of this history and quantified substructure to facilitate future investigations of these phenotypes in this population. Using 32 autosomal STR markers and the program STRUCTURE, we differentiated between Ashkenazi (AJ, N = 135) and non-Ashkenazi (NAJ, N = 226) Jewish populations in the form of Northern and Southern geographic genetic components (AJ north 73%, south 23%, NAJ north 33%, south 60%). The ability to detect substructure within these closely related populations using a small STR panel was contingent on including additional samples representing major continental populations in the analyses. Although clustering programs such as STRUCTURE are designed to assign proportions of ancestry to individuals without reference population information, when Jewish samples were analyzed in the absence of proxy parental populations, substructure within Jews was not detected. Generally, for samples with a given grandparental country of birth, STRUCTURE assignment values to Northern, Southern, African and Asian clusters agreed with mitochondrial DNA and Y-chromosomal data from previous studies as well as historical records of migration and intermarriage.

  4. ASA-FTL: An adaptive separation aware flash translation layer for solid state drives

    DOE PAGES

    Xie, Wei; Chen, Yong; Roth, Philip C

    2016-11-03

Here, the flash-memory-based Solid State Drive (SSD) presents a promising storage solution for increasingly critical data-intensive applications due to its low latency (high throughput), high bandwidth, and low power consumption. Within an SSD, the Flash Translation Layer (FTL) is responsible for exposing the SSD's flash memory storage to the computer system as a simple block device. The FTL design is one of the dominant factors determining an SSD's lifespan and performance. To reduce the garbage collection overhead and deliver better performance, we propose a new, low-cost, adaptive separation-aware flash translation layer (ASA-FTL) that combines sampling, data clustering and selective caching of recency information to accurately identify and separate hot/cold data while incurring minimal overhead. We use sampling for light-weight identification of separation criteria, and our dedicated selective caching mechanism is designed to save the limited RAM resource in contemporary SSDs. Using simulations of ASA-FTL with both real-world and synthetic workloads, we have shown that our proposed approach reduces the garbage collection overhead by up to 28% and the overall response time by 15% compared to one of the most advanced existing FTLs. We find that data clustering using a small sample size provides a significant performance benefit while incurring only a very small computation and memory cost. In addition, our evaluation shows that ASA-FTL is able to adapt to changes in the access pattern of workloads, which is a major advantage compared with existing fixed data separation methods.
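A sampling-based separation criterion of the kind described (estimate a hot/cold threshold from a small random sample of per-page update counts instead of scanning all metadata) might look like the following sketch. The function, the threshold choice (sample median) and the data are all hypothetical illustrations, not ASA-FTL's actual policy:

```python
import random

def separation_threshold(update_counts, sample_size=64, seed=1):
    """Estimate a hot/cold separation criterion from a small random sample of
    per-page update counts (here: the sample median), so the FTL never has to
    scan the full metadata set."""
    rng = random.Random(seed)
    sample = rng.sample(update_counts, min(sample_size, len(update_counts)))
    sample.sort()
    return sample[len(sample) // 2]

counts = [1] * 900 + [50] * 100   # mostly cold pages, a few hot ones
t = separation_threshold(counts)
hot = [c for c in counts if c > t]
print(t, len(hot))  # the hot pages separate cleanly from the cold mass
```

Because only a small sample is inspected, the computation and memory cost stays low, which is the trade-off the evaluation above quantifies.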

  5. Identification of population substructure among Jews using STR markers and dependence on reference populations included

    PubMed Central

    2010-01-01

    Background Detecting population substructure is a critical issue for association studies of health behaviors and other traits. Whether inherent in the population or an artifact of marker choice, determining aspects of a population's genetic history as potential sources of substructure can aid in design of future genetic studies. Jewish populations, among which association studies are often conducted, have a known history of migrations. As a necessary step in understanding population structure to conduct valid association studies of health behaviors among Israeli Jews, we investigated genetic signatures of this history and quantified substructure to facilitate future investigations of these phenotypes in this population. Results Using 32 autosomal STR markers and the program STRUCTURE, we differentiated between Ashkenazi (AJ, N = 135) and non-Ashkenazi (NAJ, N = 226) Jewish populations in the form of Northern and Southern geographic genetic components (AJ north 73%, south 23%, NAJ north 33%, south 60%). The ability to detect substructure within these closely related populations using a small STR panel was contingent on including additional samples representing major continental populations in the analyses. Conclusions Although clustering programs such as STRUCTURE are designed to assign proportions of ancestry to individuals without reference population information, when Jewish samples were analyzed in the absence of proxy parental populations, substructure within Jews was not detected. Generally, for samples with a given grandparental country of birth, STRUCTURE assignment values to Northern, Southern, African and Asian clusters agreed with mitochondrial DNA and Y-chromosomal data from previous studies as well as historical records of migration and intermarriage. PMID:20546593

  6. Early Results from Swift AGN and Cluster Survey

    NASA Astrophysics Data System (ADS)

    Dai, Xinyu; Griffin, Rhiannon; Nugent, Jenna; Kochanek, Christopher S.; Bregman, Joel N.

    2016-04-01

    The Swift AGN and Cluster Survey (SACS) uses 125 deg^2 of Swift X-ray Telescope serendipitous fields with variable depths surrounding gamma-ray bursts to provide a medium depth (4 × 10^-15 erg cm^-2 s^-1) and area survey filling the gap between deep, narrow Chandra/XMM-Newton surveys and wide, shallow ROSAT surveys. Here, we present the first two papers in a series of publications for SACS. In the first paper, we introduce our method and catalog of 22,563 point sources and 442 extended sources. SACS provides excellent constraints on the AGN and cluster number counts at the bright end with negligible uncertainties due to cosmic variance, and these constraints are consistent with previous measurements. The depth and areal coverage of SACS is well suited for galaxy cluster surveys outside the local universe, reaching z > 1 for massive clusters. In the second paper, we use SDSS DR8 data to study the 203 extended SACS sources that are located within the SDSS footprint. We search for galaxy over-densities in 3-D space using SDSS galaxies and their photometric redshifts near the Swift galaxy cluster candidates. We find 103 Swift clusters with a > 3σ over-density. The remaining targets are potentially located at higher redshifts and require deeper optical follow-up observations for confirmations as galaxy clusters. We present a series of cluster properties including the redshift, BCG magnitude, BCG-to-X-ray center offset, optical richness, X-ray luminosity and red sequences. We compare the observed redshift distribution of the sample with a theoretical model, and find that our sample is complete for z ≤ 0.3 and 80% complete for z ≤ 0.4, consistent with the survey depth of SDSS. These analysis results suggest that our Swift cluster selection algorithm presented in our first paper has yielded a statistically well-defined cluster sample for further studying cluster evolution and cosmology. 
Finally, we will discuss our ongoing optical identification of the z > 0.5 cluster sample, using MDM, KPNO, CTIO, and Magellan data, and discuss SACS as a pilot for eROSITA deep surveys.

  7. Do X-ray dark or underluminous galaxy clusters exist?

    NASA Astrophysics Data System (ADS)

    Andreon, S.; Moretti, A.

    2011-12-01

We study the X-ray properties of a color-selected sample of clusters at 0.1 < z < 0.3, to quantify the true abundance of the population of X-ray dark or underluminous clusters and, at the same time, the spurious-detection contamination level of color-selected cluster catalogs. Starting from a local sample of color-selected clusters, we restrict our attention to those with sufficiently deep X-ray observations to probe their X-ray luminosity down to very faint values without introducing any X-ray bias. This allowed us to build an X-ray-unbiased sample of 33 clusters to measure the LX-richness relation. Swift 1.4 Ms X-ray observations show that at least 89% of the color-detected clusters are real objects with a potential well deep enough to heat and retain an intracluster medium. The percentage rises to 94% when one includes the single spectroscopically confirmed color-selected cluster whose X-ray emission is not secured. Looking at our results from the opposite perspective, the percentage of X-ray dark clusters among color-selected clusters is very low: at most about 11 per cent (at 90% confidence). Supplementing our data with those from the literature, we conclude that X-ray- and color-selected cluster surveys sample the same population, and consequently that in this regard clusters selected with either method can safely be used for cosmological purposes. This is an essential and promising piece of information for upcoming surveys in both the optical/IR (DES, EUCLID) and X-ray (eROSITA). Richness correlates with X-ray luminosity with a large scatter, 0.51 ± 0.08 (0.44 ± 0.07) dex in log LX at a given richness, when LX is measured in a 500 (1070) kpc aperture. We release data and software to estimate the X-ray flux, or its upper limit, of a source with over-Poisson background fluctuations (found in this work to be ~20% on cluster angular scales) and to fit X-ray luminosity vs richness in the presence of intrinsic scatter.
These Bayesian applications rigorously account for boundaries (e.g., the X-ray luminosity and the richness cannot be negative).

  8. Cluster-sample surveys and lot quality assurance sampling to evaluate yellow fever immunisation coverage following a national campaign, Bolivia, 2007.

    PubMed

    Pezzoli, Lorenzo; Pineda, Silvia; Halkyer, Percy; Crespo, Gladys; Andrews, Nick; Ronveaux, Olivier

    2009-03-01

    To estimate the yellow fever (YF) vaccine coverage for the endemic and non-endemic areas of Bolivia and to determine whether selected districts had acceptable levels of coverage (>70%). We conducted two surveys of 600 individuals (25 x 12 clusters) to estimate coverage in the endemic and non-endemic areas. We assessed 11 districts using lot quality assurance sampling (LQAS). The lot (district) sample was 35 individuals with six as decision value (alpha error 6% if true coverage 70%; beta error 6% if true coverage 90%). To increase feasibility, we divided the lots into five clusters of seven individuals; to investigate the effect of clustering, we calculated alpha and beta by conducting simulations where each cluster's true coverage was sampled from a normal distribution with a mean of 70% or 90% and standard deviations of 5% or 10%. Estimated coverage was 84.3% (95% CI: 78.9-89.7) in endemic areas, 86.8% (82.5-91.0) in non-endemic and 86.0% (82.8-89.1) nationally. LQAS showed that four lots had unacceptable coverage levels. In six lots, results were inconsistent with the estimated administrative coverage. The simulations suggested that the effect of clustering the lots is unlikely to have significantly increased the risk of making incorrect accept/reject decisions. Estimated YF coverage was high. Discrepancies between administrative coverage and LQAS results may be due to incorrect population data. Even allowing for clustering in LQAS, the statistical errors would remain low. Catch-up campaigns are recommended in districts with unacceptable coverage.
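The simulation the authors describe (each of the five clusters' true coverage drawn from a normal distribution around 70% or 90%, then checking how often the decision value of 6 misclassifies the lot) can be reproduced in miniature. This is a hedged sketch: the accept rule below (accept the lot when at most 6 of the 35 sampled individuals are unvaccinated) is the standard LQAS reading of the abstract, and the function and parameter names are made up:

```python
import random

def acceptance_rate(mean_cov, sd, runs=10_000, seed=42):
    """Monte Carlo estimate of the clustered LQAS acceptance probability:
    5 clusters of 7, each cluster's true coverage ~ Normal(mean_cov, sd),
    lot accepted (judged >= 70% coverage) when <= 6 of 35 are unvaccinated."""
    rng = random.Random(seed)
    accept = 0
    for _ in range(runs):
        unvaccinated = 0
        for _cluster in range(5):
            p = min(1.0, max(0.0, rng.gauss(mean_cov, sd)))
            unvaccinated += sum(rng.random() > p for _ in range(7))
        if unvaccinated <= 6:
            accept += 1
    return accept / runs

# Alpha error: accepting a lot whose true mean coverage is only 70%.
alpha = acceptance_rate(0.70, 0.05)
# Beta error: rejecting a lot whose true mean coverage is 90%.
beta = 1 - acceptance_rate(0.90, 0.05)
print(round(alpha, 3), round(beta, 3))  # both stay in the mid-single-digit percent range
```

Comparing these rates with the unclustered binomial case (sd = 0) shows why the authors concluded that clustering the lot samples is unlikely to significantly increase the risk of incorrect accept/reject decisions.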

  9. Implementation of a Care Pathway for Primary Palliative Care in 5 research clusters in Belgium: quasi-experimental study protocol and innovations in data collection (pro-SPINOZA).

    PubMed

    Leysen, Bert; Van den Eynden, Bart; Gielen, Birgit; Bastiaens, Hilde; Wens, Johan

    2015-09-28

Starting with early identification of palliative care patients by general practitioners (GPs), the Care Pathway for Primary Palliative Care (CPPPC) is believed to help primary health care workers deliver patient- and family-centered care in the last year of life. The care pathway has been pilot-tested, and will now be implemented in 5 Belgian regions: 2 Dutch-speaking regions, 2 French-speaking regions and the bilingual capital region of Brussels. The overall aim of the CPPPC is to provide better quality of primary palliative care, and ultimately to reduce the hospital death rate. The aim of this article is to describe the quantitative design and innovative data collection strategy used in the evaluation of this complex intervention. A quasi-experimental stepped wedge cluster design is set up, with the 5 regions being 5 non-randomized clusters. The primary outcome is a reduced hospital death rate per GP's patient population. Secondary outcomes are increased death at home and health care consumption patterns suggesting high-quality palliative care. Per research cluster, GPs will be recruited via convenience sampling. These GPs, volunteering to be involved, will recruit people with reduced life expectancy and their informal caregivers. Health care consumption data in the last year of life, available for all deceased people who lived in the research clusters during the study period, will be used for comparison between the patient populations of participating GPs and those of non-participating GPs. Description of the baseline characteristics of participating GPs and patients, and monitoring of the level of involvement of GPs, patients and informal caregivers, will happen through regular, privacy-secured web-surveys. Web-survey data and health consumption data are linked in a secure way, respecting Belgian privacy laws. To evaluate this complex intervention, a quasi-experimental stepped wedge cluster design has been set up.
Context characteristics and involvement level of participants are important parameters in evaluating complex interventions. It is possible to securely link survey data with health consumption data. By appealing to IT solutions we hope to be able to partly reduce respondent burden, a known problem in palliative care research. ClinicalTrials.gov Identifier: NCT02266069.
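A stepped wedge design like the one described can be visualized as a rollout matrix with one row per cluster, in which each cluster crosses from control to intervention at a different step. The sketch below assumes one crossover per period and an all-control first period; the actual CPPPC rollout schedule is not given in the abstract.

```python
def stepped_wedge(n_clusters, n_steps):
    """Return a 0/1 rollout matrix: row i is cluster i, column t is a
    time period; entry 1 means the cluster is in the intervention arm."""
    # Period 0 is all-control; one cluster crosses over per step.
    return [[1 if t > i else 0 for t in range(n_steps)]
            for i in range(n_clusters)]

# Five non-randomized clusters (regions), six measurement periods
for row in stepped_wedge(5, 6):
    print(row)
```

Every cluster contributes both control and intervention periods, which is what lets the design separate intervention effects from secular time trends.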

  10. Knowledge and beliefs about tuberculosis among non-working women in Ravensmead, Cape Town.

    PubMed

    Metcalf, C A; Bradshaw, D; Stindt, W W

    1990-04-21

    The results of a community-based survey on knowledge and beliefs about tuberculosis in non-working women are presented. The women in the sample showed a very good knowledge of the important aspects of tuberculosis: 90% were aware that it is a problem in their area; 97% knew that it affects the chest; 94% said that it could be fatal; 85% considered it to be infectious and 88% knew that the local clinic provided treatment. Their knowledge of symptoms was good overall but the study revealed misconceptions about the causes and transmission of tuberculosis; 16% indicated that they would not be keen to associate with people with tuberculosis owing to fear of infection. The design effect of cluster sampling was considered in the analysis. The highest design effects (i.e. the most clustering of responses) were found for responses to questions on the causes of tuberculosis and places where treatment could be obtained, possibly reflecting that these beliefs are influenced by neighbourhood contacts. Future tuberculosis education in this group needs to build on existing knowledge and awareness and should focus on changing attitudes such as misconceptions about transmission and the stigmatisation of the disease. Health workers face the challenge of changing behaviour in this community to ensure that people with symptoms present early for screening and that people diagnosed as having tuberculosis comply with treatment.
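The design effect mentioned above compares the variance of an estimate under cluster sampling with the variance under simple random sampling of the same size. A minimal sketch, assuming equal-sized clusters and invented counts (not the Ravensmead data):

```python
# Empirical design effect (DEFF) for a proportion from a cluster sample.
def design_effect(cluster_yes, m):
    """cluster_yes: list of 'yes' counts per cluster, each cluster of size m."""
    k = len(cluster_yes)
    n = k * m
    p = sum(cluster_yes) / n                      # overall proportion
    var_srs = p * (1 - p) / n                     # variance under SRS
    # Variance treating the k cluster proportions as independent replicates
    props = [y / m for y in cluster_yes]
    mean = sum(props) / k
    var_cluster = sum((q - mean) ** 2 for q in props) / (k * (k - 1))
    return var_cluster / var_srs

# Strongly clustered responses (whole clusters agreeing) inflate DEFF well
# above 1; identical cluster proportions give no clustering penalty.
print(design_effect([10, 0, 10, 0], m=10))
print(design_effect([5, 5, 5, 5], m=10))
```

High DEFF values, as reported for the cause-of-TB questions, mean the effective sample size is much smaller than the nominal one.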

  11. Effects of a worksite tobacco control intervention in India: the Mumbai worksite tobacco control study, a cluster-randomised trial.

    PubMed

    Sorensen, Glorian; Pednekar, Mangesh; Cordeira, Laura Shulman; Pawar, Pratibha; Nagler, Eve M; Stoddard, Anne M; Kim, Hae-Young; Gupta, Prakash C

    2017-03-01

    We assessed a worksite intervention designed to promote tobacco control among workers in the manufacturing sector in Greater Mumbai, India. We used a cluster-randomised design to test an integrated health promotion/health protection intervention, the Healthy, Safe, and Tobacco-free Worksites programme. Between July 2012 and July 2013, we recruited 20 worksites on a rolling basis and randomly assigned them to intervention or delayed-intervention control conditions. The follow-up survey was conducted between December 2013 and November 2014. The difference in 30-day quit rates between intervention and control conditions was statistically significant for production workers (OR=2.25, p=0.03), although not for the overall sample (OR=1.70; p=0.12). The intervention roughly doubled 6-month cessation rates in the intervention worksites relative to the control worksites, both for production workers (OR=2.29; p=0.07) and for the overall sample (OR=1.81; p=0.13), although these differences did not reach statistical significance. These findings demonstrate the potential impact of an intervention that combined tobacco control and health protection programming within Indian manufacturing worksites. NCT01841879.

  12. Structure of clusters and building blocks in amylopectin from African rice accessions.

    PubMed

    Gayin, Joseph; Abdel-Aal, El-Sayed M; Marcone, Massimo; Manful, John; Bertoft, Eric

    2016-09-05

    Enzymatic hydrolysis in combination with gel-permeation and anion-exchange chromatography was employed to characterise the composition of clusters and building blocks of amylopectin from two African rice (Oryza glaberrima) accessions, IRGC 103759 and TOG 12440. The samples were compared with one Asian rice (Oryza sativa) sample (cv WITA 4) and one O. sativa×O. glaberrima cross (NERICA 4). The average DP of clusters from the African rice accessions (ARAs) was marginally larger (DP=83) than in WITA 4 (DP=81). However, in terms of average number of chains, the ARAs possessed both the smallest and the largest clusters. Overall, the results suggested that the structure of clusters in TOG 12440 was dense, with short chains and a high degree of branching, whereas the situation was the opposite in NERICA 4. IRGC 103759 and WITA 4 possessed clusters with intermediate characteristics. The commonest type of building block in all samples was group 2 (singly branched dextrins), representing 40.3-49.4% of the blocks, while groups 3-6 were found in successively lower numbers. The average number of building blocks per cluster was significantly larger in NERICA 4 (5.8) and WITA 4 (5.7) than in IRGC 103759 and TOG 12440 (5.1 and 5.3, respectively).

  13. The evolution of active galactic nuclei in clusters of galaxies from the Dark Energy Survey

    DOE PAGES

    Bufanda, E.; Hollowood, D.; Jeltema, T. E.; ...

    2016-12-13

    The correlation between active galactic nuclei (AGN) and environment provides important clues to AGN fueling and the relationship of black hole growth to galaxy evolution. Here, we analyze the fraction of cluster galaxies hosting AGN as a function of redshift and cluster richness for X-ray detected AGN associated with clusters of galaxies in Dark Energy Survey (DES) Science Verification data. The present sample includes 33 AGN with L_X > 10^43 erg s^-1 in non-central host galaxies with luminosity greater than 0.5 L*, from a total sample of 432 clusters in the redshift range 0.1 < z < 0.7. Our result is in good agreement with previous work and parallels the increase in star formation in cluster galaxies over the same redshift range. However, the AGN fraction in clusters is observed to have no significant correlation with cluster mass. Future analyses with DES Year 1 through Year 3 data will be able to clarify whether AGN activity is correlated with cluster mass and will tightly constrain the relationship between cluster AGN populations and redshift.

  14. The use of hierarchical clustering for the design of optimized monitoring networks

    NASA Astrophysics Data System (ADS)

    Soares, Joana; Makar, Paul Andrew; Aklilu, Yayne; Akingunola, Ayodeji

    2018-05-01

    Associativity analysis is a powerful tool to deal with large-scale datasets by clustering the data on the basis of (dis)similarity and can be used to assess the efficacy and design of air quality monitoring networks. We describe here our use of Kolmogorov-Zurbenko filtering and hierarchical clustering of NO2 and SO2 passive and continuous monitoring data to analyse and optimize air quality networks for these species in the province of Alberta, Canada. The methodology applied in this study assesses dissimilarity between monitoring station time series based on two metrics: 1 - R, R being the Pearson correlation coefficient, and the Euclidean distance; we find that both should be used in evaluating monitoring site similarity. We have combined the analytic power of hierarchical clustering with the spatial information provided by deterministic air quality model results, using the gridded time series of model output as potential station locations, as a proxy for assessing monitoring network design and for network optimization. We demonstrate that clustering results depend on the air contaminant analysed, reflecting the difference in the respective emission sources of SO2 and NO2 in the region under study. Our work shows that much of the signal identifying the sources of NO2 and SO2 emissions resides in shorter timescales (hourly to daily) due to short-term variation of concentrations and that longer-term averages in data collection may lose the information needed to identify local sources. However, the methodology identifies stations mainly influenced by seasonality, if larger timescales (weekly to monthly) are considered. We have performed the first dissimilarity analysis based on gridded air quality model output and have shown that the methodology is capable of generating maps of subregions within which a single station will represent the entire subregion, to a given level of dissimilarity. 
We have also shown that our approach is capable of identifying different sampling methodologies as well as outliers (stations' time series which are markedly different from all others in a given dataset).
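The clustering step described above can be sketched with SciPy: build a 1 - R dissimilarity matrix between station time series and cut the resulting hierarchical tree at a chosen dissimilarity level. The station data here are synthetic stand-ins, and the cut threshold of 0.5 is illustrative.

```python
# Hierarchical clustering of monitoring-station time series using the
# 1 - R dissimilarity (R = Pearson correlation). Synthetic example:
# stations 0 and 1 share an emission source; station 2 is unrelated.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
base = rng.normal(size=200)                       # shared source signal
stations = np.array([
    base + 0.1 * rng.normal(size=200),
    base + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])

# 1 - R dissimilarity matrix between station time series
dissim = 1.0 - np.corrcoef(stations)
np.fill_diagonal(dissim, 0.0)                     # guard against float noise

# Condense the square matrix, link with average linkage, cut the tree
Z = linkage(squareform(dissim, checks=False), method="average")
labels = fcluster(Z, t=0.5, criterion="distance")
print(labels)
```

A real application would also compute the Euclidean-distance dissimilarity, as the paper recommends using both metrics when judging station similarity.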

  15. Galaxy properties in clusters. II. Backsplash galaxies

    NASA Astrophysics Data System (ADS)

    Muriel, H.; Coenda, V.

    2014-04-01

    Aims: We explore the properties of galaxies on the outskirts of clusters and their dependence on recent dynamical history in order to understand the real impact that the cluster core has on the evolution of galaxies. Methods: We analyse the properties of more than 1000 galaxies brighter than M0.1r = - 19.6 on the outskirts of 90 clusters (1 < r/rvir < 2) in the redshift range 0.05 < z < 0.10. Using the line of sight velocity of galaxies relative to the cluster's mean, we selected low and high velocity subsamples. Theoretical predictions indicate that a significant fraction of the first subsample should be backsplash galaxies, that is, objects that have already orbited near the cluster centre. A significant proportion of the sample of high relative velocity (HV) galaxies seems to be composed of infalling objects. Results: Our results suggest that, at fixed stellar mass, late-type galaxies in the low-velocity (LV) sample are systematically older, redder, and have formed fewer stars during the last 3 Gyrs than galaxies in the HV sample. This result is consistent with models that assume that the central regions of clusters are effective in quenching the star formation by means of processes such as ram pressure stripping or strangulation. At fixed stellar mass, LV galaxies show some evidence of having higher surface brightness and smaller size than HV galaxies. These results are consistent with the scenario where galaxies that have orbited the central regions of clusters are more likely to suffer tidal effects, producing loss of mass as well as a re-distribution of matter towards more compact configurations. Finally, we found a higher fraction of ET galaxies in the LV sample, supporting the idea that the central region of clusters of galaxies may contribute to the transformation of morphological types towards earlier types.

  16. Role of Anions Associated with the Formation and Properties of Silver Clusters.

    PubMed

    Wang, Quan-Ming; Lin, Yu-Mei; Liu, Kuan-Guan

    2015-06-16

    Metal clusters have been very attractive due to their aesthetic structures and fascinating properties. Different from nanoparticles, each cluster in a macroscopic sample has a well-defined structure with identical composition, size, and shape. As the disadvantages of polydispersity are ruled out, informative structure-property relationships of metal clusters can be established. The formation of a high-nuclearity metal cluster involves the organization of metal ions into a complex entity in an ordered way. To achieve controllable preparation of metal clusters, it is helpful to introduce a directing agent in the formation process of a cluster. To this end, anion templates have been used to direct the formation of high-nuclearity clusters. In this Account, the role played by anions in the formation of a variety of silver clusters is reviewed. Silver ions are positively charged, so anionic species can be utilized to control the formation of silver clusters on the basis of electrostatic interactions, and the size and shape of the resulting clusters can be dictated by the templating anions. In addition, since the anion is an integral component of the silver clusters described, the physical properties of the clusters can be modulated by functional anions. The templating effects of simple inorganic anions and polyoxometalates are shown in silver alkynyl clusters and silver thiolate clusters. Intercluster compounds are also described with regard to the importance of anions in determining the packing of the ion pairs and in contributing to electronic communication between the positive and negative counterparts. The role of the anions is threefold: (a) an anion is advantageous in stabilizing a cluster via balancing local positive charges of the metal cations; (b) an anion template can help control the size and shape of a cluster product; (c) an anion can be a key factor in influencing the function of a cluster by bringing in its intrinsic properties.
Properties associated with anion-directed silver clusters, including electronic communication, luminescent thermochromism, single-molecule magnetism, and intercluster charge transfer, are discussed. We intend to draw chemists' attention to the role that anions can play in determining the structures and properties of metal complexes, especially clusters. We hope that this Account will stimulate further efforts to exploit new roles of anions in various metal cluster systems. Anions can do much more than serve as counterions for charge balance, and they should be considered in the design and synthesis of cluster-based functional materials.

  17. Pressure of the hot gas in simulations of galaxy clusters

    NASA Astrophysics Data System (ADS)

    Planelles, S.; Fabjan, D.; Borgani, S.; Murante, G.; Rasia, E.; Biffi, V.; Truong, N.; Ragone-Figueroa, C.; Granato, G. L.; Dolag, K.; Pierpaoli, E.; Beck, A. M.; Steinborn, Lisa K.; Gaspari, M.

    2017-06-01

    We analyse the radial pressure profiles, the intracluster medium (ICM) clumping factor and the Sunyaev-Zel'dovich (SZ) scaling relations of a sample of simulated galaxy clusters and groups identified in a set of hydrodynamical simulations based on an updated version of the TreePM-SPH GADGET-3 code. Three different sets of simulations are performed: the first assumes non-radiative physics; the others include, among other processes, active galactic nucleus (AGN) and/or stellar feedback. Our results are analysed as a function of redshift, ICM physics, cluster mass and cluster cool-coreness or dynamical state. In general, the mean pressure profiles obtained for our sample of groups and clusters show a good agreement with X-ray and SZ observations. Simulated cool-core (CC) and non-cool-core (NCC) clusters also show a good match with real data. We obtain in all cases a small (if any) redshift evolution of the pressure profiles of massive clusters, at least back to z = 1. We find that the clumpiness of gas density and pressure increases with the distance from the cluster centre and with the dynamical activity. The inclusion of AGN feedback in our simulations generates values for the gas clumping (√C_ρ ≈ 1.2 at R_200) in good agreement with recent observational estimates. The simulated Y_SZ-M scaling relations are in good accordance with several observed samples, especially for massive clusters. As for the scatter of these relations, we obtain a clear dependence on the cluster dynamical state, whereas this distinction is not so evident when looking at the subsamples of CC and NCC clusters.

  18. Clusters of Monoisotopic Elements for Calibration in (TOF) Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Kolářová, Lenka; Prokeš, Lubomír; Kučera, Lukáš; Hampl, Aleš; Peňa-Méndez, Eladia; Vaňhara, Petr; Havel, Josef

    2017-03-01

    Precise calibration in TOF MS requires suitable and reliable standards, which are not always available for high masses. We evaluated inorganic clusters of the monoisotopic elements gold and phosphorus (Au_n^+/Au_n^- and P_n^+/P_n^-) as an alternative to peptides or proteins for the external and internal calibration of mass spectra in various experimental and instrumental scenarios. Monoisotopic gold or phosphorus clusters can be easily generated in situ from suitable precursors by laser desorption/ionization (LDI) or matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Their use offers numerous advantages, including simplicity of preparation, biological inertness, and exact mass determination even at lower mass resolution. We used citrate-stabilized gold nanoparticles to generate gold calibration clusters, and red phosphorus powder to generate phosphorus clusters. Both elements can be added to samples to perform internal calibration up to mass-to-charge ratios (m/z) of 10,000-15,000 without significantly interfering with the analyte. We demonstrated the use of the gold and phosphorus clusters in the MS analysis of complex biological samples, including microbial standards and total extracts of mouse embryonic fibroblasts. We believe that clusters of monoisotopic elements could be used as generally applicable calibrants for complex biological samples.
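Because each Au_n or P_n cluster is built from a single isotope, its theoretical m/z is simply n times the atomic monoisotopic mass divided by the charge. A minimal sketch of such a calibration ladder (singly charged ions, electron mass neglected for simplicity):

```python
# Theoretical m/z values for monoisotopic Au_n and P_n calibration
# cluster ions. Electron mass is neglected in this simplified sketch.
AU = 196.966569   # monoisotopic mass of 197Au, u
P = 30.973762     # monoisotopic mass of 31P, u

def cluster_mz(atom_mass, n, charge=1):
    """m/z of an n-atom cluster ion carrying `charge` elementary charges."""
    return n * atom_mass / charge

# Gold calibration ladder reaching past m/z 9,000
ladder = [cluster_mz(AU, n) for n in range(1, 51)]
print(round(ladder[0], 3), "...", round(ladder[-1], 3))
```

Since every ladder point is an exact multiple of one atomic mass, the reference values carry no isotope-envelope ambiguity even at modest mass resolution.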

  19. Enumerative and binomial sequential sampling plans for the multicolored Asian lady beetle (Coleoptera: Coccinellidae) in wine grapes.

    PubMed

    Galvan, T L; Burkness, E C; Hutchison, W D

    2007-06-01

    To develop a practical integrated pest management (IPM) system for the multicolored Asian lady beetle, Harmonia axyridis (Pallas) (Coleoptera: Coccinellidae), in wine grapes, we assessed the spatial distribution of H. axyridis and developed eight sampling plans to estimate adult density or infestation level in grape clusters. We used 49 data sets collected from commercial vineyards in 2004 and 2005, in Minnesota and Wisconsin. Enumerative plans were developed using two precision levels (0.10 and 0.25); the six binomial plans reflected six unique action thresholds (3, 7, 12, 18, 22, and 31% of cluster samples infested with at least one H. axyridis). The spatial distribution of H. axyridis in wine grapes was aggregated, independent of cultivar and year, but it was more randomly distributed as mean density declined. The average sample number (ASN) for each sampling plan was determined using resampling software. For research purposes, an enumerative plan with a precision level of 0.10 (SE/X) resulted in a mean ASN of 546 clusters. For IPM applications, the enumerative plan with a precision level of 0.25 resulted in a mean ASN of 180 clusters. In contrast, the binomial plans resulted in much lower ASNs and provided high probabilities of arriving at correct "treat or no-treat" decisions, making these plans more efficient for IPM applications. For a tally threshold of one adult per cluster, the operating characteristic curves for the six action thresholds provided binomial sequential sampling plans with mean ASNs of only 19-26 clusters, and probabilities of making correct decisions between 83 and 96%. The benefits of the binomial sampling plans are discussed within the context of improving IPM programs for wine grapes.
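The operating characteristic of a binomial plan of this kind can be computed directly from the binomial distribution: for a given sample size and infestation cutoff, it gives the probability of a "treat" decision as a function of the true proportion of infested clusters. The sample size and cutoff below are illustrative, not the published plans.

```python
# Operating characteristic of a simple (fixed-n) binomial sampling plan
# for cluster infestation, tally threshold of one adult per cluster.
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def prob_treat(n, cutoff, p_infested):
    """Probability the plan signals 'treat': at least `cutoff` of the
    n sampled clusters are infested."""
    return 1.0 - binom_cdf(cutoff - 1, n, p_infested)

# With a hypothetical 12% action threshold, a correct decision means
# 'no treat' when the true infestation is below 0.12 and 'treat' above.
n, cutoff = 25, 4
for p in (0.03, 0.12, 0.30):
    print(p, round(prob_treat(n, cutoff, p), 3))
```

Plotting `prob_treat` against p traces the OC curve; sequential plans refine this idea by allowing a decision before the full n clusters are inspected, which is what drives their lower average sample numbers.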

  20. Prevalence of forced sex and associated factors among women and men in Kisumu, Kenya.

    PubMed

    Adudans, Maureen K; Montandon, Michele; Kwena, Zachary; Bukusi, Elizabeth A; Cohen, Craig R

    2011-12-01

    Sexual violence is a well-recognized global health problem, albeit with limited population-based data available from sub-Saharan Africa. We sought to measure the prevalence of forced sex in Kisumu, Kenya, and identify its associated factors. The data were drawn from a population-based cross-sectional survey. A two-stage sampling design was used: 40 clusters within Kisumu municipality were enumerated and households within each cluster selected by systematic random sampling. Demographic and sexual histories, including questions on forced sex, were collected privately using a structured questionnaire. The prevalence of forced sex was 13% (women) and 4.5% (men). After adjusting for age and cluster, forced sex among women was associated with transactional sex (OR 2.33; 95%CI 1.38-3.95), having more than two lifetime partners (OR 1.9; 95%CI 1.20-3.30), having postprimary education (OR 1.49; 95%CI 1.04-2.14) and a high economic status (OR 1.87; 95%CI 1.2-2.9). No factors were significantly associated with forced sex among the male respondents. Intimate partners were the most common perpetrators of forced sex among both women (50%) and men (62.1%). Forced sex prevention programs need to target the identified associated factors, and educate the public on the high rate of forced sex perpetrated by intimate partners.
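The second stage of a design like this, systematic random sampling of households within each selected cluster, can be sketched as follows; the cluster count matches the abstract, but household numbers and the per-cluster sample size are invented.

```python
# Two-stage sample sketch: enumerate clusters, then select households
# within each sampled cluster by systematic random sampling.
import random

def systematic_sample(n_units, n_wanted, rng):
    """Systematic random sample: random start, then a fixed interval."""
    interval = n_units / n_wanted
    start = rng.uniform(0, interval)
    return [int(start + i * interval) for i in range(n_wanted)]

rng = random.Random(42)
# 40 enumerated clusters with invented household counts
clusters = {c: rng.randint(80, 200) for c in range(40)}

sample = {cluster_id: systematic_sample(n_households, 10, rng)
          for cluster_id, n_households in clusters.items()}
print(sample[0])   # household indices selected in cluster 0
```

The random start plus fixed interval spreads the sample evenly across each cluster's household list, which is the usual reason systematic sampling is preferred to simple random draws in field enumeration.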

  1. Corrections for Cluster-Plot Slop

    Treesearch

    Harry T. Valentine; Mark J. Ducey; Jeffery H. Gove; Adrian Lanz; David L.R. Affleck

    2006-01-01

    Cluster-plot designs, including the design used by the Forest Inventory and Analysis program of the USDA Forest Service (FIA), are attended by a complicated boundary slopover problem. Slopover occurs where inclusion zones of objects of interest cross the boundary of the area of interest. The dispersed nature of inclusion zones that arise from the use of cluster plots...

  2. Under What Circumstances Does External Knowledge about the Correlation Structure Improve Power in Cluster Randomized Designs?

    ERIC Educational Resources Information Center

    Rhoads, Christopher

    2014-01-01

    Recent publications have drawn attention to the idea of utilizing prior information about the correlation structure to improve statistical power in cluster randomized experiments. Because power in cluster randomized designs is a function of many different parameters, it has been difficult for applied researchers to discern a simple rule explaining…

  3. Clustering analysis of proteins from microbial genomes at multiple levels of resolution.

    PubMed

    Zaslavsky, Leonid; Ciufo, Stacy; Fedorov, Boris; Tatusova, Tatiana

    2016-08-31

    Microbial genomes at the National Center for Biotechnology Information (NCBI) represent a large collection of more than 35,000 assemblies. There are several complexities associated with the data: a great variation in sampling density, since human pathogens are densely sampled while other bacteria are less represented; different protein families occur in annotations with different frequencies; and the quality of genome annotation varies greatly. In order to extract useful information from these complex data, the analysis needs to be performed at multiple levels of phylogenomic resolution and protein similarity, with an adequate sampling strategy. Protein clustering is used to construct meaningful and stable groups of similar proteins to be used for analysis and functional annotation. Our approach is to create protein clusters at three levels. First, tight clusters in groups of closely related genomes (species-level clades) are constructed using a combined approach that takes into account both sequence similarity and genome context. Second, clustroids of conservative in-clade clusters are organized into seed global clusters. Finally, global protein clusters are built around the seed clusters. We propose filtering strategies that allow us to limit the protein set included in global clustering. The in-clade clustering procedure, with subsequent selection of clustroids and organization into seed global clusters, provides a robust representation and a high rate of compression. Seed protein clusters are further extended by adding related proteins. Extended seed clusters include a significant part of the data and represent all major known cell machinery. The remaining part, coming from either non-conservative (unique) or rapidly evolving proteins, from rare genomes, or from low-quality annotation, does not group together well. Processing these proteins requires significant computational resources and results in a large number of questionable clusters.
The filtering strategies developed here make it possible to identify and exclude such peripheral proteins, limiting the protein dataset used in global clustering. Overall, the proposed methodology allows the relevant data to be obtained at different levels of detail and data redundancy to be eliminated, while keeping biologically interesting variation.

  4. Navigating complex sample analysis using national survey data.

    PubMed

    Saylor, Jennifer; Friedmann, Erika; Lee, Hyeon Joo

    2012-01-01

    The National Center for Health Statistics conducts the National Health and Nutrition Examination Survey and other national surveys with probability-based complex sample designs. The goal of national surveys is to provide valid data for the population of the United States. Analyses of data from population surveys present unique challenges in the research process but are valuable avenues to study the health of the United States population. The aim of this study was to demonstrate the importance of using complex data analysis techniques for data obtained with a complex multistage sampling design and to provide an example of analysis using the SPSS Complex Samples procedure. Challenges and solutions specific to secondary analysis of national databases are illustrated using the National Health and Nutrition Examination Survey as the exemplar. Oversampling of small or sensitive groups provides necessary estimates of variability within small groups. Use of weights without complex samples accurately estimates population means and frequencies from the sample after accounting for over- or undersampling of specific groups. Weighting alone, however, leads to inappropriate population estimates of variability, because they are computed as if the measures were from the entire population rather than from a sample. The SPSS Complex Samples procedure allows inclusion of all sampling design elements: stratification, clusters, and weights. Use of national data sets allows use of extensive, expensive, and well-documented survey data for exploratory questions but limits analysis to those variables included in the data set. The large sample permits examination of multiple predictors and interactive relationships. Merging data files, the availability of data in several waves of surveys, and complex sampling are techniques used to provide a representative sample but present unique challenges. With sophisticated data analysis techniques, the use of these data can be optimized.
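The point about weights versus full design information can be illustrated without survey software: the weighted mean is easy, but a defensible variance must account for clustering, for example via the between-PSU ("ultimate cluster") estimator sketched below. The data are invented; real NHANES analyses should use dedicated survey procedures such as SPSS Complex Samples.

```python
# Weighted mean with a between-cluster (ultimate cluster) variance
# estimate, contrasting with the naive variance that weighting alone gives.
def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def cluster_variance(data):
    """data: {cluster_id: [(value, weight), ...]}. Variance of the weighted
    mean estimated from variation between weighted cluster totals."""
    all_obs = [vw for obs in data.values() for vw in obs]
    W = sum(w for _, w in all_obs)
    mean = weighted_mean([v for v, _ in all_obs], [w for _, w in all_obs])
    # Per-cluster contributions to the estimating equation
    zs = [sum(w * (v - mean) for v, w in obs) for obs in data.values()]
    k = len(zs)
    zbar = sum(zs) / k
    return k / (k - 1) * sum((z - zbar) ** 2 for z in zs) / W**2

data = {
    "psu1": [(1.0, 2.0), (1.2, 2.0)],
    "psu2": [(3.0, 1.0), (2.8, 1.0)],
    "psu3": [(2.0, 1.5), (2.1, 1.5)],
}
print(cluster_variance(data))
```

When responses cluster within PSUs, this estimator is larger than the naive one, which is exactly the variability that weighting alone fails to capture.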

  5. Dynamical Competition of IC-Industry Clustering from Taiwan to China

    NASA Astrophysics Data System (ADS)

    Tsai, Bi-Huei; Tsai, Kuo-Hui

    2009-08-01

    Most studies employ qualitative approaches to explore industrial clusters; however, little research has objectively quantified the evolution of industry clustering. The purpose of this paper is to quantitatively analyze clustering among the IC design, IC manufacturing, and IC packaging and testing industries by using foreign direct investment (FDI) data. The Lotka-Volterra system equations are adopted here to capture the competition or cooperation among these three industries, thus explaining their clustering inclinations. The results indicate that the evolution of FDI into China by the IC design industry significantly inspired the subsequent FDI of the IC manufacturing and IC packaging and testing industries. Since the IC design industry lies in the upstream stage of IC production, the middle-stream IC manufacturing and downstream IC packaging and testing enterprises tend to cluster together with IC design firms in order to sustain a steady business. Finally, the Taiwan IC industry's FDI amount into China is predicted to increase cumulatively, which supports the industrial clustering tendency of the Taiwan IC industry. In particular, the FDI prediction of the Lotka-Volterra model is superior to that of the conventional Bass model when the forecast accuracy of the two models is compared. The prediction ability is dramatically improved once the industrial mutualism among the IC production stages is taken into account.
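A generic three-species Lotka-Volterra system of the kind referenced above can be integrated numerically with SciPy. All growth rates and interaction coefficients below are made up for illustration; the positive off-diagonal terms stand in for the stimulating (clustering) effect of the upstream IC design industry on the other two.

```python
# Three-species Lotka-Volterra sketch: x[0] ~ IC design, x[1] ~ IC
# manufacturing, x[2] ~ IC packaging and testing. Coefficients invented.
import numpy as np
from scipy.integrate import solve_ivp

r = np.array([0.5, 0.3, 0.3])                 # intrinsic growth rates
A = np.array([[-0.02,  0.00,  0.00],          # self-limitation on diagonal
              [ 0.01, -0.02,  0.00],          # design stimulates manufacturing
              [ 0.01,  0.00, -0.02]])         # design stimulates packaging

def lv(t, x):
    # dx_i/dt = x_i * (r_i + sum_j A_ij * x_j)
    return x * (r + A @ x)

sol = solve_ivp(lv, (0, 50), [1.0, 0.5, 0.5])
print(sol.y[:, -1])   # long-run levels under these assumed coefficients
```

With these coefficients the downstream species settle at higher equilibria than they would in isolation, which is the mutualism mechanism the paper credits for its improved forecasts over the Bass model.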

  6. Scientific Cluster Deployment and Recovery - Using puppet to simplify cluster management

    NASA Astrophysics Data System (ADS)

    Hendrix, Val; Benjamin, Doug; Yao, Yushu

    2012-12-01

    Deployment, maintenance and recovery of a scientific cluster, which has complex, specialized services, can be a time consuming task requiring the assistance of Linux system administrators, network engineers as well as domain experts. Universities and small institutions that have a part-time FTE with limited time for and knowledge of the administration of such clusters can be strained by such maintenance tasks. This current work is the result of an effort to maintain a data analysis cluster (DAC) with minimal effort by a local system administrator. The realized benefit is the scientist, who is the local system administrator, is able to focus on the data analysis instead of the intricacies of managing a cluster. Our work provides a cluster deployment and recovery process (CDRP) based on the puppet configuration engine allowing a part-time FTE to easily deploy and recover entire clusters with minimal effort. Puppet is a configuration management system (CMS) used widely in computing centers for the automatic management of resources. Domain experts use Puppet's declarative language to define reusable modules for service configuration and deployment. Our CDRP has three actors: domain experts, a cluster designer and a cluster manager. The domain experts first write the puppet modules for the cluster services. A cluster designer would then define a cluster. This includes the creation of cluster roles, mapping the services to those roles and determining the relationships between the services. Finally, a cluster manager would acquire the resources (machines, networking), enter the cluster input parameters (hostnames, IP addresses) and automatically generate deployment scripts used by puppet to configure it to act as a designated role. In the event of a machine failure, the originally generated deployment scripts along with puppet can be used to easily reconfigure a new machine. 
The cluster definition produced in our CDRP is an integral part of automating cluster deployment in a cloud environment. Our future cloud efforts will further build on this work.

  7. A terrain-based paired-site sampling design to assess biodiversity losses from eastern hemlock decline

    USGS Publications Warehouse

    Young, J.A.; Smith, D.R.; Snyder, C.D.; Lemarie, D.P.

    2002-01-01

    Biodiversity surveys are often hampered by the inability to control extraneous sources of variability introduced into comparisons of populations across a heterogenous landscape. If not specifically accounted for a priori, this noise can weaken comparisons between sites, and can make it difficult to draw inferences about specific ecological processes. We developed a terrain-based, paired-site sampling design to analyze differences in aquatic biodiversity between streams draining eastern hemlock (Tsuga canadensis) forests, and those draining mixed hardwood forests in Delaware Water Gap National Recreation Area (USA). The goal of this design was to minimize variance due to terrain influences on stream communities, while representing the range of hemlock dominated stream environments present in the park. We used geographic information systems (GIS) and cluster analysis to define and partition hemlock dominated streams into terrain types based on topographic variables and stream order. We computed similarity of forest stands within terrain types and used this information to pair hemlock-dominated streams with hardwood counterparts prior to sampling. We evaluated the effectiveness of the design through power analysis and found that power to detect differences in aquatic invertebrate taxa richness was highest when sites were paired and terrain type was included as a factor in the analysis. Precision of the estimated difference in mean richness was nearly doubled using the terrain-based, paired site design in comparison to other evaluated designs. Use of this method allowed us to sample stream communities representative of park-wide forest conditions while effectively controlling for landscape variability.
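The two steps of the design, clustering candidate sites into terrain types and then pairing hemlock sites with hardwood counterparts within a type, can be sketched as below. The terrain features, cluster count, and pairing rule (greedy nearest neighbour in standardized feature space) are illustrative assumptions, not the authors' exact GIS workflow.

```python
# Terrain-typed paired-site sketch: cluster sites on terrain variables,
# then pair hemlock (0) with hardwood (1) sites within each terrain type.
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(1)
# Three distinct terrain regimes (e.g. slope, elevation, stream order),
# six sites each: three hemlock and three hardwood per regime. Invented.
centers = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(6, 3)) for c in centers])
forest = np.tile([0, 0, 0, 1, 1, 1], 3)       # 0 = hemlock, 1 = hardwood

# Standardize variables, then cluster sites into terrain types
z = (features - features.mean(0)) / features.std(0)
_, terrain_type = kmeans2(z, k=3, minit="++", seed=2)

# Pair each hemlock site with its most similar unused hardwood site
pairs = []
for t in range(3):
    hemlock = np.where((terrain_type == t) & (forest == 0))[0]
    hardwood = list(np.where((terrain_type == t) & (forest == 1))[0])
    for h in hemlock:
        if not hardwood:
            break                              # no counterpart left in type
        d = [np.linalg.norm(z[h] - z[w]) for w in hardwood]
        pairs.append((int(h), int(hardwood.pop(int(np.argmin(d))))))

print(pairs)
```

Pairing within terrain types is what removes terrain-driven variance from the hemlock-versus-hardwood comparison, mirroring the power gain the authors report for the paired analysis.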

  8. [Clustering patterns of behavioral risk factors linked to chronic disease among young adults in two localities in Bogota, Colombia: importance of sex differences].

    PubMed

    Gómez Gutiérrez, Luis Fernando; Lucumí Cuesta, Diego Iván; Girón Vargas, Sandra Lorena; Espinosa García, Gladys

    2004-01-01

    The characterization of clustering patterns of behavioral risk factors can guide interventions aimed at preventing chronic diseases. This study determined the clustering patterns of selected behavioral risk factors in young adults aged 18 to 29 years and established the factors associated with having two or more of them. Patterns of clustering by gender were established for four behavioral risk factors (low consumption of fruits and vegetables, physical inactivity in leisure time, current tobacco consumption and acute alcohol consumption) among 1,465 young adult participants recruited through a multistage probabilistic sample. Regression models identified the sociodemographic variables associated with having two or more of these behavioral risk factors. Overall, 32.9% of participants had two of the risk factors and 17.7% had three or four. Acute alcohol consumption was the most frequent risk factor in the combined patterns among males, while physical inactivity during leisure time was the most frequent among females. Among females, having two or more behavioral risk factors was associated with being separated or divorced; among males, it was associated with work having been the main activity over the past 30 days. The combinations of behavioral risk factors studied and the factors associated with clustering show different patterns among males and females. These findings stress the need to design interventions sensitive to gender differences.

  9. Small Sample Performance of Bias-corrected Sandwich Estimators for Cluster-Randomized Trials with Binary Outcomes

    PubMed Central

    Li, Peng; Redden, David T.

    2014-01-01

    The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC) correction can keep the test size at nominal levels even when the number of clusters is as low as 10, and is robust to moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG) correction should be used instead. Furthermore, we derive a formula to calculate the power and the minimum total number of clusters needed using the t test and KC correction for CRTs with binary outcomes. The power levels predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that, with appropriate control of type I error rates under small sample sizes, the GEE approach is recommended in CRTs with binary outcomes because of its fewer assumptions and robustness to misspecification of the covariance structure. PMID:25345738

  10. The effect of work-based mentoring on patient outcome in musculoskeletal physiotherapy: study protocol for a randomised controlled trial.

    PubMed

    Williams, Aled L; Phillips, Ceri J; Watkins, Alan; Rushton, Alison B

    2014-10-25

    Despite persistent calls to measure the effectiveness of educational interventions on patient outcomes, few studies have been conducted. Within musculoskeletal physiotherapy, the effects of postgraduate clinical mentoring on physiotherapist performance have been assessed, but the impact of this mentoring on patient outcomes remains unknown. The objective of this trial is to assess the effectiveness of a work-based mentoring programme to facilitate physiotherapist clinical reasoning on patient outcomes in musculoskeletal physiotherapy. A stepped wedge cluster randomised controlled trial (CRCT) has been designed to recruit a minimum of 12 senior physiotherapists who work in musculoskeletal outpatient departments of a large National Health Service (NHS) organization. Participating physiotherapists will be randomised by cluster to receive the intervention at three time periods. Patients will be blinded to whether their physiotherapist has received the intervention. The primary outcome measure will be the Patient-Specific Functional Scale; secondary outcome measures will include the EQ-5D, patient activation, patient satisfaction and physiotherapist performance. Sample size considerations used published methods describing stepped wedge designs, conventional values of 0.80 for statistical power and 0.05 for statistical significance, and pragmatic groupings of 12 participating physiotherapists in three clusters. Based on an intergroup difference of 1.0 on the PSFS with a standard deviation of 2.0, 10 patients are required to complete outcome measures per physiotherapist, at time period 1 (prior to intervention roll-out) and at each of time periods 2, 3 and 4, giving a sample size of 480 patients. To account for the potential loss to follow-up of 33%, 720 sets of patient outcomes will be collected.All physiotherapist participants will receive 150 hours of mentored clinical practice as the intervention and usual in-service training as control. 
Consecutive, consenting patients attending treatment by the participating physiotherapists during data collection periods will complete outcome measures at baseline, discharge and 12 months post-baseline. The lead researcher will be blinded to the allocation of the physiotherapist when analysing outcome data; statistical analysis will involve classical linear models incorporating both an intervention effect and a random intercept term to reflect systematic differences among clusters. Trial registration: ISRCTN79599220, assigned 31 July 2012.

  11. Efficient sampling of complex network with modified random walk strategies

    NASA Astrophysics Data System (ADS)

    Xie, Yunya; Chang, Shuhua; Zhang, Zhipeng; Zhang, Mi; Yang, Lei

    2018-02-01

    We present two novel random walk strategies: choosing seed node (CSN) random walk and no-retracing (NR) random walk. Unlike classical random walk sampling, the CSN and NR strategies focus on the influence of the seed node choice and of path overlap, respectively. The three random walk samplings are applied to the Erdös-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS), and weighted USAir networks, and the major properties of the sampled subnets, such as sampling efficiency, degree distributions, average degree and average clustering coefficient, are studied. Similar conclusions are reached with all three random walk strategies. Firstly, networks with small scales and simple structures are conducive to sampling. Secondly, the average degree and the average clustering coefficient of the sampled subnet tend to the corresponding values of the original networks within a limited number of steps. Thirdly, all the degree distributions of the subnets are slightly biased toward the high-degree side. The NR strategy, however, performs better for the average clustering coefficient of the subnet. In the real weighted USAir network, salient features such as the larger clustering coefficient and the fluctuation of the degree distribution are reproduced well by these random walk strategies.
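
    A minimal sketch of two of these walks on an adjacency-list graph. The exact CSN seed-scoring rule is not reproduced here, so only the classical and no-retracing variants are shown; the function names are my own.

```python
import random

def classic_walk(adj, seed, steps):
    # classical random walk: uniform choice among neighbours
    node, visited = seed, {seed}
    for _ in range(steps):
        node = random.choice(adj[node])
        visited.add(node)
    return visited

def no_retrace_walk(adj, seed, steps):
    # NR-style walk: never step straight back along the edge just used
    # (fall back to retracing only at a dead end)
    prev, node, visited = None, seed, {seed}
    for _ in range(steps):
        options = [n for n in adj[node] if n != prev] or adj[node]
        prev, node = node, random.choice(options)
        visited.add(node)
    return visited
```

    On a path graph the difference is stark: the no-retracing walk sweeps forward instead of oscillating, which is why it tends to cover more nodes per step.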

  12. Posttraumatic Stress Disorder Symptom Clusters and the Interpersonal Theory of Suicide in a Large Military Sample.

    PubMed

    Pennings, Stephanie M; Finn, Joseph; Houtsma, Claire; Green, Bradley A; Anestis, Michael D

    2017-10-01

    Prior studies examining posttraumatic stress disorder (PTSD) symptom clusters and the components of the interpersonal theory of suicide (ITS) have yielded mixed results, likely stemming in part from the use of divergent samples and measurement techniques. This study aimed to expand on these findings by utilizing a large military sample, gold standard ITS measures, and multiple PTSD factor structures. Utilizing a sample of 935 military personnel, hierarchical multiple regression analyses were used to test the association between PTSD symptom clusters and the ITS variables. Additionally, we tested for indirect effects of PTSD symptom clusters on suicidal ideation through thwarted belongingness, conditional on levels of perceived burdensomeness. Results indicated that numbing symptoms are positively associated with both perceived burdensomeness and thwarted belongingness and hyperarousal symptoms (dysphoric arousal in the 5-factor model) are positively associated with thwarted belongingness. Results also indicated that hyperarousal symptoms (anxious arousal in the 5-factor model) were positively associated with fearlessness about death. The positive association between PTSD symptom clusters and suicidal ideation was inconsistent and modest, with mixed support for the ITS model. Overall, these results provide further clarity regarding the association between specific PTSD symptom clusters and suicide risk factors. © 2016 The American Association of Suicidology.

  13. Microsatellite variation and genetic structuring in Mugil liza (Teleostei: Mugilidae) populations from Argentina and Brazil

    NASA Astrophysics Data System (ADS)

    Mai, Ana C. G.; Miño, Carolina I.; Marins, Luis F. F.; Monteiro-Neto, Cassiano; Miranda, Laura; Schwingel, Paulo R.; Lemos, Valéria M.; Gonzalez-Castro, Mariano; Castello, Jorge P.; Vieira, João P.

    2014-08-01

    The mullet Mugil liza is distributed along the Atlantic coast of South America, from Argentina to Venezuela, and it is heavily exploited in Brazil. We assessed patterns of distribution of neutral nuclear genetic variation in 250 samples from the Brazilian states of Rio de Janeiro, São Paulo, Santa Catarina and Rio Grande do Sul (latitudinal range of 23-31°S) and from Buenos Aires Province in Argentina (36°S). Nine microsatellite loci revealed 131 total alleles, 3-23 alleles per locus, He: 0.69 and Ho: 0.67. Significant genetic differentiation was observed between Rio de Janeiro samples (23°S) and those from all other locations, as indicated by FST, hierarchical analyses of genetic structure, Bayesian cluster analyses and assignment tests. The presence of two different demographic clusters better explains the allelic diversity observed in mullets from the southernmost portion of the Atlantic coast of Brazil and from Argentina. This may be taken into account when designing fisheries management plans involving Brazilian, Uruguayan and Argentinean M. liza populations.
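
    The per-locus diversity statistics quoted above (He of about 0.69 and Ho of about 0.67 over nine loci) follow directly from allele frequencies. A sketch of the expected-heterozygosity side, assuming simple allele counts per locus (the counts below are illustrative, not the study's data):

```python
def expected_heterozygosity(allele_counts):
    """Nei's expected heterozygosity He = 1 - sum(p_i^2) for one locus,
    from raw allele counts (a hypothetical helper)."""
    n = sum(allele_counts)
    return 1.0 - sum((c / n) ** 2 for c in allele_counts)

def mean_he(loci):
    # multilocus He: average across loci, as reported for the
    # nine microsatellites
    return sum(expected_heterozygosity(counts) for counts in loci) / len(loci)
```
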

  14. Uniform deposition of size-selected clusters using Lissajous scanning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beniya, Atsushi; Watanabe, Yoshihide, E-mail: e0827@mosk.tytlabs.co.jp; Hirata, Hirohito

    2016-05-15

    Size-selected clusters can be deposited on a surface using size-selected cluster ion beams. However, because of the cross-sectional intensity distribution of the ion beam, it is difficult to define the coverage of the deposited clusters. The aggregation probability of the clusters depends on coverage, so the cluster size on the surface varies with position even though size-selected clusters are deposited. It is crucial, therefore, to deposit clusters uniformly on the surface. In this study, size-selected clusters were deposited uniformly on surfaces by scanning the cluster ions in a Lissajous pattern. Two sets of deflector electrodes set in orthogonal directions were placed in front of the sample surface. Triangular waves were applied to the electrodes with an irrational frequency ratio to ensure that the ion trajectory filled the sample surface. The advantages of this method are the simplicity and low cost of the setup compared with the raster scanning method. The authors further investigated CO adsorption on size-selected Pt_n (n = 7, 15, 20) clusters uniformly deposited on the Al2O3/NiAl(110) surface and demonstrated the importance of uniform deposition.
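
    The scanning scheme, two triangular deflection waves with an irrational frequency ratio, is easy to emulate numerically. A sketch (the golden-ratio frequency ratio and unit amplitudes are assumptions, not the paper's actual drive parameters):

```python
import math

def triangle(t, freq):
    # unit-amplitude triangular wave with period 1/freq
    phase = (t * freq) % 1.0
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def lissajous_points(n, fx=1.0, fy=(1 + math.sqrt(5)) / 2, dt=0.01):
    """(x, y) deflections over time: an irrational fx/fy ratio (golden
    ratio here) keeps the trajectory from closing on itself, so it
    gradually fills the square sample area."""
    return [(triangle(i * dt, fx), triangle(i * dt, fy)) for i in range(n)]
```

    A rational frequency ratio would instead trace the same closed figure repeatedly, leaving parts of the surface unexposed.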

  15. The XXL Survey. II. The bright cluster sample: catalogue and luminosity function

    NASA Astrophysics Data System (ADS)

    Pacaud, F.; Clerc, N.; Giles, P. A.; Adami, C.; Sadibekova, T.; Pierre, M.; Maughan, B. J.; Lieu, M.; Le Fèvre, J. P.; Alis, S.; Altieri, B.; Ardila, F.; Baldry, I.; Benoist, C.; Birkinshaw, M.; Chiappetti, L.; Démoclès, J.; Eckert, D.; Evrard, A. E.; Faccioli, L.; Gastaldello, F.; Guennou, L.; Horellou, C.; Iovino, A.; Koulouridis, E.; Le Brun, V.; Lidman, C.; Liske, J.; Maurogordato, S.; Menanteau, F.; Owers, M.; Poggianti, B.; Pomarède, D.; Pompei, E.; Ponman, T. J.; Rapetti, D.; Reiprich, T. H.; Smith, G. P.; Tuffs, R.; Valageas, P.; Valtchanov, I.; Willis, J. P.; Ziparo, F.

    2016-06-01

    Context. The XXL Survey is the largest survey carried out by the XMM-Newton satellite and covers a total area of 50 square degrees distributed over two fields. It primarily aims at investigating the large-scale structures of the Universe using the distribution of galaxy clusters and active galactic nuclei as tracers of the matter distribution. The survey will ultimately uncover several hundreds of galaxy clusters out to a redshift of ~2 at a sensitivity of ~10-14 erg s-1 cm-2 in the [0.5-2] keV band. Aims: This article presents the XXL bright cluster sample, a subsample of 100 galaxy clusters selected from the full XXL catalogue by setting a lower limit of 3 × 10-14 erg s-1 cm-2 on the source flux within a 1' aperture. Methods: The selection function was estimated using a mixture of Monte Carlo simulations and analytical recipes that closely reproduce the source selection process. An extensive spectroscopic follow-up provided redshifts for 97 of the 100 clusters. We derived accurate X-ray parameters for all the sources. Scaling relations were self-consistently derived from the same sample in other publications of the series. On this basis, we study the number density, luminosity function, and spatial distribution of the sample. Results: The bright cluster sample consists of systems with masses between M500 = 7 × 1013 and 3 × 1014 M⊙, mostly located between z = 0.1 and 0.5. The observed sky density of clusters is slightly below the predictions from the WMAP9 model, and significantly below the prediction from the Planck 2015 cosmology. In general, within the current uncertainties of the cluster mass calibration, models with higher values of σ8 and/or ΩM appear more difficult to accommodate. We provide tight constraints on the cluster differential luminosity function and find no hint of evolution out to z ~ 1. We also find strong evidence for the presence of large-scale structures in the XXL bright cluster sample and identify five new superclusters. 
Based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by ESA Member States and NASA. Based on observations made with ESO Telescopes at the La Silla and Paranal Observatories under programme ID 089.A-0666 and LP191.A-0268.The Master Catalogue is available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/592/A2

  16. Weak-Lensing Mass Calibration of the Atacama Cosmology Telescope Equatorial Sunyaev-Zeldovich Cluster Sample with the Canada-France-Hawaii Telescope Stripe 82 Survey

    NASA Technical Reports Server (NTRS)

    Battaglia, N.; Leauthaud, A.; Miyatake, H.; Hasselfield, M.; Gralla, M. B.; Allison, R.; Bond, J. R.; Calabrese, E.; Crichton, D.; Devlin, M. J.; et al.

    2016-01-01

    Mass calibration uncertainty is the largest systematic effect for using clusters of galaxies to constrain cosmological parameters. We present weak lensing mass measurements from the Canada-France-Hawaii Telescope Stripe 82 Survey for galaxy clusters selected through their high signal-to-noise thermal Sunyaev-Zeldovich (tSZ) signal measured with the Atacama Cosmology Telescope (ACT). For a sample of 9 ACT clusters with a tSZ signal-to-noise greater than five, the average weak lensing mass is (4.8 ± 0.8) × 10^14 solar masses, consistent with the tSZ mass estimate of (4.7 ± 1.0) × 10^14 solar masses, which assumes a universal pressure profile for the cluster gas. Our results are consistent with previous weak-lensing measurements of tSZ-detected clusters from the Planck satellite. When comparing our results, we estimate the Eddington bias correction for the sample intersection of Planck and weak-lensing clusters, which was previously excluded.

  17. Orbits of Selected Globular Clusters in the Galactic Bulge

    NASA Astrophysics Data System (ADS)

    Pérez-Villegas, A.; Rossi, L.; Ortolani, S.; Casotto, S.; Barbuy, B.; Bica, E.

    2018-05-01

    We present an orbit analysis for a sample of eight inner bulge globular clusters, together with one reference halo object. We used proper motion values derived from long time base CCD data. Orbits are integrated in both an axisymmetric model and a model including the Galactic bar potential. The inclusion of the bar proved to be essential for the description of the dynamical behaviour of the clusters. We use a Monte Carlo scheme to construct the initial conditions for each cluster, taking into account the uncertainties in the kinematical data and distances. The sample clusters typically show maximum heights above the Galactic plane below 1.5 kpc, and develop rather eccentric orbits. Seven of the bulge sample clusters share the orbital properties of the bar/bulge, having perigalactic and apogalactic distances, and maximum vertical excursions from the Galactic plane, inside the bar region. NGC 6540 instead shows a completely different orbital behaviour, with the dynamical signature of the thick disc. Both prograde and prograde-retrograde orbits with respect to the direction of Galactic rotation were revealed, which might indicate chaotic behaviour.

  18. The dysregulated cluster in personality profiling research: Longitudinal stability and associations with bulimic behaviors and correlates

    PubMed Central

    Slane, Jennifer D.; Klump, Kelly L.; Donnellan, M. Brent; McGue, Matthew; Iacono, William G.

    2013-01-01

    Among cluster analytic studies of the personality profiles associated with bulimia nervosa, a group of individuals characterized by emotional lability and behavioral dysregulation (i.e., a dysregulated cluster) has emerged most consistently. However, previous studies have all been cross-sectional and mostly used clinical samples. This study aimed to replicate associations between the dysregulated personality cluster and bulimic symptoms and related characteristics using a longitudinal, population-based sample. Participants were females assessed at ages 17 and 25 from the Minnesota Twin Family Study, clustered based on their personality traits. The Dysregulated cluster was successfully identified at both time points and was more stable across time than either the Resilient or Sensation Seeking clusters. Rates of bulimic symptoms and related behaviors (e.g., alcohol use problems) were also highest in the dysregulated group. Findings suggest that the dysregulated cluster is a relatively stable and robust profile that is associated with bulimic symptoms. PMID:23398096

  19. Seven common mistakes in population genetics and how to avoid them.

    PubMed

    Meirmans, Patrick G

    2015-07-01

    As the data resulting from modern genotyping tools are astoundingly complex, genotyping studies require great care in the sampling design, genotyping, data analysis and interpretation. Such care is necessary because, with data sets containing thousands of loci, small biases can easily become strongly significant patterns. Such biases may already be present in routine tasks that are part of almost every genotyping study. Here, I discuss seven common mistakes that are frequently encountered in the genotyping literature: (i) giving more attention to genotyping than to sampling, (ii) failing to perform or report experimental randomization in the laboratory, (iii) equating geopolitical borders with biological borders, (iv) testing the significance of clustering output, (v) misinterpreting Mantel's r statistic, (vi) only interpreting a single value of k and (vii) forgetting that only a small portion of the genome will be associated with climate. For each of these issues, I give suggestions on how to avoid the mistake. Overall, I argue that genotyping studies would benefit from a more rigorous experimental design, involving proper sampling design, randomization and a better distinction between a priori hypotheses and exploratory analyses. © 2015 John Wiley & Sons Ltd.

  20. Application of self-organizing feature maps to analyze the relationships between ignitable liquids and selected mass spectral ions.

    PubMed

    Frisch-Daiello, Jessica L; Williams, Mary R; Waddell, Erin E; Sigman, Michael E

    2014-03-01

    The unsupervised artificial neural networks method of self-organizing feature maps (SOFMs) is applied to spectral data of ignitable liquids to visualize the grouping of similar ignitable liquids with respect to their American Society for Testing and Materials (ASTM) class designations and to determine the ions associated with each group. The spectral data consists of extracted ion spectra (EIS), defined as the time-averaged mass spectrum across the chromatographic profile for select ions, where the selected ions are a subset of ions from Table 2 of the ASTM standard E1618-11. Utilization of the EIS allows for inter-laboratory comparisons without the concern of retention time shifts. The trained SOFM demonstrates clustering of the ignitable liquid samples according to designated ASTM classes. The EIS of select samples designated as miscellaneous or oxygenated as well as ignitable liquid residues from fire debris samples are projected onto the SOFM. The results indicate the similarities and differences between the variables of the newly projected data compared to those of the data used to train the SOFM. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
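
    The SOFM idea, competitive learning in which the winning unit and its map neighbours move toward each input, can be sketched with a tiny 1-D map. This is a generic SOM, not the authors' trained network; the map size, learning rate and neighbourhood schedule are all illustrative.

```python
import numpy as np

def train_sofm(data, n_units=8, epochs=200, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal 1-D self-organizing feature map: for each input, find
    the best-matching unit (BMU) and pull it and its map neighbours
    toward the input, with shrinking rate and neighbourhood."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n_units, data.shape[1]))
    idx = np.arange(n_units)
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)
        sigma = sigma0 * (1 - e / epochs) + 0.5
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(np.linalg.norm(w - x, axis=1))
            h = np.exp(-((idx - bmu) ** 2) / (2 * sigma ** 2))
            w += lr * h[:, None] * (x - w)
    return w

def project(w, x):
    # best-matching unit for a new sample, analogous to projecting a
    # fire-debris extract onto the trained map
    return int(np.argmin(np.linalg.norm(w - x, axis=1)))
```

    Projecting a new spectrum via project() mirrors how the paper maps fire-debris residues onto the map trained on reference ignitable liquids.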

  1. Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska

    NASA Technical Reports Server (NTRS)

    Lent, P. C. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.

  2. SU-G-TeP3-14: Three-Dimensional Cluster Model in Inhomogeneous Dose Distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, J; Penagaricano, J; Narayanasamy, G

    2016-06-15

    Purpose: We aim to investigate 3D cluster formation in inhomogeneous dose distributions in order to search for new models predicting radiation tissue damage, potentially leading to a new optimization paradigm for radiotherapy planning. Methods: Voxels in the organ at risk (OAR) receiving doses higher than a preset threshold were aggregated into clusters, whose connectivity dictates the cluster structure. Upon selection of the dose threshold, the fractional density, defined as the fraction of voxels in the organ eligible to be part of the cluster, was determined from the dose volume histogram (DVH). A Monte Carlo method was implemented to establish a case pertinent to the corresponding DVH. Ones and zeros were randomly assigned to each OAR voxel with the sampling probability equal to the fractional density. Ten thousand samples were randomly generated to ensure a sufficient number of cluster sets. A recursive cluster-searching algorithm was developed to analyze the clusters with various connectivity choices, namely 1-, 2-, and 3-connectivity. The mean size of the largest cluster (MSLC) from the Monte Carlo samples was taken to be a function of the fractional density. Various OARs from clinical plans were included in the study. Results: The Monte Carlo study demonstrates the anticipated inverse relationship between the MSLC and the cluster connectivity, and the cluster size does not change linearly with fractional density regardless of the connectivity type. A transition of the MSLC from an initially slow increase to exponential growth was observed from low to high density. The cluster sizes were found to vary within a large range and to be relatively independent of the OARs. Conclusion: The Monte Carlo study revealed that the cluster size could serve as a suitable index of tissue damage (percolation cluster) and that the clinical outcomes of plans with the same DVH might be potentially different.
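
    The cluster construct can be emulated with a quick Monte Carlo: fill a voxel grid at a given fractional density and measure the largest connected aggregate. This is a sketch of the percolation-style analysis only; the grid size, seeding and fixed 6-connectivity are illustrative choices, not the abstract's exact algorithm.

```python
import random
from collections import deque

def largest_cluster(shape, density, seed=0):
    """Size of the largest 6-connected aggregate of 'hot' voxels in a
    random 3-D occupancy grid sampled at the given fractional density."""
    rng = random.Random(seed)
    nx, ny, nz = shape
    hot = {(x, y, z) for x in range(nx) for y in range(ny)
           for z in range(nz) if rng.random() < density}
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    best, seen = 0, set()
    for start in hot:
        if start in seen:
            continue
        size, queue = 0, deque([start])
        seen.add(start)
        while queue:                      # breadth-first flood fill
            v = queue.popleft()
            size += 1
            for dx, dy, dz in steps:
                nbr = (v[0] + dx, v[1] + dy, v[2] + dz)
                if nbr in hot and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best
```

    Averaging this over many random seeds at each density reproduces the kind of MSLC-versus-density curve the abstract describes.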

  3. VizieR Online Data Catalog: NORAS II. I. First results (Bohringer+, 2017)

    NASA Astrophysics Data System (ADS)

    Bohringer, H.; Chon, G.; Retzlaff, J.; Trumper, J.; Meisenheimer, K.; Schartel, N.

    2017-08-01

    The NOrthern ROSAT All-Sky (NORAS) galaxy cluster survey project is based on the ROSAT All-Sky Survey (RASS; Trumper 1993Sci...260.1769T), which is the only full-sky survey conducted with an imaging X-ray telescope. We have already used RASS for the construction of the cluster catalogs of the NORAS I project. While NORAS I was as a first step focused on the identification of galaxy clusters among the RASS X-ray sources showing a significant extent, the complementary REFLEX I sample in the southern sky was strictly constructed as a flux-limited cluster sample. A major extension of the REFLEX I sample, which roughly doubles the number of clusters, REFLEX II (Bohringer et al. 2013, Cat. J/A+A/555/A30), was recently completed. It is by far the largest high-quality sample of X-ray-selected galaxy clusters. The NORAS II survey now reaches a flux limit of 1.8*10-12erg/s/cm2 in the 0.1-2.4keV band. Redshifts have been obtained for all of the 860 clusters in the NORAS II catalog, except for 25 clusters for which observing campaigns are scheduled. Thus with 3% missing redshifts we can already obtain a very good view of the properties of the NORAS II cluster sample and obtain some first results. The NORAS II survey covers the sky region north of the equator outside the band of the Milky Way (|bII|>=20°). We also excise a region around the nearby Virgo cluster of galaxies that extends over several degrees on the sky, where the detection of background clusters is hampered by bright X-ray emission. This region is bounded in right ascension by R.A.=185°-191.25° and in declination by decl.=6°-15° (an area of ~53deg2). With this excision, the survey area covers 4.18 steradian (13519deg2, a fraction of 32.7% of the sky). NORAS II is based on the RASS product RASS III (Voges et al. 1999, Cat. IX/10), which was also used for REFLEX II. The NORAS II survey was constructed in a way identical to REFLEX II with a nominal flux limit of 1.8*10-12erg/s/cm2. (3 data files).

  4. ATCA observations of the MACS-Planck Radio Halo Cluster Project. II. Radio observations of an intermediate redshift cluster sample

    NASA Astrophysics Data System (ADS)

    Martinez Aviles, G.; Johnston-Hollitt, M.; Ferrari, C.; Venturi, T.; Democles, J.; Dallacasa, D.; Cassano, R.; Brunetti, G.; Giacintucci, S.; Pratt, G. W.; Arnaud, M.; Aghanim, N.; Brown, S.; Douspis, M.; Hurier, J.; Intema, H. T.; Langer, M.; Macario, G.; Pointecouteau, E.

    2018-04-01

    Aims: A fraction of galaxy clusters host diffuse radio sources whose origins are investigated through multi-wavelength studies of cluster samples. We investigate the presence of diffuse radio emission in a sample of seven galaxy clusters in the largely unexplored intermediate redshift range (0.3 < z < 0.44). Methods: In search of diffuse emission, deep radio imaging of the clusters is presented from wide band (1.1-3.1 GHz), full resolution (~5 arcsec) observations with the Australia Telescope Compact Array (ATCA). The visibilities were also imaged at lower resolution after point source modelling and subtraction, and after a taper was applied to achieve better sensitivity to low surface brightness diffuse radio emission. In cases of non-detection of diffuse sources, we set upper limits for the radio power of injected diffuse radio sources in the field of our observations. Furthermore, we discuss the dynamical state of the observed clusters based on an X-ray morphological analysis with XMM-Newton. Results: We detect a giant radio halo in PSZ2 G284.97-23.69 (z = 0.39) and a possible diffuse source in the nearly relaxed cluster PSZ2 G262.73-40.92 (z = 0.421). Our sample contains three highly disturbed massive clusters without clear traces of diffuse emission at the observed frequencies. We were able to inject modelled radio haloes with low values of total flux density to set upper detection limits; however, with our high-frequency observations we cannot exclude the presence of radio haloes in these systems, because of the sensitivity of our observations in combination with the high z of the observed clusters. The reduced images are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A94

  5. A Framework for Designing Cluster Randomized Trials with Binary Outcomes

    ERIC Educational Resources Information Center

    Spybrook, Jessaca; Martinez, Andres

    2011-01-01

    The purpose of this paper is to provide a framework for approaching a power analysis for a CRT (cluster randomized trial) with a binary outcome. The authors suggest a framework in the context of a simple CRT and then extend it to a blocked design, or a multi-site cluster randomized trial (MSCRT). The framework is based on proportions, an…
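
    A proportions-based power calculation for a simple two-arm CRT typically combines the standard two-proportion sample size with a design effect of 1 + (m − 1) × ICC. The sketch below uses that textbook framework; it is not the authors' own formula, and the parameter values are illustrative.

```python
import math
from statistics import NormalDist

def clusters_needed(p1, p2, m, icc, alpha=0.05, power=0.8):
    """Clusters per arm for a two-arm CRT with a binary outcome:
    two-proportion normal approximation, inflated by the design
    effect 1 + (m - 1) * icc for clusters of size m (generic sketch)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    pbar = (p1 + p2) / 2
    # subjects per arm under simple random sampling
    n_srs = (z_a + z_b) ** 2 * 2 * pbar * (1 - pbar) / (p1 - p2) ** 2
    deff = 1 + (m - 1) * icc            # variance inflation from clustering
    return math.ceil(n_srs * deff / m)  # inflate, then convert to clusters
```

    Even a modest ICC roughly doubles the required number of clusters here, which is why ignoring clustering in the design phase is so costly.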

  6. Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering.

    PubMed

    Deveci, Mehmet; Küçüktunç, Onur; Eren, Kemal; Bozdağ, Doruk; Kaya, Kamer; Çatalyürek, Ümit V

    2016-01-01

    Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.

  7. Classification of different types of beer according to their colour characteristics

    NASA Astrophysics Data System (ADS)

    Nikolova, Kr T.; Gabrova, R.; Boyadzhiev, D.; Pisanova, E. S.; Ruseva, J.; Yanakiev, D.

    2017-01-01

    Twenty-two samples of different beers were investigated in two colour systems, XYZ and CIELab, and characterised according to their colour parameters. The goals of the current study were to conduct correlation and discriminant analysis and to find the inner relations between the studied indices. K-means cluster analysis was used to compare and group the tested types of beer based on their similarity; this method requires the number of clusters to be determined in advance, and the variant with K = 4 was adopted. The first cluster unified all bright beers, the second contained samples with fruits, the third contained samples with addition of lemon, and the fourth unified the samples of dark beers. Discriminant analysis can assist in establishing the type of a beer: the proposed model correctly describes the types of beer on the Bulgarian market and can be used for classifying beers that were not used in building the model. A digital image was obtained for one sample chosen from each cluster; it confirms the colour parameters in the XYZ and CIELab colour systems. These results can be used to develop a rapid colour-based assessment of beer.
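
    Grouping samples by colour coordinates with K fixed in advance looks like this in outline (a plain Lloyd's-algorithm sketch on made-up 2-D points; the study's actual colour data and software are not reproduced):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. `points` is a list of equal-length
    feature tuples (e.g. XYZ or CIELab colour coordinates); the number
    of clusters k must be chosen in advance, as the abstract notes."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)            # initial centers from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                       # assign each point to nearest center
            i = min(range(k), key=lambda c: sum((a - b) ** 2
                    for a, b in zip(p, centers[c])))
            groups[i].append(p)
        # recompute each center as the mean of its group (keep old if empty)
        new = [tuple(sum(d) / len(g) for d in zip(*g)) if g else centers[i]
               for i, g in enumerate(groups)]
        if new == centers:                     # converged
            break
        centers = new
    return centers, groups

# two well-separated "colour" blobs recover cleanly with k = 2
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, groups = kmeans(pts, 2)
```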

  8. Impact of Different Visual Field Testing Paradigms on Sample Size Requirements for Glaucoma Clinical Trials.

    PubMed

    Wu, Zhichao; Medeiros, Felipe A

    2018-03-20

    Visual field testing is an important endpoint in glaucoma clinical trials, and the testing paradigm used can have a significant impact on the sample size requirements. To investigate this, the present study included 353 eyes of 247 glaucoma patients seen over a 3-year period to extract real-world visual field rates of change and variability estimates, which were used in computer simulations to derive sample size requirements. The clinical trial scenario assumed that a new treatment was added to one of two groups that were both under routine clinical care, with various treatment effects examined. Three different visual field testing paradigms were evaluated: a) evenly spaced testing, b) United Kingdom Glaucoma Treatment Study (UKGTS) follow-up scheme, which adds clustered tests at the beginning and end of follow-up in addition to evenly spaced testing, and c) clustered testing paradigm, with clusters of tests at the beginning and end of the trial period and two intermediary visits. The sample size requirements were reduced by 17-19% and 39-40% using the UKGTS and clustered testing paradigms, respectively, when compared to the evenly spaced approach. These findings highlight how the clustered testing paradigm can substantially reduce sample size requirements and improve the feasibility of future glaucoma clinical trials.
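
    The leverage gained by clustering tests at the two ends of follow-up comes from simple OLS algebra: for one eye tested at times t with i.i.d. noise, var(slope) = σ²/Σ(t − t̄)². A toy comparison of the same number of tests under two schedules (a deliberately simplified sketch that ignores intermediary visits and the real perimetric variability structure):

```python
def slope_variance(times, sigma2=1.0):
    """Variance of the OLS rate-of-change estimate for one eye tested
    at the given times (years), assuming i.i.d. measurement noise with
    variance sigma2: var(slope) = sigma2 / sum((t - tbar)^2)."""
    tbar = sum(times) / len(times)
    return sigma2 / sum((t - tbar) ** 2 for t in times)

even      = [2 * i / 9 for i in range(10)]   # 10 tests evenly spaced over 2 years
clustered = [0.0] * 5 + [2.0] * 5            # 5 tests at each end

ratio = slope_variance(clustered) / slope_variance(even)  # < 1
```

    Here the end-clustered schedule estimates each eye's slope with roughly 2.5× lower variance, which is the mechanism behind the reduced sample sizes the study reports.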

  9. The Detection and Statistics of Giant Arcs behind CLASH Clusters

    NASA Astrophysics Data System (ADS)

    Xu, Bingxiao; Postman, Marc; Meneghetti, Massimo; Seitz, Stella; Zitrin, Adi; Merten, Julian; Maoz, Dani; Frye, Brenda; Umetsu, Keiichi; Zheng, Wei; Bradley, Larry; Vega, Jesus; Koekemoer, Anton

    2016-02-01

    We developed an algorithm to find and characterize gravitationally lensed galaxies (arcs) to perform a comparison of the observed and simulated arc abundance. Observations are from the Cluster Lensing And Supernova survey with Hubble (CLASH). Simulated CLASH images are created using the MOKA package and also clusters selected from the high-resolution, hydrodynamical simulations, MUSIC, over the same mass and redshift range as the CLASH sample. The algorithm's arc elongation accuracy, completeness, and false positive rate are determined and used to compute an estimate of the true arc abundance. We derive a lensing efficiency of 4 ± 1 arcs (with length ≥6″ and length-to-width ratio ≥7) per cluster for the X-ray-selected CLASH sample, 4 ± 1 arcs per cluster for the MOKA-simulated sample, and 3 ± 1 arcs per cluster for the MUSIC-simulated sample. The observed and simulated arc statistics are in full agreement. We measure the photometric redshifts of all detected arcs and find a median redshift zs = 1.9 with 33% of the detected arcs having zs > 3. We find that the arc abundance does not depend strongly on the source redshift distribution but is sensitive to the mass distribution of the dark matter halos (e.g., the c-M relation). Our results show that consistency between the observed and simulated distributions of lensed arc sizes and axial ratios can be achieved by using cluster-lensing simulations that are carefully matched to the selection criteria used in the observations.

  10. Comparing the chlorine disinfection of detached biofilm clusters with those of sessile biofilms and planktonic cells in single- and dual-species cultures.

    PubMed

    Behnke, Sabrina; Parker, Albert E; Woodall, Dawn; Camper, Anne K

    2011-10-01

    Although the detachment of cells from biofilms is of fundamental importance to the dissemination of organisms in both public health and clinical settings, the disinfection efficacies of commonly used biocides on detached biofilm particles have not been investigated. Therefore, the question arises whether cells in detached aggregates can be killed with disinfectant concentrations sufficient to inactivate planktonic cells. Burkholderia cepacia and Pseudomonas aeruginosa were grown in standardized laboratory reactors as single species and in coculture. Cluster size distributions in chemostats and biofilm reactor effluent were measured. Chlorine susceptibility was assessed for planktonic cultures, attached biofilm, and particles and cells detached from the biofilm. Disinfection tolerance generally increased with a higher percentage of larger cell clusters in the chemostat and detached biofilm. Samples with a lower percentage of large clusters were more easily disinfected. Thus, disinfection tolerance depended on the cluster size distribution rather than sample type for chemostat and detached biofilm. Intact biofilms were more tolerant to chlorine independent of species. Homogenization of samples led to significantly increased susceptibility in all biofilm samples as well as detached clusters for single-species B. cepacia, B. cepacia in coculture, and P. aeruginosa in coculture. The disinfection efficacy was also dependent on species composition; coculture was advantageous to the survival of both species when grown as a biofilm or as clusters detached from biofilm but, surprisingly, resulted in a lower disinfection tolerance when they were grown as a mixed planktonic culture.

  11. Rhenium Complexes and Clusters Supported on γ-Al2O3: Effects of Rhenium Oxidation State and Rhenium Cluster Size on Catalytic Activity for n-butane Hydrogenolysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lobo Lapidus, R.; Gates, B

    2009-01-01

    Supported metals prepared from H₃Re₃(CO)₁₂ on γ-Al₂O₃ were treated under conditions that led to various rhenium structures on the support and were tested as catalysts for n-butane conversion in the presence of H₂ in a flow reactor at 533 K and 1 atm. After use, two samples were characterized by X-ray absorption edge positions of approximately 5.6 eV (relative to rhenium metal), indicating that the rhenium was cationic and essentially in the same average oxidation state in each. But the Re-Re coordination numbers found by extended X-ray absorption fine structure spectroscopy (2.2 and 5.1) show that the clusters in the two samples were significantly different in average nuclearity despite their indistinguishable rhenium oxidation states. Spectra of a third sample after catalysis indicate approximately Re₃ clusters, on average, and an edge position of 4.5 eV. Thus, two samples contained clusters approximated as Re₃ (on the basis of the Re-Re coordination number), on average, with different average rhenium oxidation states. The data allow resolution of the effects of rhenium oxidation state and cluster size, both of which affect the catalytic activity; larger clusters and a greater degree of reduction lead to increased activity.

  12. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    PubMed

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  13. The Study on Mental Health at Work: Design and sampling.

    PubMed

    Rose, Uwe; Schiel, Stefan; Schröder, Helmut; Kleudgen, Martin; Tophoven, Silke; Rauch, Angela; Freude, Gabriele; Müller, Grit

    2017-08-01

    The Study on Mental Health at Work (S-MGA) generates the first nationwide representative survey enabling the exploration of the relationship between working conditions, mental health and functioning. This paper describes the study design, sampling procedures and data collection, and presents a summary of the sample characteristics. S-MGA is a representative study of German employees aged 31-60 years subject to social security contributions. The sample was drawn from the employment register based on a two-stage cluster sampling procedure. Firstly, 206 municipalities were randomly selected from a pool of 12,227 municipalities in Germany. Secondly, 13,590 addresses were drawn from the selected municipalities for the purpose of conducting 4500 face-to-face interviews. The questionnaire covers psychosocial working and employment conditions, measures of mental health, work ability and functioning. Data from personal interviews were combined with employment histories from register data. Descriptive statistics of socio-demographic characteristics and logistic regression analyses were used for comparing the population, the gross sample and the respondents. In total, 4511 face-to-face interviews were conducted. A test for sampling bias revealed that individuals in older cohorts participated more often, while individuals with an unknown educational level, residing in major cities or with a non-German ethnic background were slightly underrepresented. There is no indication of major deviations in characteristics between the basic population and the sample of respondents. Hence, S-MGA provides representative data for research on work and health, designed as a cohort study with plans to rerun the survey 5 years after the first assessment.
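
    The two-stage draw can be sketched in a few lines (illustrative only: the frame, counts and equal-probability selection at both stages are assumptions; the real S-MGA draw from the employment register may have used size-weighted selection):

```python
import random

def two_stage_sample(frame, n_clusters, n_per_cluster, seed=0):
    """Two-stage cluster sample: first draw n_clusters primary sampling
    units (municipalities), then n_per_cluster addresses within each.
    `frame` maps municipality -> list of addresses."""
    rng = random.Random(seed)
    psus = rng.sample(sorted(frame), n_clusters)             # stage 1
    return {m: rng.sample(frame[m], n_per_cluster)           # stage 2
            for m in psus}

# hypothetical frame: 20 municipalities with 50 addresses each
frame = {f"m{i}": [f"m{i}-a{j}" for j in range(50)] for i in range(20)}
sample = two_stage_sample(frame, n_clusters=5, n_per_cluster=10)
```

    Clustering the draw this way keeps interviewer travel feasible at the cost of a design effect, which the analysis must account for.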

  14. MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering

    PubMed Central

    Kim, Eun-Youn; Kim, Seon-Young; Ashlock, Daniel; Nam, Dougu

    2009-01-01

    Background Uncovering subtypes of disease from microarray samples has important clinical implications such as survival time and sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of the high dimensional microarray clusters, which limits their performance. Results We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs by varying the number of clusters and identifies clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we devised a new entropy plot to control the separation of singletons or small clusters. MULTI-K, unlike the simple k-means or other widely used methods, was able to capture clusters with complex and high-dimensional structures accurately. MULTI-K outperformed other methods including a recently developed ensemble clustering algorithm in tests with five simulated and eight real gene-expression data sets. Conclusion The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors. PMID:19698124
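
    The co-membership idea at the heart of such ensemble clustering is easy to state: over many runs, count how often each pair of samples lands in the same cluster (a generic sketch, not the MULTI-K implementation; the label lists below are made up):

```python
def comembership(labelings):
    """Fraction of clustering runs in which each pair of samples falls
    in the same cluster. `labelings` is a list of label lists, one per
    run (e.g. k-means repeated with varying k, as in MULTI-K)."""
    n = len(labelings[0])
    runs = len(labelings)
    return [[sum(lab[i] == lab[j] for lab in labelings) / runs
             for j in range(n)] for i in range(n)]

# three runs over four samples; samples 0 and 1 co-cluster every time
runs = [[0, 0, 1, 1], [0, 0, 2, 1], [1, 1, 0, 0]]
M = comembership(runs)
```

    Pairs with consistently high co-membership are then merged into final clusters, which is how the ensemble becomes robust to any single run's mistakes.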

  15. Distinct Gene Expression Patterns between Nasal Mucosal Cells and Blood Collected from Allergic Rhinitis Sufferers.

    PubMed

    Watts, Annabelle M; West, Nicholas P; Cripps, Allan W; Smith, Pete K; Cox, Amanda J

    2018-06-19

    Investigations of gene expression in allergic rhinitis (AR) typically rely on invasive nasal biopsies (site of inflammation) or blood samples (systemic immunity) to obtain sufficient genetic material for analysis. New methodologies to circumvent the need for invasive sample collection offer promise to further the understanding of local immune mechanisms relevant in AR. A within-subject design was employed to compare immune gene expression profiles obtained from nasal washing/brushing and whole blood samples collected during peak pollen season. Twelve adults (age: 46.3 ± 12.3 years) with more than a 2-year history of AR and a confirmed grass pollen allergy participated in the study. Gene expression analysis was performed using a panel of 760 immune genes with the NanoString nCounter platform on nasal lavage/brushing cell lysates and compared to RNA extracted from blood. A total of 355 genes were significantly differentially expressed between sample types (9.87 to -9.71 log2 fold change). The top 3 genes significantly upregulated in nasal lysate samples were Mucin 1 (MUC1), Tight Junction Protein 1 (TJP1), and Lipocalin-2 (LCN2). The top 3 genes significantly upregulated in blood samples were cluster of differentiation 3e (CD3E), FYN Proto-Oncogene Src Family Tyrosine Kinase (FYN) and cluster of differentiation 3d (CD3D). Overall, the blood and nasal lavage samples showed vastly distinct gene expression profiles and functional gene pathways which reflect their anatomical and functional origins. Evaluating immune gene expression of the nasal mucosa in addition to blood samples may be beneficial in understanding AR pathophysiology and response to allergen challenge. © 2018 S. Karger AG, Basel.

  16. Whole-genome analysis of mycobacteria from birds at the San Diego Zoo.

    PubMed

    Pfeiffer, Wayne; Braun, Josephine; Burchell, Jennifer; Witte, Carmel L; Rideout, Bruce A

    2017-01-01

    Mycobacteria isolated from more than 100 birds diagnosed with avian mycobacteriosis at the San Diego Zoo and its Safari Park were cultured postmortem and had their whole genomes sequenced. Computational workflows were developed and applied to identify the mycobacterial species in each DNA sample, to find single-nucleotide polymorphisms (SNPs) between samples of the same species, to further differentiate SNPs between as many as three different genotypes within a single sample, and to identify which samples are closely clustered genomically. Nine species of mycobacteria were found in 123 samples from 105 birds. The most common species were Mycobacterium avium and Mycobacterium genavense, which were in 49 and 48 birds, respectively. Most birds contained only a single mycobacterial species, but two birds contained a mixture of two species. The M. avium samples represent diverse strains of M. avium avium and M. avium hominissuis, with many pairs of samples differing by hundreds or thousands of SNPs across their common genome. By contrast, the M. genavense samples are much closer genomically; samples from 46 of 48 birds differ from each other by less than 110 SNPs. Some birds contained two, three, or even four genotypes of the same bacterial species. Such infections were found in 4 of 49 birds (8%) with M. avium and in 11 of 48 birds (23%) with M. genavense. Most were mixed infections, in which the bird was infected by multiple mycobacterial strains, but three infections with two genotypes differing by ≤ 10 SNPs were likely the result of within-host evolution. The samples from 31 birds with M. avium can be grouped into nine clusters within which any sample is ≤ 12 SNPs from at least one other sample in the cluster. Similarly, the samples from 40 birds with M. genavense can be grouped into ten such clusters. Information about these genomic clusters is being used in an ongoing, companion study of mycobacterial transmission to help inform management of bird collections.
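
    The clustering rule described (every sample within ≤ 12 SNPs of at least one other in its group) is single-linkage grouping at a fixed cutoff, which a union-find structure implements directly (a sketch with made-up sample names and distances, not the study's pipeline):

```python
def snp_clusters(samples, dist, threshold=12):
    """Group samples so that two samples share a cluster iff they are
    connected by a chain of pairwise distances <= threshold.
    `dist` maps frozenset({a, b}) -> SNP distance."""
    parent = {s: s for s in samples}

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for pair, d in dist.items():
        if d <= threshold:            # merge the two samples' groups
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=lambda g: sorted(g))

# "a" and "c" are 30 SNPs apart but chained through "b", so they cluster
dist = {frozenset({"a", "b"}): 5, frozenset({"b", "c"}): 10,
        frozenset({"a", "c"}): 30, frozenset({"c", "d"}): 200}
clusters = snp_clusters(["a", "b", "c", "d"], dist)
```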

  17. Whole-genome analysis of mycobacteria from birds at the San Diego Zoo

    PubMed Central

    Pfeiffer, Wayne; Braun, Josephine; Burchell, Jennifer; Witte, Carmel L.; Rideout, Bruce A.

    2017-01-01

    Methods Mycobacteria isolated from more than 100 birds diagnosed with avian mycobacteriosis at the San Diego Zoo and its Safari Park were cultured postmortem and had their whole genomes sequenced. Computational workflows were developed and applied to identify the mycobacterial species in each DNA sample, to find single-nucleotide polymorphisms (SNPs) between samples of the same species, to further differentiate SNPs between as many as three different genotypes within a single sample, and to identify which samples are closely clustered genomically. Results Nine species of mycobacteria were found in 123 samples from 105 birds. The most common species were Mycobacterium avium and Mycobacterium genavense, which were in 49 and 48 birds, respectively. Most birds contained only a single mycobacterial species, but two birds contained a mixture of two species. The M. avium samples represent diverse strains of M. avium avium and M. avium hominissuis, with many pairs of samples differing by hundreds or thousands of SNPs across their common genome. By contrast, the M. genavense samples are much closer genomically; samples from 46 of 48 birds differ from each other by less than 110 SNPs. Some birds contained two, three, or even four genotypes of the same bacterial species. Such infections were found in 4 of 49 birds (8%) with M. avium and in 11 of 48 birds (23%) with M. genavense. Most were mixed infections, in which the bird was infected by multiple mycobacterial strains, but three infections with two genotypes differing by ≤ 10 SNPs were likely the result of within-host evolution. The samples from 31 birds with M. avium can be grouped into nine clusters within which any sample is ≤ 12 SNPs from at least one other sample in the cluster. Similarly, the samples from 40 birds with M. genavense can be grouped into ten such clusters. 
Information about these genomic clusters is being used in an ongoing, companion study of mycobacterial transmission to help inform management of bird collections. PMID:28267758

  18. Study protocol for a cluster randomized trial of the Community of Voices choir intervention to promote the health and well-being of diverse older adults.

    PubMed

    Johnson, Julene K; Nápoles, Anna M; Stewart, Anita L; Max, Wendy B; Santoyo-Olsson, Jasmine; Freyre, Rachel; Allison, Theresa A; Gregorich, Steven E

    2015-10-13

    Older adults are the fastest growing segment of the United States population. There is an immediate need to identify novel, cost-effective community-based approaches that promote health and well-being for older adults, particularly those from diverse racial/ethnic and socioeconomic backgrounds. Because choral singing is multi-modal (requires cognitive, physical, and psychosocial engagement), it has the potential to improve health outcomes across several dimensions to help older adults remain active and independent. The purpose of this study is to examine the effect of a community choir program (Community of Voices) on health and well-being and to examine its costs and cost-effectiveness in a large sample of diverse, community-dwelling older adults. In this cluster randomized controlled trial, diverse adults age 60 and older were enrolled at Administration on Aging-supported senior centers and completed baseline assessments. The senior centers were randomly assigned to either start the choir immediately (intervention group) or wait 6 months to start (control). Community of Voices is a culturally tailored choir program delivered at the senior centers by professional music conductors that reflects three components of engagement (cognitive, physical, and psychosocial). We describe the nature of the study including the cluster randomized trial study design, sampling frame, sample size calculation, methods of recruitment and assessment, and primary and secondary outcomes. The study involves conducting a randomized trial of an intervention as delivered in "real-world" settings. The choir program was designed using a novel translational approach that integrated evidence-based research on the benefits of singing for older adults, community best practices related to community choirs for older adults, and the perspective of the participating communities. 
The practicality and relatively low cost of the choir intervention means it can be incorporated into a variety of community settings and adapted to diverse cultures and languages. If successful, this program will be a practical and acceptable community-based approach for promoting health and well-being of older adults. ClinicalTrials.gov NCT01869179 registered 9 January 2013.

  19. Population impact of a high cardiovascular risk management program delivered by village doctors in rural China: design and rationale of a large, cluster-randomized controlled trial.

    PubMed

    Yan, Lijing L; Fang, Weigang; Delong, Elizabeth; Neal, Bruce; Peterson, Eric D; Huang, Yining; Sun, Ningling; Yao, Chen; Li, Xian; MacMahon, Stephen; Wu, Yangfeng

    2014-04-11

    The high-risk strategy has been proven effective in preventing cardiovascular disease; however, the population benefits from these interventions remain unknown. This study aims to assess, at the population level, the effects of an evidence-based high cardiovascular risk management program delivered by village doctors in rural China. The study will employ a cluster-randomized controlled trial in which a total of 120 villages in five northern provinces of China will be assigned to either intervention (60 villages) or control (60 villages). Village doctors in intervention villages will be trained to implement a simple evidence-based management program designed to identify, treat and follow-up as many as possible individuals at high-risk of cardiovascular disease in the village. The intervention will also include performance feedback as well as a performance-based incentive payment scheme and will last for 2 years. We will draw two independent random samples, one before and one after the intervention, of 20 men aged ≥50 years and 20 women aged ≥60 years from each village (9,600 participants in total across the two samples) to measure the study outcomes at the population level. The primary outcome will be the pre-post difference in mean systolic blood pressure, analyzed with a generalized estimating equations extension of linear regression model to account for cluster effect. Secondary outcomes will include monthly clinic visits, provision of lifestyle advice, use of antihypertensive medications and use of aspirin. Process and economic evaluations will also be conducted. This trial will be the first implementation trial in the world to evaluate the population impact of the high-risk strategy in prevention and control of cardiovascular disease. The results are expected to provide important information (effectiveness, cost-effectiveness, feasibility and acceptability) to guide policy making for rural China as well as other resource-limited countries. 
The trial is registered at ClinicalTrials.gov (NCT01259700). Date of initial registration is December 13, 2010.

  20. CCD photometry of NGC 6101 - Another globular cluster with blue straggler stars

    NASA Technical Reports Server (NTRS)

    Sarajedini, Ata; Da Costa, G. S.

    1991-01-01

    Results are presented on CCD photometric observations of a large sample of stars in the southern globular cluster NGC 6101, and the procedures used to derive the color-magnitude (C-M) diagram of the cluster are described. No indication was found of any difference in age, at the less than 2 Gyr level, between NGC 6101 and other clusters of similar abundance, such as M92. The C-M diagram revealed a significant blue straggler population. It was found that, in NGC 6101, these stars are more centrally concentrated than the cluster subgiants of similar magnitude, indicating that the blue stragglers have larger masses. Results on the magnitude and luminosity function of the sample are consistent with the binary mass transfer or merger hypotheses for the origin of blue straggler stars.
